
hatiyildiz 9d95043ccc docs(pass-12): role-in-Catalyst banners on 11 AI/ML Application Blueprints	2026-04-27 21:47:45 +02:00

All AI/ML component READMEs got banners pointing at PLATFORM-TECH-STACK §4.6 (AI/ML) or §4.7 (AI safety + observability), and noting composition under bp-cortex (composite AI Hub Blueprint):

- knative: serverless for KServe-managed inference.
- kserve: K8s-native model serving for vLLM, BGE, custom.
- vllm: default LLM inference runtime.
- milvus: vector database for RAG retrieval.
- neo4j: knowledge-graph-augmented retrieval alongside Milvus.
- librechat: default chat surface, fronts LLM Gateway via Guardrails.
- bge: embedding generation + reranking.
- llm-gateway: outbound LLM routing (Claude, GPT-4, vLLM, Axon).
- anthropic-adapter: OpenAI-SDK → Anthropic translation.
- nemo-guardrails: AI safety firewall.
- langfuse: LLM observability (latency, tokens, cost, eval).

All 11 are explicitly Application Blueprints, NOT Catalyst control plane. Catalyst's own observability stack (Grafana/OTel) covers infrastructure; LangFuse covers AI-specific dimensions (prompt/response/eval).

VALIDATION-LOG: Pass 12 entry added.

Refs #37
README.md	docs(pass-12): role-in-Catalyst banners on 11 AI/ML Application Blueprints	2026-04-27 21:47:45 +02:00


README.md

LangFuse

LLM observability and analytics. LangFuse is an Application Blueprint (see docs/PLATFORM-TECH-STACK.md §4.7) that traces every LLM call in bp-cortex, recording latency, tokens, cost, and eval scores. Catalyst's general-purpose observability stack (Grafana/OTel) covers infrastructure; LangFuse covers the AI-specific dimensions (prompt/response, model drift, eval).

Category: AI Observability | Type: Application Blueprint

Overview

LangFuse provides tracing, evaluation, and analytics for LLM applications. It captures every LLM call with cost, latency, token usage, and evaluation scores. It complements Grafana, which handles infrastructure metrics, by adding AI-specific observability.

Key Features

LLM call tracing (input, output, cost, latency, tokens)
Prompt management and versioning
Evaluation scoring and datasets
User analytics and session tracking
Cost attribution per model/user/feature

Integration

Component	Integration
LLM Gateway	Automatic trace capture
Grafana	Infrastructure metrics that complement LangFuse traces
CNPG	PostgreSQL backend for traces
NeMo Guardrails	Traces guardrail activations

Used By

OpenOva Cortex - LLM observability for enterprise AI

Deployment

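apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: langfuse
  namespace: flux-system
spec:
  interval: 10m              # how often Flux reconciles this path
  path: ./platform/langfuse
  prune: true                # delete resources that are removed from Git
  # sourceRef is required by the Flux Kustomization API.
  # Assumption: the source is a GitRepository named flux-system
  # (the Flux bootstrap default); adjust to your repository source.
  sourceRef:
    kind: GitRepository
    name: flux-system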
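
Because the Integration table lists CNPG as the PostgreSQL backend, it may be useful to reconcile LangFuse only after the database operator is ready. The sketch below shows one way to express that with Flux's `dependsOn`. It assumes a sibling Kustomization named `cnpg` in `flux-system` and a GitRepository source named `flux-system`; both names are hypothetical and do not appear in this README.

```yaml
# Sketch: the LangFuse Kustomization with explicit ordering and readiness gating.
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: langfuse
  namespace: flux-system
spec:
  interval: 10m
  path: ./platform/langfuse
  prune: true
  sourceRef:
    kind: GitRepository
    name: flux-system        # assumption: Flux bootstrap default source name
  dependsOn:
    - name: cnpg             # hypothetical Kustomization that installs CNPG
  wait: true                 # block until all applied resources report ready
  timeout: 5m                # give up on readiness after five minutes
```

With `wait: true`, anything that in turn depends on `langfuse` only proceeds once its resources are healthy, which keeps ordering explicit across the composite bp-cortex deployment.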

Part of OpenOva