
LibreChat

Open-source chat UI with multi-model support and file uploads. Application Blueprint (see docs/PLATFORM-TECH-STACK.md §4.6). Default end-user chat surface in bp-cortex — fronts the LLM Gateway and routes through NeMo Guardrails for safety.

Status: Accepted | Updated: 2026-04-27


Overview

LibreChat provides a ChatGPT-like interface supporting multiple AI backends, file uploads, and customizable agent presets.

```mermaid
flowchart LR
    subgraph LibreChat["LibreChat"]
        UI[Chat UI]
        Presets[Agent Presets]
        Files[File Handling]
    end

    subgraph Backends["AI Backends"]
        OpenAI[OpenAI API]
        Custom[Custom Endpoints]
        RAG[RAG Service]
    end

    subgraph Storage["Storage"]
        FerretDB[FerretDB]
        FileStore[File Storage]
    end

    User[User] --> UI
    UI --> Presets
    UI --> Files
    Presets --> Backends
    Files --> FileStore
    UI --> FerretDB
```

Why LibreChat?

| Feature | Benefit |
|---|---|
| Multi-model | Switch between AI backends |
| Agent presets | Pre-configured assistants |
| File uploads | Document analysis |
| Conversation history | Persistent chat storage |
| SSO integration | Enterprise authentication |

Configuration

Helm Values

```yaml
librechat:
  replicas: 2

  config:
    endpoints:
      custom:
        - name: "AI Hub"
          apiKey: "${RAG_SERVICE_API_KEY}"
          baseURL: "http://rag-service.ai-hub.svc:8000/v1"
          models:
            default: ["deep-thinker", "quick-thinker", "compliance-advisor"]
          titleModel: "quick-thinker"
          dropParams: ["stop", "user"]

    registration:
      socialLogins: ["openid"]

    fileConfig:
      endpoints:
        custom:
          fileLimit: 10
          fileSizeLimit: 50  # MB
          supportedMimeTypes:
            - "application/pdf"
            - "application/vnd.openxmlformats-officedocument.wordprocessingml.document"
            - "text/plain"

ferretdb:
  enabled: true
  # FerretDB provides MongoDB wire protocol compatibility
  # backed by CNPG PostgreSQL
  auth:
    rootPassword: ""  # From ESO
  persistence:
    size: 10Gi
```
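The values above pull secrets (e.g. the FerretDB root password) from ESO rather than embedding them in values files. A minimal sketch of the corresponding ExternalSecret, assuming a ClusterSecretStore named `cluster-secrets` and a remote key path `ai-hub/ferretdb` (both hypothetical names, not prescribed by this blueprint):

```yaml
# Hypothetical ExternalSecret that materializes the FerretDB root
# password as a Kubernetes Secret; store name and key path are
# assumptions -- adjust to the actual ESO setup.
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: ferretdb-auth
  namespace: ai-hub
spec:
  refreshInterval: 1h
  secretStoreRef:
    kind: ClusterSecretStore
    name: cluster-secrets
  target:
    name: ferretdb-auth  # Secret consumed by the chart's auth section
  data:
    - secretKey: rootPassword
      remoteRef:
        key: ai-hub/ferretdb
        property: rootPassword
```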

Agent Presets

Deep Thinker

```json
{
  "name": "Deep Thinker",
  "model": "deep-thinker",
  "description": "Complex reasoning with visible chain-of-thought",
  "systemPrompt": "You are a thoughtful analyst. Think step by step and show your reasoning.",
  "temperature": 0.7,
  "maxTokens": 8192
}
```

Quick Thinker

```json
{
  "name": "Quick Thinker",
  "model": "quick-thinker",
  "description": "Fast responses for simple queries",
  "systemPrompt": "You are a helpful assistant. Be concise and direct.",
  "temperature": 0.3,
  "maxTokens": 2048
}
```

Compliance Advisor

```json
{
  "name": "Compliance Advisor",
  "model": "compliance-advisor",
  "description": "Regulatory knowledge with citations",
  "systemPrompt": "You are a compliance expert. Always cite your sources with document references.",
  "temperature": 0.1,
  "maxTokens": 4096
}
```

SSO Configuration

Azure AD OIDC

```yaml
socialLogins:
  - openid

openidConfig:
  issuer: "https://login.microsoftonline.com/${TENANT_ID}/v2.0"
  clientId: "${CLIENT_ID}"
  clientSecret: "${CLIENT_SECRET}"
  scope: ["openid", "profile", "email"]
  callbackURL: "https://chat.ai-hub.<domain>/oauth/openid/callback"
```

Keycloak

```yaml
openidConfig:
  issuer: "https://keycloak.<domain>/realms/ai-hub"
  clientId: "librechat"
  clientSecret: ""  # From ESO
  scope: ["openid", "profile", "email"]
```

File Upload Flow

```mermaid
sequenceDiagram
    participant User
    participant LibreChat
    participant RAG as RAG Service
    participant Milvus

    User->>LibreChat: Upload PDF
    LibreChat->>RAG: POST /ingest/file
    RAG->>RAG: Parse & chunk
    RAG->>Milvus: Store vectors (ephemeral)
    RAG-->>LibreChat: file_id

    User->>LibreChat: Ask question about file
    LibreChat->>RAG: Query with file_id context
    RAG->>Milvus: Search ephemeral partition
    RAG-->>LibreChat: Response with citations
```

Environment Variables

| Variable | Purpose |
|---|---|
| MONGO_URI | FerretDB connection string (MongoDB wire protocol) |
| OPENID_CLIENT_ID | SSO client ID |
| OPENID_CLIENT_SECRET | SSO client secret |
| CREDS_KEY | Encryption key for credentials |
| CREDS_IV | Encryption IV |
| JWT_SECRET | JWT signing secret |
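These variables are typically injected from Kubernetes Secrets rather than set inline in the pod spec. A sketch of the container env wiring, assuming Secrets named `librechat-creds` and `librechat-oidc` (hypothetical names and keys, for illustration only):

```yaml
# Hypothetical env wiring for the LibreChat container; Secret names
# and keys are illustrative, not prescribed by this blueprint.
env:
  - name: MONGO_URI
    valueFrom:
      secretKeyRef:
        name: librechat-creds
        key: mongoUri            # FerretDB connection string
  - name: OPENID_CLIENT_ID
    valueFrom:
      secretKeyRef:
        name: librechat-oidc
        key: clientId
  - name: OPENID_CLIENT_SECRET
    valueFrom:
      secretKeyRef:
        name: librechat-oidc
        key: clientSecret
  - name: CREDS_KEY
    valueFrom:
      secretKeyRef:
        name: librechat-creds
        key: credsKey
  - name: JWT_SECRET
    valueFrom:
      secretKeyRef:
        name: librechat-creds
        key: jwtSecret
```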

Custom Endpoints

```yaml
endpoints:
  custom:
    - name: "RAG Service"
      baseURL: "http://rag-service.ai-hub.svc:8000/v1"
      apiKey: "${API_KEY}"
      models:
        default:
          - deep-thinker
          - quick-thinker
          - compliance-advisor
          - aiops-advisor
          - dev-advisor
          - internet-search
```
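Since LibreChat fronts the LLM Gateway in bp-cortex, a second custom endpoint can sit alongside the RAG Service entry and route chat traffic to the gateway. A sketch, with the service address, port, and model names all being assumptions:

```yaml
# Hypothetical additional endpoint routing to the LLM Gateway;
# service name, port, and model identifiers are illustrative.
endpoints:
  custom:
    - name: "LLM Gateway"
      baseURL: "http://llm-gateway.ai-hub.svc:8080/v1"
      apiKey: "${LLM_GATEWAY_API_KEY}"
      models:
        default:
          - claude
          - gpt-4
```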

Monitoring

| Metric | Description |
|---|---|
| Active users | Concurrent chat sessions |
| Message count | Total messages sent |
| File uploads | Documents processed |
| Response time | Backend latency |

Consequences

Positive:

  • ChatGPT-like experience
  • Multi-model switching
  • File upload support
  • Enterprise SSO
  • Customizable presets

Negative:

  • Requires FerretDB (MongoDB wire protocol on CNPG)
  • Complex configuration
  • UI customization limited

Part of OpenOva