History

hatiyildiz 9d95043ccc docs(pass-12): role-in-Catalyst banners on 11 AI/ML Application Blueprints

All AI/ML component READMEs got banners pointing at PLATFORM-TECH-STACK §4.6 (AI/ML) or §4.7 (AI safety + observability), and noting composition under bp-cortex (composite AI Hub Blueprint):

- knative: serverless for KServe-managed inference.
- kserve: K8s-native model serving for vLLM, BGE, custom.
- vllm: default LLM inference runtime.
- milvus: vector database for RAG retrieval.
- neo4j: knowledge-graph-augmented retrieval alongside Milvus.
- librechat: default chat surface, fronts LLM Gateway via Guardrails.
- bge: embedding generation + reranking.
- llm-gateway: outbound LLM routing (Claude, GPT-4, vLLM, Axon).
- anthropic-adapter: OpenAI-SDK → Anthropic translation.
- nemo-guardrails: AI safety firewall.
- langfuse: LLM observability (latency, tokens, cost, eval).

All 11 are explicitly Application Blueprints, NOT Catalyst control plane. Catalyst's own observability stack (Grafana/OTel) covers infrastructure; LangFuse covers AI-specific dimensions (prompt/response/eval).

VALIDATION-LOG: Pass 12 entry added.

Refs #37

2026-04-27 21:47:45 +02:00
README.md	docs(pass-12): role-in-Catalyst banners on 11 AI/ML Application Blueprints	2026-04-27 21:47:45 +02:00

README.md

NeMo Guardrails

AI safety firewall for LLM deployments. An Application Blueprint (see docs/PLATFORM-TECH-STACK.md §4.7, AI safety). In bp-cortex it sits between user input and the LLM to block prompt injection, PII leakage, off-topic content, and hallucinated citations.

Category: AI Safety | Type: Application Blueprint

Overview

NeMo Guardrails provides programmable safety rails for LLM interactions: prompt injection detection, PII filtering, hallucination detection, and topic control. It is considered non-negotiable for regulated environments that deploy AI.

Key Features

- Prompt injection detection and blocking
- PII filtering (input and output)
- Hallucination detection via fact-checking rails
- Topic boundary enforcement
- Custom rail definitions (Colang)
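
As a rough illustration of how several of the features above map onto configuration, here is a minimal sketch of a NeMo Guardrails `config.yml`. It is not the configuration shipped with this Blueprint: the model entry and the entity list are assumptions chosen for the example, and the flow names are the built-in rails as I understand them from the upstream NeMo Guardrails documentation, so check them against the version you deploy.

```yaml
# config.yml (illustrative sketch, not the Blueprint's actual configuration)
models:
  - type: main
    engine: openai          # assumption: any supported engine can be used here
    model: gpt-4            # assumption: example model name only

rails:
  config:
    # PII filtering on both directions; the entity list is an example.
    sensitive_data_detection:
      input:
        entities:
          - PERSON
          - EMAIL_ADDRESS
      output:
        entities:
          - PERSON
          - EMAIL_ADDRESS

  input:
    flows:
      - self check input              # LLM-based check for injection/jailbreak attempts
      - mask sensitive data on input  # redact PII before it reaches the model

  output:
    flows:
      - self check facts               # fact-checking rail for hallucination detection
      - mask sensitive data on output  # redact PII in the response
```

The self-check rails also need matching prompt definitions (for example in a `prompts.yml`), and topic boundaries are normally expressed as Colang flows in separate `.co` files; both are left out of this sketch.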

Integration

Component	Integration
KServe	Deployed as pre/post-processing step
LLM Gateway	Inline filtering for all LLM requests
LangFuse	Traces guardrail activations
Grafana	Guardrail metrics and alerting
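
The table does not say which KServe mechanism implements the "pre/post-processing step". One plausible reading is a KServe transformer, which sits in front of the predictor and can rewrite requests and responses. The sketch below assumes that reading; the resource name and both container images are hypothetical placeholders, not artifacts published by this project.

```yaml
# Illustrative only: guardrails as a KServe transformer in front of a predictor.
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: llm-with-guardrails            # hypothetical name
spec:
  transformer:
    containers:
      - name: kserve-container
        # hypothetical image wrapping NeMo Guardrails as a pre/post-processor
        image: registry.example.com/nemo-guardrails-transformer:latest
  predictor:
    containers:
      - name: kserve-container
        # hypothetical image for the model server (for example, a vLLM runtime)
        image: registry.example.com/vllm-server:latest
```

With this layout, client requests reach the transformer first, are forwarded to the predictor only if the input rails pass, and the model output is checked again on the way back.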

Used By

OpenOva Cortex - AI safety for enterprise LLM deployments

Deployment

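# Flux Kustomization that reconciles the NeMo Guardrails manifests.
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: nemo-guardrails
  namespace: flux-system
spec:
  interval: 10m                      # reconcile every 10 minutes
  path: ./platform/nemo-guardrails   # directory inside the source repository
  prune: true                        # delete cluster objects removed from Git
  # sourceRef is required by the Kustomization API.
  # Assumed here: the GitRepository that `flux bootstrap` creates; point it at your own source.
  sourceRef:
    kind: GitRepository
    name: flux-system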
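
A Flux Kustomization resolves its `path` inside a source object referenced through `spec.sourceRef`. For readers setting this up from scratch, a minimal sketch of such a source follows. The repository URL and branch are hypothetical placeholders, and the name `flux-system` is only the default that `flux bootstrap` uses, not something this README specifies.

```yaml
# Illustrative GitRepository source that the Kustomization can reference.
apiVersion: source.toolkit.fluxcd.io/v1
kind: GitRepository
metadata:
  name: flux-system                    # assumption: default name from `flux bootstrap`
  namespace: flux-system
spec:
  interval: 10m                        # how often to poll the repository
  url: https://example.com/your-org/your-platform-repo.git   # hypothetical URL
  ref:
    branch: main                       # assumption: adjust to your branch
```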

Part of OpenOva