openova/platform/nemo-guardrails
e3mrah c3c9c0cf27
feat(charts): bp-vllm + bp-bge + bp-nemo-guardrails wrapper charts (#283)
Catalyst-authored umbrella charts for the W2.5.D AI-inference stack.
None of the three upstream projects publish a Helm chart, so each
chart hand-wires the upstream container as Deployment + Service +
ConfigMap + ServiceMonitor + NetworkPolicy + HPA, with the
sigstore/common library subchart declared to satisfy the
hollow-chart gate (issue #181).

bp-vllm (slot 39) — wraps vllm/vllm-openai:v0.6.4. GPU-aware
(nvidia.com/gpu when vllm.gpu.enabled=true; CPU fallback for dev).
Default model meta-llama/Llama-3.1-8B-Instruct, port 8000,
OpenAI-compatible /v1/chat/completions. All engine knobs
(maxModelLen, gpuMemoryUtilization, dtype, quantization,
tensorParallelSize, prefix-caching) overlay-tunable. Closes #266.

bp-bge (slot 42) — wraps ghcr.io/huggingface/text-embeddings-inference:cpu-1.5.
Default model BAAI/bge-small-en-v1.5 + BAAI/bge-reranker-base
sidecar in same Pod. Two-port Service (8080 embed, 8081 rerank)
annotated for bp-llm-gateway discovery. CPU-friendly defaults;
overlay swaps in BAAI/bge-m3 on GPU Sovereigns. Closes #269.

bp-nemo-guardrails (slot 43) — wraps the upstream NVIDIA/NeMo-Guardrails
Dockerfile (nemoguardrails server, FastAPI, port 8000). LLM endpoint
+ model + engine all overlay-tunable; Colang flow bundle mounts via
configMap.externalName for production rails. ConfigMap stub renders
a default rail for smoke testing. Closes #270.

All three charts:
- Default observability toggles to false per BLUEPRINT-AUTHORING.md §11.2
- Pin upstream image tags (no :latest) per INVIOLABLE-PRINCIPLES.md #4
- Non-root securityContext (runAsUser 1000, drop ALL capabilities)
- prometheus.io scrape annotations on the Pod for fallback discovery
- Operator-tunable NetworkPolicy gating ingress to bp-llm-gateway and
  egress to HuggingFace / bp-vllm / bp-bge as appropriate

helm template (default values) per chart:
  bp-vllm:            ConfigMap, Deployment, Service, ServiceAccount
  bp-bge:             ConfigMap, Deployment, Service, ServiceAccount
  bp-nemo-guardrails: ConfigMap, Deployment, Service, ServiceAccount

helm template (--set serviceMonitor.enabled=true networkPolicy.enabled=true hpa.enabled=true):
  All three render ConfigMap + Deployment + Service + ServiceAccount +
  ServiceMonitor + NetworkPolicy + HorizontalPodAutoscaler.

helm lint: 0 chart(s) failed for all three (single INFO on missing icon —
icons land with the marketplace card work).

Closes #266
Closes #269
Closes #270

Co-authored-by: hatiyildiz <hatice.yildiz@openova.io>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-30 18:37:07 +04:00
..
chart feat(charts): bp-vllm + bp-bge + bp-nemo-guardrails wrapper charts (#283) 2026-04-30 18:37:07 +04:00
blueprint.yaml feat(charts): bp-vllm + bp-bge + bp-nemo-guardrails wrapper charts (#283) 2026-04-30 18:37:07 +04:00
README.md docs(pass-12): role-in-Catalyst banners on 11 AI/ML Application Blueprints 2026-04-27 21:47:45 +02:00

NeMo Guardrails

AI safety firewall for LLM deployments. Application Blueprint (see docs/PLATFORM-TECH-STACK.md §4.7 — AI safety). Sits between user input and LLM in bp-cortex to block prompt injection, PII leakage, off-topic content, and hallucinated citations.

Category: AI Safety | Type: Application Blueprint


Overview

NeMo Guardrails provides programmable safety rails for LLM interactions, including prompt injection detection, PII filtering, hallucination detection, and topic control. Non-negotiable for regulated environments deploying AI.

Key Features

  • Prompt injection detection and blocking
  • PII filtering (input and output)
  • Hallucination detection via fact-checking rails
  • Topic boundary enforcement
  • Custom rail definitions (Colang)

Integration

Component Integration
KServe Deployed as pre/post-processing step
LLM Gateway Inline filtering for all LLM requests
LangFuse Traces guardrail activations
Grafana Guardrail metrics and alerting

Used By

  • OpenOva Cortex - AI safety for enterprise LLM deployments

Deployment

apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: nemo-guardrails
  namespace: flux-system
spec:
  interval: 10m
  path: ./platform/nemo-guardrails
  prune: true

Part of OpenOva