
OpenOva Cortex

Enterprise AI platform with LLM serving, RAG, AI safety, and LLM observability.

Status: Accepted | Updated: 2026-02-26


Overview

OpenOva Cortex bundles AI/ML infrastructure components with AI safety and LLM observability tooling into a single product for enterprise AI deployments.

flowchart TB
    subgraph UI["User Interfaces"]
        LibreChat[LibreChat<br/>Chat UI]
        ClaudeCode[Claude Code]
    end

    subgraph Safety["AI Safety"]
        Guardrails[NeMo Guardrails<br/>Safety Firewall]
    end

    subgraph Gateway["Gateway Layer"]
        LLMGateway[LLM Gateway]
        Adapter[Anthropic Adapter]
    end

    subgraph Serving["Model Serving"]
        KServe[KServe]
        vLLM[vLLM]
    end

    subgraph Knowledge["Knowledge Layer"]
        Milvus[Milvus<br/>Vectors]
        Neo4j[Neo4j<br/>Graph]
    end

    subgraph Embeddings["Embeddings"]
        BGE[BGE-M3]
        Reranker[BGE-Reranker]
    end

    subgraph Observability["AI Observability"]
        LangFuse[LangFuse]
    end

    UI --> Safety
    Safety --> Gateway
    Gateway --> Serving
    Serving --> Knowledge
    Serving --> Embeddings
    Gateway --> Observability

Components

All components are in platform/ (flat structure):

Component Purpose Location
llm-gateway Subscription-based LLM access platform/llm-gateway
anthropic-adapter Claude API translation platform/anthropic-adapter
knative Serverless platform platform/knative
kserve Model serving platform/kserve
vllm LLM inference platform/vllm
milvus Vector database platform/milvus
neo4j Graph database platform/neo4j
librechat Chat UI platform/librechat
bge Embeddings + reranking platform/bge
nemo-guardrails AI safety firewall platform/nemo-guardrails
langfuse LLM observability platform/langfuse

Architecture

┌─────────────────────────────────────────────────────────────┐
│                     User Interfaces                         │
│  ┌──────────┐  ┌──────────┐  ┌──────────┐                  │
│  │LibreChat │  │Claude    │  │  Custom  │                  │
│  │  (Chat)  │  │  Code    │  │   Apps   │                  │
│  └────┬─────┘  └────┬─────┘  └────┬─────┘                  │
└───────┼─────────────┼─────────────┼─────────────────────────┘
        │             │             │
        ▼             ▼             ▼
┌─────────────────────────────────────────────────────────────┐
│                    AI Safety Layer                           │
│  ┌─────────────────────────────────────────────────────┐    │
│  │           NeMo Guardrails                           │    │
│  │  (Prompt injection, PII filter, topic control)      │    │
│  └──────────────────────┬──────────────────────────────┘    │
└─────────────────────────┼───────────────────────────────────┘
                          │
                          ▼
┌─────────────────────────────────────────────────────────────┐
│                    Gateway Layer                            │
│  ┌─────────────────────┐  ┌─────────────────────┐          │
│  │    LLM Gateway      │  │  Anthropic Adapter  │          │
│  │ (Subscription Proxy)│  │  (API Translation)  │          │
│  └──────────┬──────────┘  └──────────┬──────────┘          │
└─────────────┼────────────────────────┼──────────────────────┘
              │                        │
              ▼                        ▼
┌─────────────────────────────────────────────────────────────┐
│                    Model Serving                            │
│  ┌─────────────────────┐  ┌─────────────────────┐          │
│  │       KServe        │  │        vLLM         │          │
│  │   (Orchestration)   │  │     (Inference)     │          │
│  └─────────────────────┘  └─────────────────────┘          │
└─────────────────────────────────────────────────────────────┘
         │              │
         ▼              ▼
┌─────────────────────────────────────────────────────────────┐
│                   Knowledge Layer                           │
│  ┌─────────────────────┐  ┌─────────────────────┐          │
│  │       Milvus        │  │       Neo4j         │          │
│  │   (Vector Store)    │  │   (Graph Store)     │          │
│  └─────────────────────┘  └─────────────────────┘          │
└─────────────────────────────────────────────────────────────┘
         │
         ▼
┌─────────────────────────────────────────────────────────────┐
│                   Embedding Layer                           │
│  ┌─────────────────────┐  ┌─────────────────────┐          │
│  │       BGE-M3        │  │    BGE-Reranker     │          │
│  │    (Embeddings)     │  │  (Cross-Encoder)    │          │
│  └─────────────────────┘  └─────────────────────┘          │
└─────────────────────────────────────────────────────────────┘

                  LangFuse (traces all LLM calls)

Agent Presets

Agent Purpose Retrieval
Deep Thinker Complex reasoning with CoT None
Quick Thinker Fast responses None
Compliance Advisor Regulatory knowledge Vector + Graph
AIOps Advisor Infrastructure docs Vector
Dev Advisor Development standards Vector
CAD Advisor Document comparison Ephemeral Vector

Deployment

Enable Cortex Product

apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: ai-hub
  namespace: flux-system
spec:
  interval: 10m
  path: ./ai-hub/deploy
  prune: true
  sourceRef:
    kind: GitRepository
    name: openova-blueprints
  postBuild:
    substitute:
      TENANT: ${TENANT}
      DOMAIN: ${DOMAIN}
      GPU_NODE_POOL: ${GPU_NODE_POOL}
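
Rather than inlining values, the postBuild variables can be sourced from a ConfigMap using Flux's substituteFrom (the ConfigMap name here is illustrative):

```yaml
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: ai-hub
  namespace: flux-system
spec:
  interval: 10m
  path: ./ai-hub/deploy
  prune: true
  sourceRef:
    kind: GitRepository
    name: openova-blueprints
  postBuild:
    substituteFrom:
      - kind: ConfigMap
        name: ai-hub-vars   # supplies TENANT, DOMAIN, GPU_NODE_POOL
```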

Configuration

Parameter Description Default
TENANT Tenant identifier Required
DOMAIN Base domain Required
GPU_NODE_POOL GPU node label Required
LLM_MODEL Default LLM qwen3-32b
EMBEDDING_MODEL Embedding model bge-m3
VECTOR_DIM Vector dimensions 1024
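
The required parameters could be captured in a ConfigMap consumed by the Kustomization's postBuild step; a sketch with illustrative values (name and values are examples, not defaults):

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: ai-hub-vars        # illustrative name
  namespace: flux-system
data:
  TENANT: acme
  DOMAIN: example.com
  GPU_NODE_POOL: gpu-a10
  LLM_MODEL: qwen3-32b     # optional override of the default
```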

Resource Requirements

Component Replicas CPU (per replica) Memory (per replica) GPU
vLLM 1 4 32Gi 2x A10
BGE-M3 1 2 4Gi 1x A10
BGE-Reranker 1 1 2Gi 1x A10
Milvus 3 2 8Gi -
Neo4j 1 2 4Gi -
LibreChat 2 0.5 1Gi -
LLM Gateway 2 0.25 512Mi -
NeMo Guardrails 2 1 2Gi -
LangFuse 2 0.5 1Gi -
Total (all replicas) - ~20 ~75Gi 4x A10

GPU Requirements

GPU Type Minimum Recommended
NVIDIA A10 2 4
NVIDIA A100 1 2
NVIDIA H100 1 1
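
GPU workloads are typically pinned to the GPU pool via a node selector and toleration. A hedged pod-spec sketch, assuming GPU_NODE_POOL maps to a node label (the label key and taint shown are illustrative, not taken from the blueprints):

```yaml
spec:
  nodeSelector:
    node-pool: gpu-a10              # value of GPU_NODE_POOL
  tolerations:
    - key: nvidia.com/gpu
      operator: Exists
      effect: NoSchedule
  containers:
    - name: vllm
      resources:
        limits:
          nvidia.com/gpu: 2         # 2x A10, matching the table above
```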

Use Cases

Claude Code with Internal Models

# Configure Claude Code
export ANTHROPIC_BASE_URL="https://llm-gateway.ai-hub.<domain>/v1"
export ANTHROPIC_API_KEY="your-subscription-token"

# Use Claude Code normally
claude "Explain this code..."
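
Under the hood, Claude Code talks to the gateway's Anthropic-compatible messages endpoint via the Anthropic Adapter. A smoke test sketched with curl, assuming the gateway accepts standard Anthropic-style headers (endpoint path and model name are assumptions):

```shell
# Illustrative values; replace <domain> and the token for your tenant
export ANTHROPIC_BASE_URL="https://llm-gateway.ai-hub.<domain>/v1"
export ANTHROPIC_API_KEY="your-subscription-token"

# Raw Anthropic-style messages call through the adapter
curl -s "$ANTHROPIC_BASE_URL/messages" \
  -H "x-api-key: $ANTHROPIC_API_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -H "content-type: application/json" \
  -d '{"model": "qwen3-32b", "max_tokens": 64,
       "messages": [{"role": "user", "content": "ping"}]}' \
  || echo "gateway unreachable (expected outside the cluster)"
```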

RAG-Powered Chat

# Access LibreChat
https://chat.ai-hub.<domain>

# Select agent preset (e.g., Compliance Advisor)
# Upload documents for context
# Ask questions with citations

Monitoring

Key Metrics

Metric Query
LLM latency vllm_request_duration_seconds
Token throughput vllm_generation_tokens_total
GPU utilization DCGM_FI_DEV_GPU_UTIL
Guardrail blocks nemo_guardrails_blocked_total
LLM cost via LangFuse dashboard
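
Assuming the latency metric is a Prometheus histogram and the token and guardrail metrics are counters (standard exporter conventions; not verified against the actual exporters), typical dashboard queries might look like:

```promql
# p95 end-to-end LLM latency over 5m
histogram_quantile(0.95, sum by (le) (rate(vllm_request_duration_seconds_bucket[5m])))

# Generated tokens per second
sum(rate(vllm_generation_tokens_total[5m]))

# Mean GPU utilization across the pool
avg(DCGM_FI_DEV_GPU_UTIL)

# Guardrail block rate
sum(rate(nemo_guardrails_blocked_total[5m]))
```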

Grafana Dashboards

Dashboard Purpose
AI Hub Overview Request rates, latencies
GPU Metrics Utilization, memory
RAG Analytics Retrieval quality, citations
AI Safety Guardrail activations, blocked prompts
LLM Cost Per-model, per-user cost tracking (LangFuse)

Operations

Health Checks

# Check all components
kubectl get pods -n ai-hub

# Check vLLM
curl http://vllm.ai-hub.svc:8000/health

# Check Milvus
kubectl exec -it milvus-proxy-0 -n ai-hub -- curl localhost:9091/healthz

Troubleshooting

Issue Cause Resolution
OOM on vLLM Model too large Increase GPU memory or use quantization
Slow retrieval Index not optimized Rebuild Milvus index
Empty responses No relevant chunks Check embedding quality
GPU not detected Driver issue Verify NVIDIA device plugin
Prompt injection Guardrails not configured Review NeMo Guardrails rules
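
For the vLLM OOM case, quantization and memory-utilization flags are the usual levers. A sketch of container args (the model repo name is illustrative and assumes pre-quantized AWQ weights; tune values for your GPUs):

```yaml
containers:
  - name: vllm
    args:
      - --model=Qwen/Qwen3-32B-AWQ     # illustrative AWQ checkpoint
      - --quantization=awq
      - --gpu-memory-utilization=0.90  # leave headroom below the OOM ceiling
      - --tensor-parallel-size=2       # shard across the 2x A10
```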

Part of OpenOva