# Platform Technology Stack
Technology stack for the OpenOva Kubernetes platform.
Status: Accepted | Updated: 2026-02-26
## Overview
Components are categorized as Mandatory (always installed), A La Carte (optional services), and Products (vertical solutions bundling components with custom services).
Total: 52 platform components (26 mandatory + 26 a la carte)
## Architecture Overview
```mermaid
flowchart TB
    subgraph External["External Services"]
        DNS[DNS Provider]
        Archival[Archival S3]
    end
    subgraph Region1["Region A (rtz cluster)"]
        subgraph K8s1["Kubernetes Cluster"]
            GW1[Gateway API]
            Apps1[Applications]
            Data1[Data Services]
        end
        Bao1[OpenBao]
        Harbor1[Harbor]
        MinIO1[MinIO]
        Gitea1[Gitea]
    end
    subgraph Region2["Region B (rtz cluster)"]
        subgraph K8s2["Kubernetes Cluster"]
            GW2[Gateway API]
            Apps2[Applications]
            Data2[Data Services]
        end
        Bao2[OpenBao]
        Harbor2[Harbor]
        MinIO2[MinIO]
        Gitea2[Gitea]
    end
    DNS --> GW1
    DNS --> GW2
    Harbor1 <-->|"Replicate"| Harbor2
    MinIO1 -->|"Tier to"| Archival
    Bao1 <-->|"PushSecrets"| Bao2
    Gitea1 <-->|"Bidirectional Mirror"| Gitea2
```
## Mandatory Components (26)

### Infrastructure & Provisioning

#### OpenTofu to Crossplane Handoff
OpenOva uses a two-phase provisioning model where OpenTofu bootstraps the initial infrastructure, then Crossplane takes over for all subsequent operations.
```mermaid
flowchart LR
    subgraph Phase1["Phase 1: Bootstrap (OpenTofu)"]
        TF[OpenTofu]
        VMs[VMs/Nodes]
        Net[Network]
        K8s[K8s Cluster]
    end
    subgraph Phase2["Phase 2: Day-2+ (Crossplane)"]
        CP[Crossplane]
        XR[Compositions]
        Cloud[Cloud Resources]
    end
    subgraph Deleted["After Bootstrap"]
        TFState[OpenTofu State]
    end
    TF --> VMs
    TF --> Net
    TF --> K8s
    K8s --> CP
    CP --> XR
    XR --> Cloud
    TF -.->|"Can be deleted"| TFState
```
**Phase 1 - Bootstrap (OpenTofu):**
- Provisions initial VMs/nodes
- Creates network infrastructure (VPC, subnets, firewall rules)
- Installs K3s cluster
- Installs Flux, which then installs all platform components including Crossplane
- OpenTofu's job ends here - state can be archived or deleted
**Phase 2 - Day-2 Operations (Crossplane):**
- All subsequent cloud resources managed via Kubernetes CRDs
- Continuous reconciliation (drift detection and correction)
- GitOps-native (resources defined in Git, applied by Flux)
- Self-service via Catalyst IDP templates
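As a sketch of what a Day-2 resource looks like, a cloud resource is declared as a Kubernetes object in Git and reconciled by Crossplane. The API group, kind, and field names below are illustrative assumptions; the exact schema depends on the installed hcloud provider:

```yaml
# Illustrative sketch only: declares an extra worker VM as a Crossplane
# managed resource. Group/kind/fields are assumptions, not the provider's
# verified schema.
apiVersion: server.hcloud.crossplane.io/v1alpha1
kind: Server
metadata:
  name: worker-extra-1
spec:
  forProvider:
    serverType: cx32          # illustrative instance type
    image: ubuntu-24.04
    location: fsn1
  providerConfigRef:
    name: hcloud-default      # assumed ProviderConfig name
```

Because the resource lives in Git and is applied by Flux, drift is corrected continuously without any OpenTofu state file.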
**Why This Model:**

| Aspect | OpenTofu | Crossplane |
|---|---|---|
| When | One-time bootstrap | Ongoing operations |
| State | External file (risk) | Kubernetes CRDs (native) |
| Drift | Manual detection | Continuous reconciliation |
| Access | CI/CD credentials | Kubernetes RBAC |
| Self-service | Requires pipeline | Native via CRDs |
**Key Principle:** The bootstrap wizard (OpenTofu) is designed to be safely deletable after initial provisioning. Crossplane owns all cloud resources going forward, making the platform self-sustaining without external IaC state.
### Networking & Service Mesh

### GitOps & Git

### Security

### Supply Chain Security

### Policy
| Component | Purpose | Location |
|---|---|---|
| Kyverno | Policy engine (validation, mutation, generation) | platform/kyverno |
### Scaling

### Operations
| Component | Purpose | Location |
|---|---|---|
| Reloader | Auto-restart on ConfigMap/Secret changes | platform/reloader |
### Observability

### Registry
| Component | Purpose | Location |
|---|---|---|
| Harbor | Container/artifact registry | platform/harbor |
### Storage

### Failover & Resilience

### SIEM/SOAR Architecture
```mermaid
flowchart LR
    subgraph Detection["Detection"]
        Falco[Falco eBPF]
        Trivy[Trivy Scans]
        Kyverno[Kyverno Violations]
    end
    subgraph Streaming["Event Streaming"]
        Kafka[Strimzi/Kafka]
    end
    subgraph Analytics["SIEM Analytics"]
        OS[OpenSearch Hot]
        CH[ClickHouse Cold]
    end
    subgraph Response["SOAR"]
        Specter[OpenOva Specter]
    end
    Falco -->|Falcosidekick| Kafka
    Trivy --> Kafka
    Kyverno --> Kafka
    Kafka --> OS
    OS -->|Age-out| CH
    OS --> Specter
    Specter -->|Auto-remediate| Detection
```
Falco detects runtime threats via eBPF. Events flow through Kafka to OpenSearch (hot SIEM) for correlation and alerting. Aged data moves to ClickHouse for cold storage and compliance reporting. OpenOva Specter provides SOAR capabilities for automated incident response.
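As an illustration of the ingestion step, a Falco JSON event (the `rule`, `priority`, `output`, and `output_fields` keys follow Falco's standard event format) can be flattened into a document before indexing into the hot SIEM store. The `falco.`-prefixed field naming and the `@timestamp` mapping below are assumptions for the sketch, not the platform's actual schema:

```python
import json
from datetime import datetime, timezone

def falco_event_to_siem_doc(raw: str) -> dict:
    """Flatten a Falco JSON event into a single-level document suitable
    for indexing into an OpenSearch SIEM index (schema is illustrative)."""
    event = json.loads(raw)
    # Fall back to ingestion time if the event carries no timestamp
    ts = event.get("time", datetime.now(timezone.utc).isoformat())
    return {
        "@timestamp": ts,
        "rule": event.get("rule"),
        "priority": event.get("priority"),
        "message": event.get("output"),
        # output_fields carries rule-specific context (pod, container, process)
        **{f"falco.{k}": v for k, v in event.get("output_fields", {}).items()},
    }

# Example event shaped like Falco's JSON output
raw = json.dumps({
    "time": "2026-02-26T10:00:00.000Z",
    "rule": "Terminal shell in container",
    "priority": "Notice",
    "output": "A shell was spawned in a container",
    "output_fields": {"k8s.pod.name": "web-7f9c", "proc.name": "bash"},
})
doc = falco_event_to_siem_doc(raw)
print(doc["rule"], doc["falco.proc.name"])
```

Flattening the nested `output_fields` keeps per-rule context queryable as plain fields, which is what correlation rules in the SIEM typically match on.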
## User Choice Options

### Cloud Provider
| Provider | Status | Crossplane Provider |
|---|---|---|
| Hetzner Cloud | Available | hcloud |
| Huawei Cloud | Coming | huaweicloud |
| Oracle Cloud | Coming | oci |
| AWS | Coming | aws |
| GCP | Coming | gcp |
| Azure | Coming | azure |
### Regions

| Option | Description |
|---|---|
| 1 region | Allowed — single rtz cluster, no geographic redundancy |
| 2 regions | Recommended — two symmetric rtz clusters, k8gb routes between them |
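With two symmetric clusters, cross-region routing is typically expressed as a Gslb resource reconciled in both regions. A minimal sketch, assuming k8gb's `k8gb.absa.oss/v1beta1` API with a failover strategy; the hostname, Service name, and geo tag are illustrative:

```yaml
# Illustrative k8gb Gslb sketch: Region A serves traffic, Region B takes
# over when A's endpoints become unhealthy. Names are assumptions.
apiVersion: k8gb.absa.oss/v1beta1
kind: Gslb
metadata:
  name: app-gslb
spec:
  ingress:
    rules:
      - host: app.example.com          # illustrative hostname
        http:
          paths:
            - path: /
              pathType: Prefix
              backend:
                service:
                  name: app            # illustrative Service
                  port:
                    number: 80
  strategy:
    type: failover
    primaryGeoTag: eu-region-a         # illustrative geo tag
```

k8gb then publishes region-aware DNS records via the configured DNS provider, which is what makes the LoadBalancer-free DNS routing option below work.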
### LoadBalancer

| Option | How It Works | Cost |
|---|---|---|
| Cloud Provider LB | Native LB | ~€5-10/mo |
| k8gb DNS-based LB | Gateway API + k8gb | Free |
| Cilium L2 Mode | ARP-based (same subnet) | Free |
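The Cilium L2 option maps to two resources: an IP pool that assigns addresses to LoadBalancer Services, and an announcement policy that answers ARP for them on the node network. A sketch assuming Cilium's `cilium.io/v2alpha1` CRDs; the CIDR and interface name are illustrative:

```yaml
# Illustrative Cilium L2 sketch: pool addresses must be routable on the
# nodes' subnet. CIDR and interface are assumptions for this example.
apiVersion: cilium.io/v2alpha1
kind: CiliumLoadBalancerIPPool
metadata:
  name: lb-pool
spec:
  blocks:
    - cidr: 192.168.10.0/28
---
apiVersion: cilium.io/v2alpha1
kind: CiliumL2AnnouncementPolicy
metadata:
  name: l2-announce
spec:
  interfaces:
    - eth0                    # illustrative node interface
  loadBalancerIPs: true       # announce assigned LoadBalancer IPs via ARP
```

Because this relies on ARP, it only works when clients share the nodes' L2 subnet, which is why it is listed as a same-subnet option.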
### DNS Provider

| Provider | Availability |
|---|---|
| Cloudflare | Always |
| Hetzner DNS | If Hetzner chosen |
| AWS Route53 | If AWS chosen |
| GCP Cloud DNS | If GCP chosen |
| Azure DNS | If Azure chosen |
### Archival S3 Storage

| Provider | Availability |
|---|---|
| Cloudflare R2 | Always (zero egress) |
| AWS S3 | If AWS chosen |
| GCP GCS | If GCP chosen |
| Azure Blob | If Azure chosen |
## A La Carte Data Services (26 components)

## A La Carte Communication

## A La Carte Workflow & Processing

## A La Carte Analytics
| Component | Purpose | Location |
|---|---|---|
| Iceberg | Open table format (data lakehouse) | platform/iceberg |
## A La Carte AI/ML

## A La Carte AI Safety & Observability

## A La Carte Identity & Monetization

## A La Carte Chaos Engineering
| Component | Purpose | Location |
|---|---|---|
| Litmus Chaos | Chaos engineering experiments | platform/litmus |
## Products

Products bundle a la carte components with custom services for specific verticals.

### Cortex (OpenOva Cortex - AI Hub)
Enterprise AI platform with LLM serving, RAG, AI safety, and LLM observability.
```mermaid
flowchart TB
    subgraph UI["User Interfaces"]
        LibreChat[LibreChat]
        ClaudeCode[Claude Code]
    end
    subgraph Safety["AI Safety"]
        Guardrails[NeMo Guardrails]
    end
    subgraph Gateway["Gateway Layer"]
        LLMGateway[LLM Gateway]
        Adapter[Anthropic Adapter]
    end
    subgraph Serving["Model Serving"]
        Knative[Knative]
        KServe[KServe]
        vLLM[vLLM]
    end
    subgraph Knowledge["Knowledge Layer"]
        Milvus[Milvus Vectors]
        Neo4j[Neo4j Graph]
    end
    subgraph Embeddings["Embeddings"]
        BGE[BGE-M3 + Reranker]
    end
    subgraph Observability["AI Observability"]
        LangFuse[LangFuse]
    end
    UI --> Safety
    Safety --> Gateway
    Gateway --> Serving
    Serving --> Knowledge
    Serving --> Embeddings
    Gateway --> Observability
```
#### Cortex Components

#### Cortex Resource Requirements
| Component | Replicas | CPU | Memory | GPU |
|---|---|---|---|---|
| vLLM | 1 | 4 | 32Gi | 2x A10 |
| BGE-M3 | 1 | 2 | 4Gi | 1x A10 |
| BGE-Reranker | 1 | 1 | 2Gi | 1x A10 |
| Milvus | 3 | 2 | 8Gi | - |
| Neo4j | 1 | 2 | 4Gi | - |
| LibreChat | 2 | 0.5 | 1Gi | - |
| LLM Gateway | 2 | 0.25 | 512Mi | - |
| NeMo Guardrails | 2 | 1 | 2Gi | - |
| LangFuse | 2 | 0.5 | 1Gi | - |
| **Total** | - | ~16 | ~56Gi | 4x A10 |
### Fingate (OpenOva Fingate - Open Banking)
Fintech sandbox with PSD2/FAPI compliance.
```mermaid
flowchart LR
    subgraph Gateway["API Gateway"]
        Envoy[Envoy via Cilium]
        ExtAuth[ext_authz]
    end
    subgraph Auth["Authorization"]
        Keycloak[Keycloak FAPI]
    end
    subgraph Metering["Metering"]
        OpenMeter[OpenMeter]
        Valkey[Valkey Quota]
    end
    subgraph APIs["Open Banking APIs"]
        AISP[AISP]
        PISP[PISP]
        TPP[TPP Management]
    end
    Envoy --> ExtAuth
    ExtAuth --> Keycloak
    ExtAuth --> Valkey
    Valkey --> OpenMeter
    Keycloak --> APIs
```
#### Fingate Components

### Fabric (OpenOva Fabric - Data & Integration)
Event-driven data integration and lakehouse analytics (merged from former Titan + Fuse products).
#### Fabric Components

### Relay (OpenOva Relay - Communication)
Enterprise communication platform with email, video, chat, and WebRTC.
#### Relay Components

## Cluster Deployment

### K3s Installation
```bash
curl -sfL https://get.k3s.io | sh -s - server \
  --cluster-init \
  --disable traefik \
  --disable servicelb \
  --disable local-storage \
  --flannel-backend=none \
  --disable-network-policy \
  --kube-controller-manager-arg="node-monitor-period=5s" \
  --kube-controller-manager-arg="node-monitor-grace-period=20s" \
  --kube-apiserver-arg="default-watch-cache-size=50" \
  --etcd-arg="quota-backend-bytes=1073741824" \
  --kubelet-arg="max-pods=50"
```
### Disabled K3s Components
| Component | Replacement |
|---|---|
| traefik | Gateway API (Cilium) |
| servicelb | DNS-based failover (k8gb) |
| local-storage | Application-level replication |
| flannel | Cilium CNI |
### Cilium Installation
```bash
helm install cilium cilium/cilium \
  --namespace kube-system \
  --set kubeProxyReplacement=true \
  --set k8sServiceHost=${API_SERVER_IP} \
  --set k8sServicePort=6443 \
  --set hubble.enabled=true \
  --set hubble.relay.enabled=true \
  --set encryption.enabled=true \
  --set encryption.type=wireguard \
  --set gatewayAPI.enabled=true \
  --set envoy.enabled=true
```
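With `gatewayAPI.enabled=true`, Cilium reconciles standard Gateway API resources in place of the disabled Traefik ingress. A minimal sketch using the `gateway.networking.k8s.io/v1` API; the GatewayClass name `cilium` is Cilium's usual default, and the hostname and backend Service are illustrative:

```yaml
# Illustrative sketch: one HTTP listener plus a route to an assumed
# Service named "app". Hostname and Service are examples, not fixed names.
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: public-gw
spec:
  gatewayClassName: cilium
  listeners:
    - name: http
      protocol: HTTP
      port: 80
---
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: app-route
spec:
  parentRefs:
    - name: public-gw
  hostnames:
    - app.example.com       # illustrative hostname
  rules:
    - backendRefs:
        - name: app         # illustrative backend Service
          port: 80
```

The route's hostname is also what k8gb publishes in DNS when the DNS-based LoadBalancer option is chosen.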
## Resource Estimates

### Core Platform (Per Region)
| Category | Components | Estimated RAM |
|---|---|---|
| Core Platform | Cilium, Flux, ESO, Kyverno | ~2GB |
| Observability | Grafana Stack + Alloy | ~3GB |
| Storage | Harbor, MinIO, Velero | ~4GB |
| Security | OpenBao, cert-manager, Trivy, Falco, Sigstore, Coraza | ~1.5GB |
| Git | Gitea | ~1GB |
| Operations | Reloader, Syft/Grype | ~0.5GB |
| **Minimum Total** | | ~12GB |
Recommended minimum: 3 nodes x 8GB RAM = 24GB per region
### With Cortex (Per Region)
| Category | Components | Estimated RAM | GPU |
|---|---|---|---|
| Core Platform | (as above) | ~12GB | - |
| Cortex | LLM Gateway, NeMo Guardrails, LangFuse, etc. | ~56GB | 4x A10 |
| **Total** | | ~68GB | 4x A10 |
Recommended: 3 CPU nodes + 2 GPU nodes per region
## Multi-Region Data Flow
```mermaid
flowchart TB
    subgraph Region1["Region A (rtz cluster)"]
        PG1[CNPG Write Node]
        FDB1[FerretDB]
        SK1[Strimzi/Kafka]
        VK1[Valkey Active]
        GT1[Gitea]
        MV1[Milvus Active]
        Bao1R1[OpenBao]
        Falco1[Falco]
    end
    subgraph Region2["Region B (rtz cluster)"]
        PG2[CNPG Async Replica]
        FDB2[FerretDB]
        SK2[Strimzi/Kafka MirrorMaker2]
        VK2[Valkey Replica]
        GT2[Gitea]
        MV2[Milvus Replica]
        Bao2R2[OpenBao]
        Falco2[Falco]
    end
    subgraph SIEM["Security"]
        OS[OpenSearch SIEM]
    end
    PG1 -->|"WAL Streaming"| PG2
    FDB1 -.->|"Via CNPG WAL"| FDB2
    SK1 -->|"MirrorMaker2"| SK2
    VK1 -->|"REPLICAOF"| VK2
    GT1 <-->|"Bidirectional Mirror"| GT2
    MV1 -->|"Collection Sync"| MV2
    Bao1R1 <-->|"PushSecrets"| Bao2R2
    Falco1 -->|"Falcosidekick"| OS
    Falco2 -->|"Falcosidekick"| OS
```
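The CNPG WAL-streaming link is typically declared on the Region B side as a replica cluster that bootstraps from and follows Region A. A sketch assuming CloudNativePG's `postgresql.cnpg.io/v1` API; the cluster names, endpoint, and user are illustrative:

```yaml
# Illustrative CNPG replica-cluster sketch for Region B. Names and the
# connection endpoint are assumptions for this example.
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: pg-region-b
spec:
  instances: 3
  replica:
    enabled: true               # this cluster replays WAL from Region A
    source: pg-region-a
  bootstrap:
    recovery:
      source: pg-region-a       # initial clone from the primary cluster
  externalClusters:
    - name: pg-region-a
      connectionParameters:
        host: pg-region-a.example.com   # illustrative endpoint
        user: streaming_replica
        dbname: postgres
```

Because FerretDB stores its data in the same CNPG cluster, it inherits this replication path for free, which is what the dotted "Via CNPG WAL" edge in the diagram indicates.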
Part of OpenOva