OpenSearch
Search engine, analytics, and hot SIEM backend. Application Blueprint (see docs/PLATFORM-TECH-STACK.md §4.1) — installed by Organizations that want SIEM, full-text search, or log analytics. Not a Catalyst control-plane component.
Status: Accepted | Updated: 2026-04-27
Overview
OpenSearch is an open-source search and analytics engine forked from Elasticsearch 7.10.2 after Elastic relicensed Elasticsearch and Kibana in 2021 from Apache 2.0 to the Server Side Public License (SSPL) and the Elastic License. OpenSearch is licensed under the Apache License 2.0. It was started by AWS and is now developed by a broader community under the OpenSearch Software Foundation (a Linux Foundation project), providing full-text search, log analytics, and security analytics (SIEM) capabilities without licensing restrictions.
In the OpenOva platform, OpenSearch serves two distinct roles. First, it provides full-text search capabilities for applications that need search-as-a-service (product search, document indexing, autocomplete). Second, and critically, it serves as the SIEM backend for runtime security events collected by Falco. OpenSearch Dashboards provides the visualization and alerting layer for both use cases.
OpenSearch is NOT a replacement for Loki in the OpenOva observability stack. Loki handles operational log aggregation from all platform components with label-based indexing optimized for cost-efficient storage. OpenSearch handles application-level search and security event correlation (SIEM) where full-text indexing and complex query capabilities are required.
Architecture
Search and SIEM
flowchart TB
subgraph Sources["Data Sources"]
Falco[Falco Runtime Security]
Apps[Application Data]
Audit[K8s Audit Logs]
end
subgraph OpenSearch["OpenSearch Cluster"]
Master1[Master Node 1]
Master2[Master Node 2]
Master3[Master Node 3]
Data1[Data Node 1]
Data2[Data Node 2]
Data3[Data Node 3]
Ingest[Ingest Node]
end
subgraph Visualization
Dashboards[OpenSearch Dashboards]
Alerting[Alerting Plugin]
end
Falco -->|"Falcosidekick"| Ingest
Apps -->|"Index API"| Ingest
Audit -->|"Filebeat"| Ingest
Ingest --> Data1
Ingest --> Data2
Ingest --> Data3
Data1 --> Dashboards
Dashboards --> Alerting
SIEM Pipeline
flowchart LR
subgraph Detection["Runtime Detection"]
Falco[Falco eBPF]
end
subgraph Routing["Event Routing"]
Sidekick[Falcosidekick]
end
subgraph Storage["SIEM Storage"]
OS[OpenSearch]
end
subgraph Analysis["Security Analysis"]
OSD[OpenSearch Dashboards]
SIEM[Security Analytics Plugin]
Alerts[Alerting + Notifications]
end
Falco --> Sidekick
Sidekick --> OS
OS --> OSD
OS --> SIEM
SIEM --> Alerts
Why OpenSearch?
| Factor | OpenSearch | Elasticsearch | Loki |
|---|---|---|---|
| License | Apache 2.0 | SSPL / Elastic License (source-available; AGPL 3.0 option added in 2024) | AGPL 3.0 |
| Full-text search | Yes | Yes | No (label-based) |
| SIEM capabilities | Security Analytics plugin | X-Pack (paid) | No |
| Application search | Yes | Yes | No |
| Log aggregation | Possible but expensive | Possible but expensive | Optimized for this |
| Storage cost | Index-heavy | Index-heavy | Label-only (cheaper) |
| Dashboards | OpenSearch Dashboards | Kibana (same licensing as Elasticsearch) | Grafana |
| API compatibility | ES 7.10 compatible | Native | LogQL |
Decision: Use OpenSearch for full-text application search and SIEM. Use Loki for operational log aggregation. They serve complementary purposes.
Key Features
| Feature | Description |
|---|---|
| Full-Text Search | BM25 scoring, analyzers, fuzzy matching, autocomplete |
| Security Analytics | SIEM plugin with detection rules, correlation, and threat intelligence |
| Index State Management | Automated index lifecycle (hot/warm/cold/delete) |
| Anomaly Detection | ML-based anomaly detection on time-series data |
| Alerting | Rule-based and anomaly-based alerting with webhook/email notifications |
| Snapshot/Restore | Automated backups to SeaweedFS/S3 |
| Cross-Cluster Search | Query across multiple OpenSearch clusters |
| Security Plugin | Fine-grained RBAC, field-level and document-level security |
| OpenSearch Dashboards | Visualization, dashboards, and notebook interface |
| Ingest Pipelines | Transform and enrich data during ingestion |
Configuration
OpenSearch Cluster (Helm)
apiVersion: helm.toolkit.fluxcd.io/v2beta1
kind: HelmRelease
metadata:
  name: opensearch
  namespace: search
spec:
  interval: 10m
  chart:
    spec:
      chart: opensearch
      version: "2.x"
      sourceRef:
        kind: HelmRepository
        name: opensearch
        namespace: flux-system
  values:
    clusterName: opensearch
    masterService: opensearch
    nodeGroup: master
    replicas: 3
    minimumMasterNodes: 2
    roles:
      - master
      - ingest
      - data
    resources:
      requests:
        cpu: 1
        memory: 4Gi
      limits:
        cpu: 4
        memory: 8Gi
    persistence:
      enabled: true
      storageClass: <storage-class>
      size: 200Gi
    config:
      opensearch.yml: |
        cluster.name: opensearch
        network.host: 0.0.0.0
        plugins.security.ssl.transport.pemcert_filepath: certs/tls.crt
        plugins.security.ssl.transport.pemkey_filepath: certs/tls.key
        plugins.security.ssl.transport.pemtrustedcas_filepath: certs/ca.crt
        plugins.security.ssl.http.enabled: true
        plugins.security.ssl.http.pemcert_filepath: certs/tls.crt
        plugins.security.ssl.http.pemkey_filepath: certs/tls.key
        plugins.security.ssl.http.pemtrustedcas_filepath: certs/ca.crt
    extraEnvs:
      - name: OPENSEARCH_INITIAL_ADMIN_PASSWORD
        valueFrom:
          secretKeyRef:
            name: opensearch-credentials
            key: admin-password
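The pairing of `replicas: 3` with `minimumMasterNodes: 2` in the values above matches the usual majority-quorum rule for master-eligible nodes. The arithmetic can be sketched as follows; `quorum` and `tolerated_failures` are hypothetical helper names used only for illustration, not part of the chart or of OpenSearch.

```python
def quorum(master_eligible: int) -> int:
    """Smallest strict majority of master-eligible nodes: floor(n / 2) + 1."""
    if master_eligible < 1:
        raise ValueError("need at least one master-eligible node")
    return master_eligible // 2 + 1


def tolerated_failures(master_eligible: int) -> int:
    """Number of master-eligible nodes that can be lost while a majority remains."""
    return master_eligible - quorum(master_eligible)


if __name__ == "__main__":
    # Print node count, quorum size, and tolerated failures for common sizes.
    for n in (1, 2, 3, 5):
        print(n, quorum(n), tolerated_failures(n))
```

Under this rule an even node count adds no fault tolerance over the next lower odd count, which is one reason three nodes is a common minimum.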
OpenSearch Dashboards
apiVersion: helm.toolkit.fluxcd.io/v2beta1
kind: HelmRelease
metadata:
  name: opensearch-dashboards
  namespace: search
spec:
  interval: 10m
  chart:
    spec:
      chart: opensearch-dashboards
      version: "2.x"
      sourceRef:
        kind: HelmRepository
        name: opensearch
        namespace: flux-system
  values:
    opensearchHosts: "https://opensearch.search.svc:9200"
    replicaCount: 2
    resources:
      requests:
        cpu: 250m
        memory: 512Mi
      limits:
        cpu: 1
        memory: 1Gi
Index State Management (ISM) Policy
{
  "policy": {
    "policy_id": "siem-lifecycle",
    "description": "SIEM index lifecycle: hot -> warm -> cold -> delete",
    "default_state": "hot",
    "states": [
      {
        "name": "hot",
        "actions": [],
        "transitions": [
          { "state_name": "warm", "conditions": { "min_index_age": "7d" } }
        ]
      },
      {
        "name": "warm",
        "actions": [
          { "read_only": {} },
          { "force_merge": { "max_num_segments": 1 } }
        ],
        "transitions": [
          { "state_name": "cold", "conditions": { "min_index_age": "30d" } }
        ]
      },
      {
        "name": "cold",
        "actions": [
          { "snapshot": { "repository": "seaweedfs-backups", "snapshot": "siem-{{ctx.index}}" } }
        ],
        "transitions": [
          { "state_name": "delete", "conditions": { "min_index_age": "90d" } }
        ]
      },
      {
        "name": "delete",
        "actions": [{ "delete": {} }]
      }
    ],
    "ism_template": [
      { "index_patterns": ["falco-*", "security-*"], "priority": 100 }
    ]
  }
}
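To make the retention timeline of the policy above easier to reason about, here is a minimal Python sketch that maps an index age to the state the policy is expected to reach. It is a simplified model: the thresholds are copied from `siem-lifecycle`, `expected_state` and `parse_days` are hypothetical helpers, and real ISM evaluates conditions periodically and waits for each state's actions to finish, so actual transitions lag behind these boundaries.

```python
import re

# Transition thresholds copied from the siem-lifecycle policy, in order.
TRANSITIONS = [("warm", "7d"), ("cold", "30d"), ("delete", "90d")]


def parse_days(age: str) -> int:
    """Parse an age such as '30d' into whole days (only the 'd' unit is handled)."""
    match = re.fullmatch(r"(\d+)d", age)
    if match is None:
        raise ValueError(f"unsupported age format: {age!r}")
    return int(match.group(1))


def expected_state(index_age_days: int) -> str:
    """Return the last state whose min_index_age threshold has been reached."""
    state = "hot"
    for name, min_age in TRANSITIONS:
        if index_age_days >= parse_days(min_age):
            state = name
    return state


if __name__ == "__main__":
    for age in (0, 7, 29, 30, 90):
        print(age, expected_state(age))
```

Read this way, a SIEM index stays searchable in the cluster for roughly 90 days, with a snapshot taken to the `seaweedfs-backups` repository once it enters the cold state at 30 days. The policy itself would typically be registered through the ISM API (`PUT _plugins/_ism/policies/siem-lifecycle`).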
Falco Integration (SIEM)
Falco runtime security events are shipped to OpenSearch via Falcosidekick:
# Falcosidekick output configuration
outputs:
  opensearch:
    hostPort: https://opensearch.search.svc:9200
    index: falco
    type: _doc
    minimumPriority: notice
    username: falco-writer
    password:
      secretKeyRef:
        name: opensearch-falco-credentials
        key: password
    createIndexTemplate: true
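The `minimumPriority: notice` setting above determines which Falco events reach OpenSearch at all. The sketch below illustrates the intended semantics using Falco's priority ordering; `passes_minimum_priority` is a hypothetical helper written for this example and is not Falcosidekick's implementation.

```python
# Falco priority levels, most severe first.
PRIORITIES = [
    "emergency", "alert", "critical", "error",
    "warning", "notice", "informational", "debug",
]
RANK = {name: position for position, name in enumerate(PRIORITIES)}


def passes_minimum_priority(event_priority: str, minimum: str = "notice") -> bool:
    """True if the event is at least as severe as the configured minimum."""
    return RANK[event_priority.lower()] <= RANK[minimum.lower()]


if __name__ == "__main__":
    for priority in ("Critical", "Notice", "Informational", "Debug"):
        print(priority, passes_minimum_priority(priority))
```

With the minimum set to `notice`, `informational` and `debug` events are dropped before indexing, which keeps the `falco-*` indices smaller at the cost of losing low-severity context during investigations.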
SIEM Detection Rules
{
  "name": "Container Escape Attempt",
  "enabled": true,
  "schedule": { "period": { "interval": 1, "unit": "MINUTES" } },
  "inputs": [
    {
      "search": {
        "indices": ["falco-*"],
        "query": {
          "bool": {
            "must": [
              { "match": { "rule": "Container Escape" } },
              { "range": { "time": { "gte": "now-5m" } } }
            ]
          }
        }
      }
    }
  ],
  "triggers": [
    {
      "name": "critical-security-alert",
      "severity": "1",
      "condition": { "script": { "source": "ctx.results[0].hits.total.value > 0" } },
      "actions": [
        {
          "name": "notify-security-team",
          "destination_id": "slack-security-channel",
          "message_template": {
            "source": "Container escape attempt detected. {{ctx.results[0].hits.total.value}} events in the last 5 minutes."
          }
        }
      ]
    }
  ]
}
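The trigger condition and message template in the rule above both read `ctx.results[0].hits.total.value` from the search response. The following Python sketch mirrors that logic against a hand-written response so the behaviour can be checked offline. `trigger_fires`, `render_message`, and the sample responses are assumptions made for illustration; the real evaluation runs inside the Alerting plugin.

```python
def trigger_fires(results: list) -> bool:
    """Python equivalent of the script: ctx.results[0].hits.total.value > 0."""
    return results[0]["hits"]["total"]["value"] > 0


def render_message(results: list) -> str:
    """Fill the message template with the hit count from the first result."""
    count = results[0]["hits"]["total"]["value"]
    return (
        "Container escape attempt detected. "
        f"{count} events in the last 5 minutes."
    )


# Hypothetical search responses shaped like an OpenSearch hits envelope.
MATCHING = [{"hits": {"total": {"value": 3, "relation": "eq"}, "hits": []}}]
EMPTY = [{"hits": {"total": {"value": 0, "relation": "eq"}, "hits": []}}]

if __name__ == "__main__":
    for results in (MATCHING, EMPTY):
        if trigger_fires(results):
            print(render_message(results))
```

Because the rule runs every minute over a five-minute window, the same events can satisfy the condition on several consecutive runs, so some form of alert throttling or deduplication is usually worth configuring on the action.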
Monitoring
| Metric | Description |
|---|---|
| opensearch_cluster_health_status | Cluster health (green/yellow/red) |
| opensearch_cluster_health_number_of_nodes | Node count |
| opensearch_indices_indexing_index_total | Total documents indexed |
| opensearch_indices_search_query_total | Total search queries |
| opensearch_jvm_mem_heap_used_percent | JVM heap usage |
| opensearch_indices_store_size_bytes | Total index storage size |
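As a rough illustration of how these metrics might be consumed, the sketch below parses a small piece of Prometheus text-format output and flags high JVM heap usage. The sample scrape, its label names, and the 85% threshold are all assumptions chosen for the example; only the metric names come from the table above.

```python
from collections import defaultdict

HEAP_METRIC = "opensearch_jvm_mem_heap_used_percent"

# Hypothetical scrape output; the label names are assumptions.
SAMPLE = """\
# TYPE opensearch_jvm_mem_heap_used_percent gauge
opensearch_jvm_mem_heap_used_percent{node="opensearch-master-0"} 62
opensearch_jvm_mem_heap_used_percent{node="opensearch-master-1"} 91
opensearch_cluster_health_number_of_nodes 3
"""


def parse_samples(text: str) -> dict:
    """Group sample values by metric name; labels are discarded, timestamps unsupported."""
    samples = defaultdict(list)
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        series, _, value = line.rpartition(" ")
        name = series.split("{", 1)[0]
        samples[name].append(float(value))
    return dict(samples)


def heap_over_threshold(text: str, threshold: float = 85.0) -> bool:
    """True if any node reports heap usage above the (assumed) threshold."""
    return any(v > threshold for v in parse_samples(text).get(HEAP_METRIC, []))


if __name__ == "__main__":
    print(heap_over_threshold(SAMPLE))
```

In practice this check would normally live in a Prometheus or Mimir alerting rule rather than in application code; the Python version is only meant to show which series matter and how they combine.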
Consequences
Positive:
- Apache 2.0 license with no proprietary feature gates
- Comprehensive SIEM capabilities when paired with Falco
- Full-text search for application use cases that Loki cannot serve
- Built-in security plugin with fine-grained access control
- Index lifecycle management automates data retention and archival
Negative:
- JVM-based, requires significant memory for indexing and search
- Full-text indexing is storage-intensive compared to label-based systems (Loki)
- Requires careful capacity planning for shard count and node sizing
- Two search/analytics systems to operate (OpenSearch + Loki) increases complexity
- Security plugin configuration can be complex for multi-tenant setups
Part of OpenOva