hatiyildiz 7cafa3c894 docs(seaweedfs+guacamole): replace MinIO with SeaweedFS as unified S3 encapsulation; add Guacamole to bp-relay
Component-level architectural correction (two changes):

1. MinIO → SeaweedFS as unified S3 encapsulation layer

The old design used MinIO for in-cluster S3 plus separate cold-tier configuration scattered across consumers. The new design positions SeaweedFS as the single S3 encapsulation layer: every Catalyst component talks to one endpoint (seaweedfs.storage.svc:8333). SeaweedFS internally handles hot tier (in-cluster NVMe), warm tier (in-cluster bulk), and cold tier (transparent passthrough to cloud archival storage — Cloudflare R2 / AWS S3 / Hetzner Object Storage / etc., chosen at Sovereign provisioning). One audit/lifecycle/encryption boundary instead of N. No Catalyst component talks to cloud S3 directly anymore — Velero, CNPG WAL archive, OpenSearch snapshots, Loki/Mimir/Tempo, Iceberg, Harbor blob store, Application buckets all share one S3 surface.
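As a sketch of what the single-endpoint model looks like from a consumer's side, a Velero `BackupStorageLocation` could point at the in-cluster SeaweedFS gateway rather than at cloud S3. Bucket name and region below are illustrative, not taken from the repository:

```yaml
# Hedged sketch: Velero backed by the SeaweedFS S3 gateway.
# Velero has no native SeaweedFS provider, so the standard AWS-compatible
# provider is used with a custom s3Url. Bucket/region values are illustrative.
apiVersion: velero.io/v1
kind: BackupStorageLocation
metadata:
  name: default
  namespace: velero
spec:
  provider: aws
  objectStorage:
    bucket: velero-backups        # illustrative bucket name
  config:
    region: local                 # placeholder; SeaweedFS ignores real AWS regions
    s3ForcePathStyle: "true"      # path-style addressing for in-cluster S3
    s3Url: http://seaweedfs.storage.svc:8333
```

Cold-tier routing then happens inside SeaweedFS, so Velero itself needs no knowledge of the archival backend.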

2. Apache Guacamole added as Application Blueprint §4.5 Communication

Clientless browser-based RDP/VNC/SSH/kubectl-exec gateway. Keycloak SSO, full session recording to SeaweedFS for compliance evidence (PSD2/DORA/SOX). Composed into bp-relay. Replaces VPN+native-client distribution for auditable remote access.

Component changes:
- DELETED: platform/minio/
- CREATED: platform/seaweedfs/README.md (unified S3 + cold-tier encapsulation; bucket layout; multi-region replication via shared cold backend; migration-from-MinIO section)
- CREATED: platform/guacamole/README.md (clientless remote-desktop gateway; GuacamoleConnection CRD; compliance integration via session recordings)
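To illustrate what the `GuacamoleConnection` CRD mentioned above might look like, here is a hypothetical custom resource. The API group, version, and every field name are illustrative guesses, not taken from the actual CRD spec:

```yaml
# Hypothetical GuacamoleConnection resource — all fields are illustrative,
# not copied from the platform/guacamole CRD definition.
apiVersion: guacamole.openova.io/v1alpha1
kind: GuacamoleConnection
metadata:
  name: prod-bastion-ssh
  namespace: guacamole
spec:
  protocol: ssh                  # rdp | vnc | ssh | kubectl-exec
  host: bastion.internal         # illustrative target
  port: 22
  recording:
    enabled: true                # session recordings land in SeaweedFS
    bucket: guacamole-recordings # illustrative bucket name
```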

Doc updates: PLATFORM-TECH-STACK §1+§3.5+§4.5+§5+§7.4; TECHNOLOGY-FORECAST L11+mandatory+a-la-carte counts (52 → 53); ARCHITECTURE §3 topology; SECURITY §4 DB engines; SOVEREIGN-PROVISIONING §1 inputs; SRE §2.5+§7; IMPLEMENTATION-STATUS §3; BLUEPRINT-AUTHORING stateful examples; BUSINESS-STRATEGY 13 component-count anchors + Relay product line; README.md backup row; CLAUDE.md folder count.

Component README updates (S3 endpoint + dependency renames): cnpg, clickhouse, flink, gitea, iceberg, harbor, grafana, livekit, kserve, milvus, opensearch, flux, stalwart, velero (substantive rewrite of velero — now writes exclusively to SeaweedFS with cold-tier auto-routing). Products: relay, fabric.

UI scaffold: products/catalyst/bootstrap/ui/src/shared/constants/components.ts — minio entry replaced with seaweedfs; velero+harbor deps updated; new guacamole entry added.

VALIDATION-LOG entry "Pass 104 — MinIO → SeaweedFS swap + Guacamole add" captures the encapsulation principle and adds Lesson #22: storage tier policy belongs at the encapsulation boundary, not inside every consumer.

Verification: zero remaining MinIO references in canonical docs (one intentional retention in TECHNOLOGY-FORECAST L37 explaining the swap); 53 platform/ folders matching all "53 components" anchors; bp-relay composition includes guacamole.
2026-04-28 10:23:46 +02:00

OpenOva Fabric

Event-driven data integration and lakehouse analytics platform.

Status: Accepted | Updated: 2026-04-28


Overview

OpenOva Fabric merges data lakehouse and microservices integration into a single product. It provides event-driven data pipelines, stream processing, saga orchestration, and analytics — replacing the former separate Titan and Fuse products.

```mermaid
flowchart TB
    subgraph Sources["Data Sources"]
        CDC[Debezium CDC]
        Events[Event Producers]
    end

    subgraph Streaming["Event Streaming"]
        Kafka[Strimzi/Kafka]
    end

    subgraph Processing["Stream Processing"]
        Flink[Apache Flink]
    end

    subgraph Orchestration["Workflow Orchestration"]
        Temporal[Temporal]
    end

    subgraph Storage["Data Storage"]
        Iceberg[Apache Iceberg]
        ClickHouse[ClickHouse]
        SeaweedFS[SeaweedFS S3]
    end

    Sources --> Streaming
    Streaming --> Processing
    Streaming --> Orchestration
    Processing --> Storage
    Orchestration --> Streaming
```

Components

All components are in platform/ (flat structure):

| Component | Purpose | Location |
|-----------|---------|----------|
| strimzi | Apache Kafka event streaming | platform/strimzi |
| flink | Stream and batch processing | platform/flink |
| temporal | Saga orchestration + compensation | platform/temporal |
| debezium | Change data capture (CDC) | platform/debezium |
| iceberg | Open table format (lakehouse) | platform/iceberg |
| clickhouse | OLAP analytics database | platform/clickhouse |
| seaweedfs | Object storage (S3) | platform/seaweedfs |

Use Cases

Event-Driven Integration

Source DB → Debezium CDC → Kafka → Flink Processing → Target DB/Iceberg
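Since Debezium runs under Strimzi's Kafka Connect, the first leg of this pipeline can be sketched as a Strimzi `KafkaConnector` resource. Cluster, database, and table names below are illustrative assumptions, not values from the repository:

```yaml
# Hedged sketch: a Debezium PostgreSQL source connector managed by Strimzi.
# The Connect cluster name and all connection values are illustrative.
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaConnector
metadata:
  name: orders-cdc
  labels:
    strimzi.io/cluster: fabric-connect   # illustrative Kafka Connect cluster
spec:
  class: io.debezium.connector.postgresql.PostgresConnector
  tasksMax: 1
  config:
    database.hostname: orders-db.apps.svc   # illustrative source DB
    database.port: "5432"
    database.dbname: orders
    topic.prefix: orders                    # topics become orders.<schema>.<table>
    table.include.list: public.orders
```

From there, Flink jobs consume the resulting change topics and write to the target DB or Iceberg tables.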

Saga Orchestration

Temporal Workflow → Step 1 (Kafka) → Step 2 (Kafka) → Compensation on failure

Real-Time Analytics

Kafka → Flink → ClickHouse → Grafana Dashboards

Data Lakehouse

Kafka → Flink → Iceberg (SeaweedFS) → SQL queries via ClickHouse

Resource Requirements

| Component | Replicas | CPU | Memory |
|-----------|----------|-----|--------|
| Strimzi/Kafka | 3 | 2 | 8Gi |
| Flink JobManager | 1 | 1 | 2Gi |
| Flink TaskManager | 2 | 2 | 4Gi |
| Temporal | 3 | 1 | 2Gi |
| ClickHouse | 2 | 4 | 16Gi |
| Debezium | 1 | 0.5 | 1Gi |
| Total | - | 14.5 | 41Gi |

Deployment

```yaml
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: fabric
  namespace: flux-system
spec:
  interval: 10m
  path: ./products/fabric/deploy
  prune: true
  sourceRef:
    kind: GitRepository
    name: openova-blueprints
```

Configuration

| Parameter | Description | Default |
|-----------|-------------|---------|
| ORGANIZATION | Catalyst Organization identifier (see docs/GLOSSARY.md; formerly labelled "tenant", a banned term) | Required |
| SOVEREIGN_DOMAIN | Sovereign's base domain (e.g. omantel.openova.io) | Required |
| KAFKA_REPLICAS | Kafka broker count | 3 |
| FLINK_PARALLELISM | Flink task parallelism | 2 |
| CLICKHOUSE_SHARDS | ClickHouse shard count | 1 |
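One plausible way to feed these parameters into the Flux Kustomization shown under Deployment is Flux's `postBuild` variable substitution. The values below are illustrative, and whether Fabric actually wires its parameters this way is an assumption:

```yaml
# Hedged sketch: supplying Fabric parameters via Flux variable substitution.
# All values are illustrative; this fragment extends the Kustomization spec.
spec:
  postBuild:
    substitute:
      ORGANIZATION: acme                   # illustrative Organization identifier
      SOVEREIGN_DOMAIN: acme.openova.io    # illustrative Sovereign domain
      KAFKA_REPLICAS: "3"
      FLINK_PARALLELISM: "2"
      CLICKHOUSE_SHARDS: "1"
```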

Part of OpenOva