6f9c0b261a
13 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
f5daac52af |
refactor(platform): remove k8gb — replaced by PowerDNS lua-records (#171)
PowerDNS lua-records (`ifurlup`, `pickclosest`, `ifportup`) cover everything k8gb was doing — geo-aware response selection, health-checked failover, weighted round-robin — at the authoritative DNS layer. Eliminates a separate K8s controller, CRD set, and CoreDNS plugin from every Sovereign. Changes: - platform/k8gb/ deleted (Chart.yaml, values.yaml, blueprint.yaml never authored — only README existed) - products/catalyst/bootstrap/ui/public/component-logos/k8gb.svg deleted - componentGroups.ts: remove k8gb component (PowerDNS already there) - componentLogos.tsx: drop logo_k8gb + k8gb map entry - model.ts DEFAULT_COMPONENT_GROUPS spine: replace k8gb with powerdns - StepInfrastructure.tsx: copy refers to PowerDNS lua-records, not k8gb - provision.html: replace k8gb tile and edges with powerdns - catalog.generated.ts regenerated (now includes bp-powerdns) - docs sweep — every k8gb reference in PLATFORM-TECH-STACK, NAMING- CONVENTION, SOVEREIGN-PROVISIONING, SRE, ARCHITECTURE, GLOSSARY, COMPONENT-LOGOS, IMPLEMENTATION-STATUS, BUSINESS-STRATEGY, TECHNOLOGY-FORECAST, README, infra/hetzner/README, platform READMEs (cilium, external-dns, failover-controller, litmus, flux, opentofu) rewritten to point at PowerDNS lua-records / MULTI-REGION-DNS.md. Historical entries in VALIDATION-LOG.md preserved as audit trail. - New docs/MULTI-REGION-DNS.md — canonical reference for the lua-record patterns (ifurlup all/pickclosest/pickfirst, ifportup, pickwhashed), Application Placement → lua-record selector mapping, when to add a second Sovereign region, operational checks. Closes #171. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
7cafa3c894 |
docs(seaweedfs+guacamole): replace MinIO with SeaweedFS as unified S3 encapsulation; add Guacamole to bp-relay
Component-level architectural correction (two changes): 1. MinIO → SeaweedFS as unified S3 encapsulation layer The old design used MinIO for in-cluster S3 plus separate cold-tier configuration scattered across consumers. The new design positions SeaweedFS as the single S3 encapsulation layer: every Catalyst component talks to one endpoint (seaweedfs.storage.svc:8333). SeaweedFS internally handles hot tier (in-cluster NVMe), warm tier (in-cluster bulk), and cold tier (transparent passthrough to cloud archival storage — Cloudflare R2 / AWS S3 / Hetzner Object Storage / etc., chosen at Sovereign provisioning). One audit/lifecycle/encryption boundary instead of N. No Catalyst component talks to cloud S3 directly anymore — Velero, CNPG WAL archive, OpenSearch snapshots, Loki/Mimir/Tempo, Iceberg, Harbor blob store, Application buckets all share one S3 surface. 2. Apache Guacamole added as Application Blueprint §4.5 Communication Clientless browser-based RDP/VNC/SSH/kubectl-exec gateway. Keycloak SSO, full session recording to SeaweedFS for compliance evidence (PSD2/DORA/SOX). Composed into bp-relay. Replaces VPN+native-client distribution for auditable remote access. Component changes: - DELETED: platform/minio/ - CREATED: platform/seaweedfs/README.md (unified S3 + cold-tier encapsulation; bucket layout; multi-region replication via shared cold backend; migration-from-MinIO section) - CREATED: platform/guacamole/README.md (clientless remote-desktop gateway; GuacamoleConnection CRD; compliance integration via session recordings) Doc updates: PLATFORM-TECH-STACK §1+§3.5+§4.5+§5+§7.4; TECHNOLOGY-FORECAST L11+mandatory+a-la-carte counts (52 → 53); ARCHITECTURE §3 topology; SECURITY §4 DB engines; SOVEREIGN-PROVISIONING §1 inputs; SRE §2.5+§7; IMPLEMENTATION-STATUS §3; BLUEPRINT-AUTHORING stateful examples; BUSINESS-STRATEGY 13 component-count anchors + Relay product line; README.md backup row; CLAUDE.md folder count. Component README updates (S3 endpoint + dependency renames): cnpg, clickhouse, flink, gitea, iceberg, harbor, grafana, livekit, kserve, milvus, opensearch, flux, stalwart, velero (substantive rewrite of velero — now writes exclusively to SeaweedFS with cold-tier auto-routing). Products: relay, fabric. UI scaffold: products/catalyst/bootstrap/ui/src/shared/constants/components.ts — minio entry replaced with seaweedfs; velero+harbor deps updated; new guacamole entry added. VALIDATION-LOG entry "Pass 104 — MinIO → SeaweedFS swap + Guacamole add" captures the encapsulation principle and adds Lesson #22: storage tier policy belongs at the encapsulation boundary, not inside every consumer. Verification: zero remaining MinIO references in canonical docs (one intentional retention in TECHNOLOGY-FORECAST L37 explaining the swap); 53 platform/ folders matching all "53 components" anchors; bp-relay composition includes guacamole. |
||
|
|
0a6179dd21 |
docs(unified-repo-model): collapse SME and corporate to one shape — Application = Gitea Repo
Architectural correction. Replaces the previous "one Gitea repo per Environment with Apps as folders" rule with a single uniform shape that scales by configuration only: - Catalyst Application = one Gitea Repo (always, regardless of scale) - Branches develop/staging/main map to dev/stg/prod environments - 5 conventional Gitea Orgs per Sovereign: catalog (public mirror), catalog-sovereign (Sovereign-curated private Blueprints), one per Catalyst Organization (with shared-blueprints + N App repos), system (sovereign-admin scope) - EnvironmentPolicy CR lives in system/catalyst-config/policies/, same shape for SME and corporate; only field values differ Removes the SME-vs-corporate dual-shape design that violated the "Application is application" invariant. Teams primitive (proposed for corporate scale) is dropped — team boundaries emerge from CODEOWNERS at the App-repo level. RE-score thresholds and EnvironmentPolicy fields are universal defaults; only their values vary per Org's policy choice. Files updated line-by-line: GLOSSARY (Application + Environment definitions, new Gitea-Orgs section, 6 component-row updates), NAMING §11.2 (Realization 7-bullet rewrite), ARCHITECTURE (§1, §3 topology, §4 write-side ASCII, §7.1+§7.2+§7.3, §8 promotion, §9 multi-App linkage), PERSONAS-AND-JOURNEYS (§2 surfaces, §4.1 Ahmed, §4.2 Layla full rewrite), BLUEPRINT-AUTHORING §1 (catalog-sovereign source location), PLATFORM-TECH-STACK §2.2+§2.3, SECURITY §3, SOVEREIGN-PROVISIONING §5+§8+§10, IMPLEMENTATION-STATUS §5, SRE §14. VALIDATION-LOG entry "Pass 103 — UNIFIED REPO MODEL REFACTOR" captures the architectural correction and acknowledges the prior 102-pass audit anchored on the wrong shape (text-shape consistency was correct; the chosen text-shape was inadequate). Lesson #21 added: text-shape audits don't substitute for architectural review. Verification: zero remaining old-model assertions in canonical docs (grep clean for 'Environment Gitea repo', '/{org}/{org}-{env_type}', 'per-Environment Gitea repos', 'applications/<app>/values', etc.). |
||
|
|
feb22552ea |
docs(pass-43): SRE §2.5 Gitea replication row contradicts gitea README; keda clean
§2.5 (Data replication patterns) line 106 had: Gitea | Bidirectional
mirror + CNPG primary-replica. Direct architectural contradiction with
platform/gitea/README.md "Multi-Region Strategy" which EXPLICITLY rejects
bidirectional mirror (write-conflict semantics, EnvironmentPolicy
enforcement). The canonical pattern is intra-cluster HA replicas +
CNPG primary-replica on the mgt cluster only — DR for Gitea is via
mgt-cluster recovery, not cross-region sync.
Same drift category as Pass 7 (component READMEs active-active) and
Pass 26 (BUSINESS-STRATEGY active-active OpenBao). The "active-active
for everything stateful" mental model survived in this row.
Fixed to canonical wording with inline pointer to gitea README.
SRE.md §1-§14 deep re-scan otherwise clean. §7.1 framing nit ("All
Catalyst control-plane components" lists per-host-cluster infra too)
flagged but not fixed — reads as "Catalyst-managed" in context.
§14 Runbooks <org>/runbooks path-placeholder unambiguous in context;
optional tightening flagged.
platform/keda/README.md: clean. mimir.monitoring.svc reference is the
per-host-cluster Mimir collector (consistent with dual-categorization
Pass 38 documented), not contradicting SRE §8.1's catalyst-grafana
namespace which is per-Sovereign self-monitoring.
|
||
|
|
329a36b54d |
docs(pass-24): SRE Alertmanager webhook URL form + livekit clean
SRE.md §12 (Alertmanager configuration) webhook URLs at lines 442/451 used
`gitea.<sovereign>.<domain>/...` — the two-segment placeholder is malformed
against NAMING §5.1 which establishes Catalyst control-plane DNS as
`{component}.{location-code}.{sovereign-domain}` (e.g. `gitea.hfmp.openova.io`).
Fixed both webhook URLs to `gitea.<location-code>.<sovereign-domain>/...`.
platform/livekit/README.md: clean — banner correct, integration tables
consistent with bp-cortex voice path.
Validation log Pass 24 entry added.
|
||
|
|
15905cee6f |
docs(iter-9-12): repo structure clarity, PLATFORM-TECH-STACK reorg, SRE alignment
README + CLAUDE.md (iter 9):
- README's "Build a Blueprint" section was contradicting itself: said
"A Blueprint is a Git repo" while elsewhere we'd locked in the
monorepo decision. Rewritten: Blueprint = a folder under
platform/<name>/ or products/<name>/ in this monorepo. CI publishes
per-folder OCI artifacts.
- CLAUDE.md "Repo structure": replaced the brief tree with a more
honest one that distinguishes target structure from current
placeholders (core/apps/ is target console+projector+...; current
has only legacy bootstrap/ and manager/ .gitkeep dirs). Annotated
each products/<name>/ folder with current state (axon = real code;
others = README only; catalyst = bootstrap/ui scaffold).
- CLAUDE.md banned-terms entry "Workspace": now covers component
names too (was only Catalyst scope), matching GLOSSARY's expanded
banned-term entry.
PLATFORM-TECH-STACK (iter 10) — substantive reorganization:
The §1 categorization established three buckets:
(a) Catalyst control plane (per-Sovereign on mgt)
(b) Per-host-cluster infrastructure (every host cluster)
(c) Application Blueprints (a la carte)
But §2 "Catalyst control plane components" was mixing buckets (a)
and (b): it listed flux, crossplane, cert-manager, kyverno, harbor,
external-secrets, reloader, vpa, keda, k8gb, coraza, falco, trivy,
sigstore, syft-grype, minio, velero, failover-controller all under
"Catalyst control plane" — but those are per-host-cluster
infrastructure per §1, and §1 itself said Crossplane "Never
user-facing" / per-host-cluster.
Reorganized §2 + §3:
- §2 now contains ONLY the Catalyst control plane:
2.1 User-facing surfaces (console, marketplace, admin)
2.2 Catalyst backend services (projector, catalog-svc, provisioning,
environment-controller, blueprint-controller, billing)
2.3 Per-Sovereign supporting services (keycloak, openbao, spire-
server, nats-jetstream, gitea, observability)
- New §3 Per-host-cluster infrastructure with subsections for
networking, GitOps+IaC, security+policy, scaling+ops, storage+
registry, resilience.
- Application Blueprints renumbered §3 → §4. Added missing
opensearch row to §4.1 (was previously misplaced in observability).
- Composite Blueprints (Products) §4 → §5.
- Multi-Region §5 → §6. Resource estimates §6 → §7. Cluster
deployment §7 → §8. User choice §8 → §9. SIEM §9 → §10. License §10 → §11.
Cross-doc references to PLATFORM-TECH-STACK §1 / §2 (in NAMING,
ARCHITECTURE, IMPLEMENTATION-STATUS) all still resolve correctly
under the new numbering.
SRE (iter 11):
- §2.4 split-brain table: "MongoDB" → "FerretDB" (MongoDB was
retired in favor of FerretDB-on-CNPG per project-memory).
- §2.5 data replication: clarified each row's layer (Application
Blueprint vs per-host-cluster vs Catalyst control plane) instead
of misclassifying MinIO/Harbor as Application Blueprints. Added
OpenSearch row.
- §3.1 Flagger and §3.2 Flipt: explicitly marked "Status: design,
not yet a deployed Blueprint" since they're "components to watch"
in TECHNOLOGY-FORECAST, not in the current PLATFORM-TECH-STACK §3
inventory.
BUSINESS-STRATEGY + TECHNOLOGY-FORECAST (iter 12):
- Final scan: clean. No tenant/operator-team/Catalyst-IDP/Lifecycle
Manager/Synapse(product) violations remaining.
Refs #37
|
||
|
|
80b91709e1 |
docs(iter-3-5): purge operator-as-entity, fix Workspace-controller capital, JetStream KV references
ARCHITECTURE (iter 3): - Removed catalystctl from the §4 write-side diagram (it's read-only; presenting it as a write input contradicted §7.4). - "Both tabs read the same Valkey snapshot" → "JetStream KV snapshot" in §5 (Valkey is no longer in the control plane). - §7.4: catalystctl reframed as "may exist as small read-only debug CLI" rather than implying it ships today. - §11 dependency list: added bp-catalyst-provisioning; removed bp-catalyst-crossplane (Crossplane is per-host-cluster infra, not a Catalyst control-plane component); added clarifying note. - §12 CRD list: added SecretPolicy + Runbook (were already in IMPLEMENTATION-STATUS but missing from the principles table). - §2 SME-style description: "SaaS Operator team (Omantel staff)" → "SaaS provider's cloud team" (Operator banned as entity). NAMING-CONVENTION (iter 4): - §5.1 heading "operator domain" → "Sovereign domain". - §7 multi-region diagram: replaced piecemeal Catalyst component list with a deferral to PLATFORM-TECH-STACK §2; added SPIRE server; fixed "per-Org workspaces" → "per-Environment Gitea repos"; added per-host-cluster infrastructure callout. SECURITY (iter 6 — partial; fold into this commit): - "operator-approved" → "sovereign-admin-approved" for DR promotion. - Realm name "catalyst-operator" → "catalyst-admin" (entity-noun scrubbed from the realm naming itself). SOVEREIGN-PROVISIONING (iter 7 — partial): - "single operator's laptop" → "single person's laptop" (avoid "operator" as entity). - "the next operator" → "the next Sovereign provisioning request, regardless of who initiates it". - "catalyst-operator realm" → "catalyst-admin realm" (×2). - Capital-W "Workspace-controller" residuals (3) → "Environment- controller" (replace_all is case-sensitive; previous iter caught lowercase only). PERSONAS (iter 5): - P3 "within a Sovereign Operator team" → "within a Sovereign's operations team". - Two capital-W "Workspace-controller" residuals fixed. SRE (iter 11 — partial): - §13.2 "Workspace-controller stuck" runbook entry → "Environment-controller stuck". Banned-term sweep result post-fix: no `Operator team|role|account| user|admin` anywhere; no capital-W Workspace as Catalyst scope; no Valkey-as-control-plane refs. Refs #37 |
||
|
|
2c4902b409 |
docs(iter-1): add IMPLEMENTATION-STATUS, fix wrong-org refs, reconcile monorepo
First validation iteration. Three concrete corrections. 1. Add docs/IMPLEMENTATION-STATUS.md as the bridge between target architecture and current code state. Status legend (✅ / 🚧 / 📐 / ⏸) applied per-component. Catalyst control plane = mostly 📐. Component READMEs = 🚧 (README only, no Blueprint manifests yet). products/axon = ✅ (only product with real code). core/ = 📐 (just .gitkeep). 2. Status banner added to ARCHITECTURE, SECURITY, SOVEREIGN-PROVISIONING, BLUEPRINT-AUTHORING, PERSONAS-AND-JOURNEYS, PLATFORM-TECH-STACK, SRE pointing readers at IMPLEMENTATION-STATUS.md before they treat any described feature as built. GLOSSARY also references it. 3. Architectural decision (Option A — monorepo canonical): - Each platform/<name>/ and products/<name>/ folder is the source of ONE Blueprint, published as ghcr.io/openova-io/<name>:<semver> by CI fan-out from the monorepo root. - BLUEPRINT-AUTHORING.md §1, §2, §13 rewritten to match. - README.md "what's in this repo" rewritten to clarify monorepo + OCI-fan-out shape; no longer claims every directory is a Blueprint in a way that contradicts BLUEPRINT-AUTHORING. Wrong-org fixes (3 places): - docs/PERSONAS-AND-JOURNEYS.md:13 github.com/openova → openova-io - docs/BLUEPRINT-AUTHORING.md:13 github.com/openova → openova-io - docs/BLUEPRINT-AUTHORING.md:404 github.com/openova → openova-io - docs/BLUEPRINT-AUTHORING.md ghcr.io/openova/* (3 refs) → openova-io API group consistency: - All references unified to catalyst.openova.io/v1alpha1 (was mixed v1 / v1alpha1; v1alpha1 is correct since the CRDs are design-stage with no implementation). core/README.md updated to honestly describe the directory tree as "target structure with .gitkeep placeholders" rather than implying the apps/console, apps/projector, etc. binaries already exist. The legacy apps/bootstrap and apps/manager directories are acknowledged as transitional placeholders that will be removed when the new apps/ layout is scaffolded. CLAUDE.md and .claude/project-memory.md updated to put IMPLEMENTATION-STATUS.md second in the read-first ordering. Refs #37 |
||
|
|
4b3a6884f5 |
docs(stack,sre): align tech stack and SRE handbook with Catalyst control plane
Two related rewrites that put the control plane / application Blueprint
distinction front and center.
PLATFORM-TECH-STACK.md
- §1: explicit three-way component categorization — Catalyst control
plane (one per Sovereign), per-host-cluster infrastructure (every
cluster), Application Blueprints (inside per-Org vclusters).
- §2: Catalyst control plane components listed by responsibility —
user-facing surfaces, backend services, identity, secrets, event
spine, GitOps, networking, security, scaling, storage,
observability, resilience.
- §3: Application Blueprints (the a-la-carte catalog) — Valkey and
Strimzi explicitly callout that they are Application Blueprints,
NOT control-plane components (control plane uses NATS JetStream).
- §4: composite Blueprints (Cortex, Axon, Fingate, Fabric, Relay)
repositioned as Applications running ON Catalyst, not as parallel
products.
- §5: multi-region diagram showing independent OpenBao Raft per
region, NATS leaf nodes, Crossplane on mgt.
- §6: resource estimates updated for control plane (~12 GB +
per-Org Keycloak in SME tier).
- §10: license posture table — every control-plane component carries
a redistribution-safe license (no BSL).
SRE.md
- §2: multi-region principles updated; explicit "no stretched
clusters" applies to OpenBao, JetStream, etcd, every quorum-
based component.
- §2.5: data replication patterns now scoped to Application
Blueprints (the things a customer installs), separate from
control-plane patterns documented in SECURITY.md and
ARCHITECTURE.md.
- §4: alert-to-action mapping segmented by Catalyst control plane
vs per-product (Cortex, Fingate); new alerts: OpenBaoSealed,
JetstreamLagHigh.
- §7-§13: terminology aligned to Catalyst (console instead of IDP);
runbooks now Runbook CRD-backed; incident severities updated.
- §13.2-13.3: Catalyst-specific incidents (workspace-controller,
OpenBao seal, projector lag) plus AI Hub incidents under
bp-cortex installation.
Refs #37
|
||
|
|
54b1b4bd3d |
docs: add unified naming convention and align existing docs
- Add docs/NAMING-CONVENTION.md — canonical naming standard for all cloud resources, K8s objects, DNS, and tags across all providers. Covers dimension taxonomy (provider/region/building-block/environment), the Don't-Repeat-the-Parent principle, 4-char DNS location codes with full lookup table, multi-tenant scoping via namespace, and migration rules. - Fix SRE.md: remove primary/DR region labels; clusters are named by building block (rtz/dmz/mgt), not failover role. Both regions run symmetric rtz clusters; k8gb owns traffic distribution. - Fix PLATFORM-TECH-STACK.md: update both Mermaid diagrams and region table to use Region A / Region B (rtz cluster) language. - Fix core/README.md: Platform CRD example now references cluster context names (hz-fsn-rtz-prod / hz-hel-rtz-prod) instead of primary/standby roles. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> |
||
|
|
435f49738d |
feat: restructure platform to 52 components and 9 products
Technology forecast and strategic review restructure: - Remove 13 components (backstage, mongodb, activemq, vitess, airflow, camel, dapr, superset, searxng, langserve, trino, lago, rabbitmq) - Add 10 components (sigstore, syft-grype, nemo-guardrails, langfuse, reloader, matrix, ferretdb, litmus, livekit, coraza) - Rename product: Synapse → Axon (SaaS LLM Gateway) - Merge products: Titan + Fuse → Fabric (Data & Integration) - New product: Relay (Communication) - Replace Backstage with Catalyst IDP - Replace MongoDB with FerretDB (MongoDB wire protocol on CNPG) - Add supply chain security (Sigstore/Cosign, Syft+Grype) - Add AI safety and observability (NeMo Guardrails, LangFuse) - Add technology forecast 2027-2030 document - Full verification pass: zero stale references across all docs Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> |
||
|
|
10245dff98 |
feat: ecosystem expansion to 55 components with license compliance
- Replace BSL-licensed components with open-source alternatives: Terraform→OpenTofu (MPL 2.0), Vault→OpenBao (MPL 2.0), Redpanda→Strimzi/Kafka (Apache 2.0), n8n→Airflow (Apache 2.0) - Add 14 new platform components: activemq, camel, clickhouse, dapr, debezium, falco, flink, iceberg, opensearch, rabbitmq, superset, temporal, trino, vitess - Rename meta-platforms/ to products/ with new product names: Cortex (AI Hub), Fingate (Open Banking), Titan (Data Lakehouse), Fuse (Microservices Integration) - Update all documentation, READMEs, and cross-references Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> |
||
|
|
535710289c |
feat: create OpenOva monorepo structure
Consolidate all component repos into a single monorepo: - core/: Bootstrap + Lifecycle Manager application - platform/: Individual component blueprints organized by category - networking/ (cilium, k8gb, external-dns, stunner) - security/ (cert-manager, external-secrets, vault, kyverno, trivy) - observability/ (grafana stack) - storage/ (minio, harbor, velero) - scaling/ (keda, vpa) - failover/ (failover-controller) - gitops/ (flux, gitea) - idp/ (backstage) - data/ (cnpg, mongodb, valkey, redpanda) - communication/ (stalwart) - iac/ (terraform, crossplane) - identity/ (keycloak) - meta-platforms/: Bundled vertical solutions - ai-hub/ (enterprise AI platform) - open-banking/ (PSD2/FAPI fintech sandbox) - docs/: Platform documentation (PLATFORM-TECH-STACK.md, SRE.md) All internal links updated to use relative paths within monorepo. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> |