core/README.md "User journeys" table had: "Sovereign bootstrap | Phase 0
done by catalyst-provisioner; this codebase contains the OpenTofu modules
under apps/provisioning/opentofu/..." — conflating two distinct services.
Per SOVEREIGN-PROVISIONING.md §2, catalyst-provisioner is a separate
Blueprint (bp-catalyst-provisioner) — explicitly "not part of any
Sovereign at runtime" — and lives outside core/. The core/apps/provisioning/
service is for runtime Application provisioning (validate configSchema,
compose manifests, commit to Environment's Gitea repo), an entirely
different concern from Phase 0 Sovereign bootstrap. Rewritten to call out
the separation.
platform/neo4j/README.md: clean.
Recurring shorthand note: ws.<env>.> JetStream subjects in core/README +
ARCHITECTURE (5 instances) treated as documented shorthand — precise form
per NAMING §11.2 is ws.{org}-{env_type}.>. Tightening deferred.
Validation log Pass 30 entry added.
The recurring drift: Catalyst control-plane DNS placeholders that omit the
<location-code> segment, producing forms like gitea.<sovereign>,
gitea.<sovereign>.<domain>, gitea.<sovereign-domain>, keycloak.<domain>.
Per NAMING §5.1 the canonical form is
{component}.{location-code}.{sovereign-domain} (e.g. gitea.hfmp.openova.io).
The shorter forms aren't just abbreviations: they collapse the multi-region
location dimension, and the drift recurs every time a reader treats them as
obvious shorthand.
Fixes:
- CLAUDE.md "Customer Sync" — both gitea.<sovereign>/catalog/... lines.
- docs/SOVEREIGN-PROVISIONING.md §3 DNS-records bullet (3 lines) + §5
Day-1 login line.
- docs/ARCHITECTURE.md §4 write-path Gitea label.
- docs/BLUEPRINT-AUTHORING.md §6.4 private-Blueprint Studio target.
- platform/librechat/README.md Keycloak issuer (Pass 22 marked clean and
missed this — banner scans miss YAML-block drift).
platform/nemo-guardrails/README.md verified clean.
Final grep confirms only canonical forms remain. Validation log Pass 29
entry added with the recurring-drift-pattern note for future passes.
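A check along these lines could flag the same pattern in future passes. It
is a minimal sketch, not a script that exists in the repo: the function
name and the two-entry component list are assumptions; only the canonical
form comes from NAMING §5.1. It inspects `<placeholder>` hostnames only,
not concrete ones such as gitea.hfmp.openova.io.

```python
import re

# Canonical control-plane form per NAMING §5.1:
#   {component}.{location-code}.{sovereign-domain}
# COMPONENTS is a hypothetical sample, not the full control-plane list.
COMPONENTS = ("gitea", "keycloak")
CANONICAL_TAIL = ".<location-code>.<sovereign-domain>"

PLACEHOLDER_HOST = re.compile(
    r"\b(?P<component>" + "|".join(COMPONENTS) + r")"
    r"(?P<tail>(?:\.<[a-z-]+>)+)"
)


def find_drifted_hosts(text: str) -> list[str]:
    """Return placeholder hostnames whose segments differ from the canonical tail."""
    return [
        match.group(0)
        for match in PLACEHOLDER_HOST.finditer(text)
        if match.group("tail") != CANONICAL_TAIL
    ]
```

Run over a README body, this returns forms such as `gitea.<sovereign>` or
`keycloak.<domain>` and stays silent on
`gitea.<location-code>.<sovereign-domain>`.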
opensearch was listed under "Mandatory Components" but per PLATFORM-TECH-STACK
§4.4 + §10 it is an Application Blueprint — customers install it (alongside
ClickHouse + bp-specter) only when they want the SIEM pipeline. Conversely
keycloak was under "A La Carte Components" but §2.1 places it inside the
Catalyst control plane (per-Org realms in SME, per-Sovereign realm in
corporate — present on every Sovereign).
Swapped the two entries and added a classification-basis banner above the
Mandatory section explicitly pointing at PLATFORM-TECH-STACK §2/§3/§4 so the
forecast's Mandatory/A-la-carte axis lines up with the architectural
categorization in canonical docs.
platform/milvus/README.md: clean.
Validation log Pass 27 entry added.
§8.4 (CISO value prop) still described "OpenBao per-cluster with ESO PushSecrets
for cross-cluster secret sync" — the active-active model SECURITY §5 rejected
and Pass 7 corrected in component READMEs. Replaced with per-region independent
Raft + async Performance Replication; ESO scoped to in-region. Added the SPIFFE/
SPIRE 5-minute SVID line that fits the CISO frame.
§5.1 (Product Family) had two entries — "OpenOva (the core platform)" and
"OpenOva Catalyst (the platform)" — describing the same thing under two names.
Per GLOSSARY: OpenOva is the company, Catalyst is the platform. Removed the
duplicate "OpenOva" row, expanded the Catalyst row to absorb its content, and
added a Company/Platform/Sovereign vocabulary banner above the table.
§5.2 (Architecture Relationship diagram) had OPENOVA at the top as the platform.
Replaced with CATALYST + a footer clarifying each child is a composite Blueprint.
platform/matrix/README.md: clean.
Validation log Pass 26 entry added.
platform/llm-gateway/README.md had three malformed DNS placeholders:
- KEYCLOAK_URL collapsed location-code + sovereign-domain into <domain> and
used Application namespace `ai-hub` as a Keycloak realm name. Per NAMING §7
and SECURITY §7, Keycloak realms are per-Org in SME-style or per-Sovereign
in corporate-style — never per-Application-namespace. Fixed to
`keycloak.<location-code>.<sovereign-domain>/realms/<org>`.
- ANTHROPIC_BASE_URL and `claude config set api_base` examples used
`llm-gateway.ai-hub.<domain>/v1` — but NAMING §5.2 establishes
Application endpoints as `{app}.{environment}.{sovereign-domain}`.
Fixed to `llm-gateway.<env>.<sovereign-domain>/v1`.
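The two corrected shapes can be written as small builders, which makes the
difference between a control-plane host and an Application endpoint
explicit. This is an illustrative sketch only: both function names, the
`https` scheme, and the sample values in the usage note are assumptions;
the segment order follows NAMING §5.1 and §5.2 as quoted above.

```python
def keycloak_realm_url(location_code: str, sovereign_domain: str, realm: str) -> str:
    """Control-plane form: keycloak.{location-code}.{sovereign-domain}/realms/{realm}.

    `realm` is the Org in SME-style Sovereigns; it is never an
    Application namespace.
    """
    return f"https://keycloak.{location_code}.{sovereign_domain}/realms/{realm}"


def application_endpoint(app: str, environment: str, sovereign_domain: str,
                         path: str = "") -> str:
    """Application form: {app}.{environment}.{sovereign-domain}, plus an optional path."""
    return f"https://{app}.{environment}.{sovereign_domain}{path}"
```

For example, `application_endpoint("llm-gateway", "muscatpharmacy-prod",
"example.com", "/v1")` gives a value of the shape used for
ANTHROPIC_BASE_URL; `example.com` stands in for a real sovereign domain.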
docs/IMPLEMENTATION-STATUS.md confirmed clean: CRD list, surfaces, and
control-plane component list all match canonical docs.
Sweep concern logged for `harbor.<domain>` / `:latest` image patterns
appearing across many platform READMEs; to be addressed in a dedicated
sweep pass rather than piecemeal in this one.
Validation log Pass 25 entry added.
SRE.md §12 (Alertmanager configuration) webhook URLs at lines 442/451 used
`gitea.<sovereign>.<domain>/...` — the two-segment placeholder is malformed
against NAMING §5.1 which establishes Catalyst control-plane DNS as
`{component}.{location-code}.{sovereign-domain}` (e.g. `gitea.hfmp.openova.io`).
Fixed both webhook URLs to `gitea.<location-code>.<sovereign-domain>/...`.
platform/livekit/README.md: clean — banner correct, integration tables
consistent with bp-cortex voice path.
Validation log Pass 24 entry added.
Pass 23 — drift-detection on PLATFORM-TECH-STACK §6-§11 (less-
scrutinized in earlier passes) + platform/litmus.
§7.1 Resource estimates:
- Crossplane was listed under "Catalyst control plane" — but
Crossplane is per-host-cluster infrastructure per §3.2. Same
categorization slip pattern as the §3 topology fix in Pass 6.
- Split into:
* §7.1 (Catalyst-specific only): added the missing SPIRE server
row; subtotal corrected to ~11.3 GB. Removed Crossplane.
* New §7.4 (Per-host-cluster overhead): explicit breakdown for
Cilium / Flux / Crossplane / cert-manager / ESO / Kyverno /
Trivy / Falco / Harbor / MinIO / Velero / small operators.
Subtotal ~8.8 GB per host cluster.
- §7.2 heading renamed "Per-Organization vcluster (workload
regions)" for clarity.
§10 SIEM/SOAR:
- "This pipeline is itself a composite Blueprint (bp-siem)" — but
bp-siem doesn't exist in §5's composite Blueprint inventory.
The SIEM pipeline is a COMPOSITION of existing Application
Blueprints (Strimzi + OpenSearch + ClickHouse + bp-specter on
top of per-host-cluster Falco/Trivy/Kyverno), not a single
packaged composite.
- Reworded to make the actual composition explicit. Audit-log
fallback now correctly points at the Grafana stack
(per-Sovereign observability) rather than implying SIEM is
required for any audit retention.
platform/litmus/README.md: clean. Banner correct, integration
table consistent (Grafana, Kyverno, Gitea Actions, failover-
controller integrations all match the agreed model).
VALIDATION-LOG: Pass 23 entry added.
Refs #37
Pass 22 — drift-detection on PERSONAS-AND-JOURNEYS + platform/librechat.
One real fix.
PERSONAS-AND-JOURNEYS.md §6.3 Environment view example:
- "Environment: bankdhofar-corp-banking-prod" — three-segment form
implying Sovereign-Org-EnvType. But NAMING-CONVENTION §11.1
establishes `{org}-{env_type}` — the Sovereign name is NOT in
the Environment name. The Sovereign is determined by which
Catalyst console you're logged into.
- This same doc's §4.2 (Layla narrative) explicitly says
"Their internal Organizations are `core-banking`, `digital-
channels`, `analytics`, `corporate-it`" — so the Org is
`core-banking`, and the Environment in that Org for production
is `core-banking-prod`.
- Fixed example to `core-banking-prod`.
platform/librechat/README.md: clean. The example
`namespace: ai-hub` is a customer-chosen Application namespace
(illustrative; the actual namespace would be the Cortex Application
name, customer-chosen).
VALIDATION-LOG: Pass 22 entry added.
Refs #37
Pass 21 — drift-detection on BLUEPRINT-AUTHORING + platform/langfuse.
One real fix.
BLUEPRINT-AUTHORING.md §11 (CI pipeline):
- Old version showed `on: push # branch: main # tags: vX.Y.Z` — the
per-Blueprint-repo CI shape that was explicitly rejected when we
locked Option A (monorepo canonical) in Pass 1.
- §2 already establishes monorepo + path-matrix tag form
`platform/<name>/v1.2.3` / `products/<name>/v1.2.3`. §11 should
have matched §2 from the start; this slipped through previous
passes.
- Rewrote §11: single root-level CI, on.pull_request.paths triggers
validate, on.push.tags: platform/*/v* | products/*/v* triggers
build-and-sign with tag-parse → folder-detect → fan-out publish.
Includes worked example: tagging `platform/wordpress/v1.3.0`
builds `platform/wordpress/` and publishes
ghcr.io/openova-io/bp-wordpress:1.3.0.
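The tag-parse and folder-detect steps can be sketched as below. This is an
illustration of the mapping, not the CI implementation: the function name
and the allowed characters in `<name>` are assumptions, and it assumes
products/ folders publish under the same `bp-` prefix as platform/ ones.

```python
import re

# Tag form from §2: platform/<name>/v1.2.3 or products/<name>/v1.2.3.
# The character class for <name> is an assumption.
TAG_PATTERN = re.compile(
    r"^(?P<root>platform|products)/(?P<name>[a-z0-9-]+)/v(?P<version>\d+\.\d+\.\d+)$"
)


def resolve_release(tag: str) -> tuple[str, str]:
    """Map a release tag to (folder to build, OCI reference to publish)."""
    match = TAG_PATTERN.match(tag)
    if match is None:
        raise ValueError(f"not a Blueprint release tag: {tag!r}")
    folder = f"{match['root']}/{match['name']}/"
    artifact = f"ghcr.io/openova-io/bp-{match['name']}:{match['version']}"
    return folder, artifact
```

With the worked example, `resolve_release("platform/wordpress/v1.3.0")`
returns `("platform/wordpress/", "ghcr.io/openova-io/bp-wordpress:1.3.0")`;
a bare `v1.3.0` tag raises `ValueError`, matching the rejection of the
per-Blueprint-repo tag shape.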
platform/langfuse/README.md: clean. Banner correct. "Used by:
OpenOva Cortex" is acceptable commercial phrasing alongside the
technical bp-cortex reference.
VALIDATION-LOG: Pass 21 entry added.
Refs #37
Pass 20 — drift-detection on SOVEREIGN-PROVISIONING + platform/kyverno.
Two real findings.
SOVEREIGN-PROVISIONING.md §8:
- "Existing Applications with `placement: active-active: false,
single-region` do not migrate automatically" — invalid YAML
mixing a boolean with an enum. The canonical placement model
(per GLOSSARY) has `placement.mode: single-region | active-
active | active-hotstandby`, no boolean toggle.
- Rewrote: "Existing Applications with `placement.mode: single-
region` ... user explicitly switches Placement to active-active
(or active-hotstandby) and adds the new region to
placement.regions".
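A schema-level check of this kind is what grep-based passes miss. The
sketch below validates `placement.mode` against the three values named
above. The function name and the dict shape are assumptions, and the rule
that multi-region modes need at least two regions is an assumption of this
sketch, not a documented requirement.

```python
from enum import Enum


class PlacementMode(Enum):
    SINGLE_REGION = "single-region"
    ACTIVE_ACTIVE = "active-active"
    ACTIVE_HOTSTANDBY = "active-hotstandby"


def validate_placement(placement: dict) -> PlacementMode:
    """Accept only the enum form of placement.mode; reject boolean toggles."""
    mode = placement.get("mode")
    if not isinstance(mode, str):
        raise ValueError("placement.mode must be a string enum value")
    try:
        parsed = PlacementMode(mode)
    except ValueError:
        raise ValueError(f"unknown placement.mode: {mode!r}") from None
    regions = placement.get("regions", [])
    # Assumption: a multi-region mode lists at least two regions.
    if parsed is not PlacementMode.SINGLE_REGION and len(regions) < 2:
        raise ValueError(f"{mode} needs at least two entries in placement.regions")
    return parsed
```

The old `active-active: false` form fails here because it carries no
`mode` key at all.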
platform/kyverno/README.md:
- Policy V5 (minimum-replicas-production) targeted namespaces
labeled `openova.io/env: production` — out-of-spec label name
AND value. NAMING-CONVENTION §6 establishes `openova.io/env-type:
prod` (hyphen-form, short value).
- Fixed to `openova.io/env-type: prod`.
Both findings show the same pattern: schema-level details that
survive grep-based banned-term checks but contradict the canonical
spec when read in body.
VALIDATION-LOG: Pass 20 entry added.
Refs #37
Pass 18 — drift-detection on NAMING-CONVENTION + platform/keycloak.
Two real findings.
NAMING-CONVENTION §11.1:
- The example list of Catalyst Environments included `bankdhofar-dr`
— but `dr` is NOT a valid env_type. Canonical values per §2.4 are
prod / stg / uat / dev / poc. DR is a Placement mode
(active-active / active-hotstandby across regions inside the
*-prod Environment), not a separate Environment.
- Replaced `bankdhofar-dr` with `bankdhofar-uat` and added an
explicit "DR is a Placement, not an Env Type" note.
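The env_type rule is easy to state as code. A minimal sketch, assuming the
five values quoted from §2.4 are the complete set; the function name is
hypothetical.

```python
# Canonical env_type values per NAMING-CONVENTION §2.4.
ENV_TYPES = ("prod", "stg", "uat", "dev", "poc")


def split_environment_name(name: str) -> tuple[str, str]:
    """Split an Environment name of the form {org}-{env_type}.

    The Org slug may itself contain hyphens, so the split is on the
    last hyphen only.
    """
    org, sep, env_type = name.rpartition("-")
    if not sep or not org or env_type not in ENV_TYPES:
        raise ValueError(f"not a valid {{org}}-{{env_type}} name: {name!r}")
    return org, env_type
```

`bankdhofar-dr` is rejected because `dr` is not an env_type, while
`core-banking-prod` splits into `core-banking` and `prod`. The check cannot
detect a Sovereign name wrongly prefixed to the Org slug (the Pass 22
case), since hyphens are legal inside the slug.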
platform/keycloak/README.md:
- Keycloak Deployment YAML example used `namespace: open-banking`
with 2 replicas — Fingate-specific narrative that contradicted
the per-Org / per-Sovereign topology stated in the banner.
Rewrote with two side-by-side examples:
* shared-sovereign (3 HA replicas, catalyst-keycloak namespace,
CNPG-backed)
* per-organization (1 replica in <org> namespace, optional
embedded DB for smallest SME tier)
- HA section was a single set of claims (2+ replicas, CNPG, Infinispan)
that only matched corporate. Now branches on topology — corporate
gets HA + Infinispan, SME gets single replica with restart-on-
deploy as acceptable for tier SLAs.
Same kind of drift Pass 17 caught in Harbor: banner says one thing,
body still describes the older model. Both fixed.
VALIDATION-LOG: Pass 18 entry added.
Refs #37
Pass 17 — drift-detection sweep on ARCHITECTURE + harbor. Two real
findings.
ARCHITECTURE §13 (OAM table):
- `| Trait | Blueprint overlay (`overlays/small|medium|large`) |`
has pipe chars inside backticks inside a Markdown table cell —
a known GFM rendering hazard. Replaced with comma-separated
examples.
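The hazard can be detected mechanically. The sketch below is a
simplification under stated assumptions: it treats any line starting with
`|` as a table row (GFM also allows rows without a leading pipe) and only
looks at single-backtick code spans; the function name is hypothetical.

```python
import re

INLINE_CODE = re.compile(r"`[^`]*`")
UNESCAPED_PIPE = re.compile(r"(?<!\\)\|")


def rows_with_piped_code(markdown: str) -> list[int]:
    """Return 1-based line numbers of table rows with an unescaped | inside backticks."""
    hits = []
    for number, line in enumerate(markdown.splitlines(), start=1):
        if not line.lstrip().startswith("|"):
            continue  # not a table row under this sketch's simplification
        if any(UNESCAPED_PIPE.search(span) for span in INLINE_CODE.findall(line)):
            hits.append(number)
    return hits
```

The original `overlays/small|medium|large` cell is reported; the
comma-separated replacement is not.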
platform/harbor/README.md:
- The banner added in Pass 9 said "every host cluster runs a
Harbor instance" but the body still described an older
"Harbor Primary / Harbor Replica" cross-region replication
topology. Same shape of architectural drift Pass 7 caught in
OpenBao/ESO/Gitea/Flux — banner-add doesn't rewrite the body.
- Three sections rewritten:
* Overview mermaid: now shows upstream-OCI → multiple
independent per-cluster Harbors with local Trivy scan + local
Pod pulls.
* "Multi-Region Replication" → "Per-host-cluster mirroring (NOT
primary-replica)". Single source of truth = upstream OCI
(ghcr.io/openova-io/* for Catalyst+Blueprints, customer CI for
application images), not a "primary Harbor".
* Example replication policy: was a `dest_registry` cross-region
push policy → now a pull-mirror policy from ghcr.io with
scheduled-cron trigger.
- "Why Mandatory" table reframed in per-host-cluster terms.
VALIDATION-LOG: Pass 17 entry added with the specific drift-detection
lesson — banner-addition passes don't catch body-level drift; need
explicit body re-reads.
Refs #37
Pass 15 swept all 52 platform/*/README.md files for the role-in-
Catalyst banner. 3 still lacked one (cnpg, flux, strimzi) and got
banners added:
- cnpg (§4.1): production Postgres; underlying engine for FerretDB +
Gitea metadata.
- flux (§3.2): per-vcluster Flux + host-level Flux for Catalyst
itself; pulls from single per-Sovereign Gitea.
- strimzi (§4.1): Application-tier event streaming; NOT the Catalyst
control-plane spine (which uses NATS JetStream). Same upstream-
tech-different-tier disambiguation pattern as Valkey.
CONVERGENCE: 52 / 52 platform components have role-in-Catalyst
banners. All cross-refs resolve. No banned terms. No architectural
drift detected on this pass.
VALIDATION-LOG: Pass 15 entry + "Convergence achieved (initial
banner sweep)" marker added. The validation loop continues per
the standing instruction — but subsequent passes will be brief
drift-detection sweeps rather than systematic rewrites.
Refs #37
Seven more Application Blueprint banners landed:
- temporal (§4.3): durable workflow orchestration; bp-fabric.
- flink (§4.3): stream + batch processing; bp-fabric.
- debezium (§4.2): CDC into Strimzi/Kafka; bp-fabric pipeline source.
- iceberg (§4.4): open table format on MinIO + archival S3.
- openmeter (§4.8): API metering for bp-fingate.
- litmus (§4.9): chaos engineering required by DORA / NIS2.
- valkey (§4.1): banner explicitly states NOT a Catalyst control-
plane component — control plane uses NATS JetStream KV per
ARCHITECTURE §5 / GLOSSARY event-spine. Valkey is Application-tier
caching only. This is the disambiguation that PLATFORM-TECH-STACK
§1 establishes ("same upstream technology can serve in multiple
categories") — pinned in the per-component README so it can't be
misread.
VALIDATION-LOG: Pass 14 entry added.
Refs #37
All 4 communication components (composing under bp-relay) got role-
in-Catalyst banners pointing at PLATFORM-TECH-STACK §4.5:
- stalwart: JMAP/IMAP/SMTP self-hosted email.
- livekit: WebRTC SFU for video/audio/data; pairs with STUNner.
- stunner: K8s-native TURN/STUN for WebRTC NAT traversal.
- matrix: Matrix protocol via Synapse server. Banner explicitly
disambiguates "Synapse" as the chat-server implementation, NOT
the deprecated OpenOva product noun (retired in favor of bp-axon).
All 4 are explicitly Application Blueprints, NOT Catalyst control
plane.
VALIDATION-LOG: Pass 13 entry added.
Refs #37
7 more component READMEs got role-in-Catalyst banners:
- vpa, keda, reloader → per-host-cluster scaling/ops layer (§3.4).
Reloader specifically calls out its role in Catalyst's secret-
rotation flow (rolling deploy on K8s Secret hash change).
- external-dns → per-host-cluster DNS-sync (§3.1); pairs with k8gb
for the GSLB zone separation.
- coraza → DMZ-block WAF on every host cluster (§3.1).
- crossplane → per-Sovereign on the management cluster (§3.2);
banner explicitly emphasizes the agreed "never a user-facing
surface" rule (Users don't write Compositions in Application
configs; Blueprint authors and advanced contributors do). Cross-
references the no-fourth-surface clause in ARCHITECTURE §4/§7
and the Crossplane Composition section in BLUEPRINT-AUTHORING §8.
- opentofu → repositioned as Phase-0-only, runs on `catalyst-
provisioner` only, NOT installed on host clusters at runtime.
opentofu drift fixes (uncovered by line-by-line read):
- Section 5 line 182: "Bootstrap Wizard prompts for cloud credentials"
→ "Catalyst Bootstrap (Phase 0) prompts for cloud credentials"
(banned term).
- Same section line 186: "ESO PushSecrets sync to both regional
OpenBao instances" — the active-active drift Pass 7 corrected
elsewhere, still here. Replaced with "writes go to the primary
OpenBao region only; replicas pick up via async perf replication".
VALIDATION-LOG: Pass 10 entry added.
Refs #37
Pass 9 — six more component READMEs got Catalyst-role banners
matching the rule of thumb in CLAUDE.md (every platform/<x>/README.md
should state its role in Catalyst).
- grafana: observability stack on every host cluster; Catalyst's
own self-monitoring + Application telemetry flows here.
- harbor: per-host-cluster container registry for Catalyst images,
mirrored Blueprint OCI artifacts, customer images.
- falco: runtime security on every host cluster; feeds SIEM/SOAR.
- kyverno: policy engine on every host cluster; enforces Catalyst
policy contracts (cosign on Blueprints, default-deny NetworkPolicies
on Organization namespaces, priority-class injection).
- sigstore: cosign-signed Blueprint OCI artifacts + admission
verification chain on every host cluster.
- syft-grype: SBOM generation in CI per Blueprint + runtime CVE scans.
Plus Kyverno priority-class clarification: prose around `tenant-high`
/ `tenant-default` / `tenant-batch` priority class names now reads
"Organization workloads" instead of "tenant workloads", with an
explicit note that the priority class artifact names themselves stay
as-is until a separate migration ticket renames them in deployed
clusters (renaming PriorityClass objects requires recreate, not
in-place rename).
VALIDATION-LOG: Pass 9 entry added.
Refs #37
Pass 8 — line-by-line read of platform/cnpg, platform/strimzi,
platform/k8gb, platform/keycloak, platform/cert-manager, platform/cilium.
CNPG and Strimzi: read in full and confirmed clean — they correctly
position themselves as Application Blueprints and don't drift from
the canonical model. CNPG's `<org>-postgres-dr` cluster name
(Application-tier database role) is acceptable per NAMING-CONVENTION
§1.3 (which only forbids primary/dr in K8s host-cluster names, not
in Application-internal CRD names).
Four READMEs updated:
k8gb:
- Header reframed: per-host-cluster infrastructure pointer to
PLATFORM-TECH-STACK §3.1 and SRE §2.4 split-brain protection.
- Removed dead link to ../failover-controller/docs/ADR-FAILOVER-
CONTROLLER.md (the failover-controller folder has no docs/);
replaced with link to that component's README + SRE §2.4.
keycloak:
- Header reframed from "FAPI Authorization Server for Open Banking"
(narrow) to "User identity for Catalyst Sovereigns" (broad).
Keycloak handles ALL user identity in Catalyst, not just FAPI.
- Added per-Org / per-Sovereign topology callout matching SECURITY
§6. Clarified that "Multi-tenant TPP" refers to PSD2 Third Party
Providers, not Catalyst's Organization-level multi-tenancy.
- FAPI features kept since Keycloak still serves Fingate as the
FAPI Authorization Server.
cert-manager:
- Header reframed as per-host-cluster infrastructure with pointer
to PLATFORM-TECH-STACK §3.3.
cilium:
- Header reframed as per-host-cluster infrastructure with pointer
to PLATFORM-TECH-STACK §3.1, including the install-first note
(CNI must come before any other workload during Phase 0).
VALIDATION-LOG: Pass 8 entry added.
Refs #37
Continuing Pass 7 cleanup after the OpenBao/ESO rewrite (42aeb62).
Gitea README:
- Was describing "Bidirectional mirroring for multi-region" with two
Gitea instances mirroring repos cross-region. Wrong: Catalyst's
agreed model has one Gitea per Sovereign on the management cluster
(PLATFORM-TECH-STACK §2.3). Replaced the multi-region mirror
diagram with a single-Gitea + intra-cluster HA topology and added
a "Why not cross-region bidirectional mirror" explainer (write-
conflict semantics would break EnvironmentPolicy enforcement).
- Status banner: notes the canonical references.
- Backup section: removed "Repository mirror for redundancy"
(replaced with Velero scheduled backups).
Flux README:
- "Multi-Region GitOps" section was showing one Gitea per region
with bidirectional mirror. Replaced with one Gitea per Sovereign
topology. Per-vcluster Flux pulls from this single Gitea.
Mermaid syntax bug:
- An earlier mass replace_all of "Catalyst IDP" → "Catalyst console"
left an invalid mermaid node identifier
`Catalyst console[Catalyst console]` (mermaid forbids spaces in
node IDs). Fixed to `Console[Catalyst console]`. Left as-is, it would
have rendered as a broken diagram on GitHub.
VALIDATION-LOG: Pass 7 entry added documenting the OpenBao/ESO
active-active rewrite (the most consequential drift fix in any pass).
Refs #37
Pass 7 — line-by-line read of platform/openbao/README.md and
platform/external-secrets/README.md found a major architectural drift:
both files described an OLD active-active bidirectional sync model
that contradicts docs/SECURITY.md §5 (the canonical reference).
The active-active design was rejected during the architecture session
because it would have been a stretched cluster — a single region's
network blip would block writes everywhere. The agreed model is:
- Independent Raft cluster per region (intra-region quorum only).
- Single-primary writes; replicas accept reads only.
- Async Performance Replication primary → replicas (lag <1s typical).
- Explicit DR promotion (sovereign-admin or failover-controller).
Fixes:
platform/openbao/README.md:
- Overview: removed "active-active deployments" / "either region can
update secrets". Replaced with "independent Raft cluster per region",
"asynchronous Performance Replication".
- Architecture diagram: replaced bidirectional-push diagram with the
primary→replicas async perf replication topology that matches
SECURITY.md §5.
- ClusterSecretStores: simplified from "two stores (local+remote)" to
"one local store"; reads always pull locally.
- Renamed "PushSecret (Bidirectional)" → "Writes go to the primary
region" with a single-target PushSecret pointing at bao-primary.
- Added DR promotion section pointing at SECURITY.md §5.2.
- Status banner: notes that the canonical multi-region reference is
SECURITY.md.
platform/external-secrets/README.md:
- Header line: repositioned as per-host-cluster infrastructure with
pointer to PLATFORM-TECH-STACK §3.3.
- Removed broken link to non-existent ../openbao/docs/ADR-OPENBAO.md
(replaced with link to ../openbao/README.md).
- "Multi-region sync | Push to both OpenBao instances simultaneously"
→ "Multi-region reads | Async perf replication".
- "PushSecret to Multiple OpenBao Instances" example was writing to
two ClusterSecretStores in parallel — replaced with single-target
primary write.
- "Multi-region sync via single PushSecret" in Consequences →
"Cross-region availability via Performance Replication".
- Mermaid sequence diagram: "Bootstrap Wizard" actor → "Catalyst
Bootstrap (Phase 0)"; "Terraform" → "OpenTofu"; ESO connection
description "via K8s auth" → "via SPIFFE SVID (workload identity)".
These were the most consequential drift fixes found in any pass —
two READMEs were documenting an architecture explicitly rejected by
the agreed model.
Refs #37
Pass 6 — fresh-eyes line-by-line read of ARCHITECTURE.md. Found two
internal contradictions that earlier passes missed.
ARCHITECTURE §3 (topology diagram) listed Crossplane, Flux, Harbor,
and grafana-stack INSIDE the Catalyst control plane block. But §11
(Catalyst-on-Catalyst) explicitly says these are per-host-cluster
infrastructure, NOT Catalyst control-plane components. PLATFORM-TECH-
STACK §3 also classifies them as per-host-cluster.
Fixed: §3 topology diagram now shows only true Catalyst control-plane
components (console, marketplace, admin, catalog-svc, projector,
provisioning, environment-controller, blueprint-controller, billing,
gitea, nats-jetstream, openbao, keycloak, spire-server, observability)
and adds a separate line for "Plus per-host-cluster infrastructure"
that defers to PLATFORM-TECH-STACK §3 for the full list (Cilium, Flux,
Crossplane, cert-manager, ESO, Kyverno, Harbor, Reloader, Trivy, Falco,
Sigstore, Syft+Grype, VPA, KEDA, External-DNS, k8gb, Coraza, MinIO,
Velero, failover-controller). Also added the previously-missing
`provisioning` row.
JetStream Account scoping was contradictory:
- ARCHITECTURE §5 said "Per-Org account: ws.{org}-{env_type}.>" —
reads ambiguously: is the Account per-Org or per-Env?
- NAMING-CONVENTION §11.2 said "One JetStream Account scoped to
ws.{org}-{env_type}.>" — implied per-Environment.
- GLOSSARY + PLATFORM-TECH-STACK + SECURITY all say per-Organization.
Reconciled to the per-Org-Account-with-per-Env-subjects model:
- Account isolation: ONE NATS Account per Organization.
- Subjects within the Account use prefix `ws.{org}-{env_type}.>` for
per-Environment partitioning.
This is the cleanest isolation model: Accounts are NATS' strongest
isolation boundary (per-Org); subjects partition further within each
Account (per-Env).
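The reconciled model can be sketched as a pair of helpers. This is an
illustration only: the function names are hypothetical and using the Org
slug as the Account name is an assumption; the subject prefix is the one
quoted above. The matcher relies on the NATS rule that a trailing `>`
matches one or more further tokens.

```python
def jetstream_scope(org: str, env_type: str) -> dict[str, str]:
    """One Account per Organization; one subject prefix per Environment."""
    return {
        "account": org,  # assumption: Account is named after the Org slug
        "subject_prefix": f"ws.{org}-{env_type}.>",
    }


def subject_in_environment(subject: str, org: str, env_type: str) -> bool:
    """True if `subject` falls under ws.{org}-{env_type}.> for this Environment."""
    tokens = subject.split(".")
    return (
        len(tokens) > 2  # '>' requires at least one token after the prefix
        and tokens[:2] == ["ws", f"{org}-{env_type}"]
        and all(tokens)  # no empty tokens
    )
```

Two Environments of the same Org share an Account but never a subject
prefix, which is the per-Org isolation with per-Env partitioning described
above.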
Refs #37
Concluding the validation loop with a process artifact. The new file
records:
- Why the validation existed (post-rewrite trust verification).
- Each pass's scope and concrete fixes (16 iterations across Pass 1
+ sweeps in Passes 2/3/4/5).
- The acceptance criteria as runnable grep commands so any future
contributor can re-verify.
- Authorship convention (hatiyildiz, per-commit identity flags).
- Re-validation cadence (after rewrites, after new banned terms,
after component renames, quarterly drift check).
Linked from README.md docs table.
This file is meant as a playbook for the next validation, not a
status snapshot — for status, IMPLEMENTATION-STATUS.md remains
canonical.
Refs #37
Two user-facing residuals where the banned product term "instance"
slipped through:
- docs/ARCHITECTURE.md §9: example console dialog "Use existing
instance or create a dedicated one?" → "Use an existing Postgres
Application or create a new dedicated one?". This is a UI prompt
text — must use the user-facing noun "Application", not "instance".
- docs/NAMING-CONVENTION.md §6.2 tag comment: "Application instance
name" → "Application name within the Environment". The CRD might
internally still use the noun Instance for class-vs-instance
semantics, but in tag annotations and user-visible context the
Application IS the instance.
Other "instance" occurrences confirmed legitimate (Postgres instance
as Crossplane resource type, Flux instance as software deployment,
EC2/Hetzner instance as cloud-provider terminology) and retained.
Final cross-reference check: all Markdown links across all canonical
docs resolve. No residual banned terms.
Refs #37
ARCHITECTURE §10 listed 3 provisioning phases (Phase 0 / 1 / 2) and
labeled Phase 2 as "Self-sufficient". SOVEREIGN-PROVISIONING.md uses
4 phases (Phase 0 Bootstrap / Phase 1 Hand-off / Phase 2 Day-1 setup
/ Phase 3 Steady-state). The same phase number meant different things
in the two docs.
Aligned ARCHITECTURE to the 4-phase numbering. SOVEREIGN-PROVISIONING
is now explicitly the canonical reference for phase semantics.
Refs #37
PERSONAS-AND-JOURNEYS and SECURITY were using two competing slugs
for the same example Organization:
- "muscat-pharmacy" (with hyphen) — used as Org name + Environment
name in the Ahmed journey narrative.
- "muscatpharmacy" (no hyphen) — used as the vcluster name in the
same paragraph, and used everywhere else (NAMING-CONVENTION
examples, ARCHITECTURE topology diagram, SECURITY SPIFFE ID).
NAMING §2.5 allows both spellings (Org slug regex permits hyphens).
But within a single example the spelling must be stable, otherwise
readers see a contradiction between Org and vcluster names.
Normalized to single-token "muscatpharmacy" throughout (matches the
predominant usage and produces simpler URLs / paths).
Result: all docs now show the same example Org consistently —
muscatpharmacy as Org, muscatpharmacy as vcluster, muscatpharmacy-prod
as Environment, gitea.omantel.openova.io/muscatpharmacy/muscatpharmacy-prod
as Environment Gitea repo.
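Deriving every name from one Org slug is one way to keep an example stable.
The sketch below reproduces the four names listed above. The class and
function names are hypothetical, and the Gitea host is passed in as an
opaque string rather than assembled from its parts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OrgNames:
    org: str
    vcluster: str
    environment: str
    gitea_repo: str


def derive_names(org: str, env_type: str, gitea_host: str) -> OrgNames:
    """Derive vcluster, Environment, and Environment Gitea repo from one Org slug."""
    environment = f"{org}-{env_type}"
    return OrgNames(
        org=org,
        vcluster=org,  # vcluster name matches the Org slug in the docs' example
        environment=environment,
        gitea_repo=f"{gitea_host}/{org}/{environment}",
    )
```

Because the slug is entered once, a hyphenated and an unhyphenated spelling
cannot appear in the same example.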
Refs #37
After the PLATFORM-TECH-STACK reorganization (§2 = Catalyst control
plane, §3 = per-host-cluster infrastructure), IMPLEMENTATION-STATUS
§2 was still mixing the two — listing cilium, k8gb, kyverno, falco,
etc. under "Catalyst control plane components" alongside console,
projector, etc.
Split into:
- §2 (renumbered subsections 2.1, 2.2): Catalyst control plane only —
the per-Sovereign components that make a cluster a Sovereign.
- §2bis: Per-host-cluster infrastructure — the substrate every host
cluster needs (Cilium, Flux, Crossplane, cert-manager, ESO, Kyverno,
Trivy, Falco, Sigstore, Syft+Grype, VPA, KEDA, Reloader, MinIO,
Velero, Harbor, failover-controller).
Status flags retained per component (📐 design / 🚧 README only / ✅
implemented / ⏸ deferred). All per-host-cluster components currently
🚧 (READMEs exist; none yet packaged as deployable Blueprints).
This brings IMPLEMENTATION-STATUS into 1:1 correspondence with the
PLATFORM-TECH-STACK §2 / §3 / §4 categorization that other docs
reference.
Refs #37
Pass 2 — fresh-eyes sweep across the entire docs tree. One residual
entity-noun usage found:
- platform/external-secrets/README.md:75 (in a Mermaid sequence
diagram): "Note over Wizard: Operator saves unseal keys offline"
— "Operator" used as person/entity. Renamed to "sovereign-admin"
to match the role from GLOSSARY.md.
All other banned-term sweeps clean:
- No tenant (architectural) anywhere.
- No Catalyst IDP anywhere.
- No Synapse-as-product anywhere (only the legitimate
"Matrix/Synapse server" usages).
- No workspace-controller (only the banned-term entries that define
the rename).
- No capital-W Workspace as Catalyst scope.
- No github.com/openova (without -io).
- All cross-doc Markdown links resolve.
- All §X references resolve to the new section numbering after
PLATFORM-TECH-STACK reorg.
- API group catalyst.openova.io/v1alpha1 consistent across 6 references.
- OCI artifact prefix `bp-` consistent across README, CLAUDE,
BLUEPRINT-AUTHORING, IMPLEMENTATION-STATUS.
Other "Operator" mentions intentionally retained (legitimate
technical usage):
- "External Secrets Operator (ESO)", "Trivy Operator" — K8s
Operator pattern (controllers), explicitly allowed by GLOSSARY.
- "Operator compatibility" in BUSINESS-STRATEGY's OpenShift migration
table — refers to compatibility with K8s Operators (the technology),
not as an entity/role.
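A few of these sweeps are mechanical enough to script. The sketch below is
a hypothetical scanner, not the grep commands actually used. It covers only
terms that need no context to judge, so `tenant` and `Operator` (which have
the legitimate uses listed above) are left to a human read.

```python
import re

# Hypothetical subset of the banned-term list; labels are for reporting only.
BANNED = {
    "Catalyst IDP": re.compile(r"Catalyst IDP"),
    "workspace-controller": re.compile(r"workspace-controller"),
    "Bootstrap Wizard": re.compile(r"Bootstrap Wizard"),
    # github.com/openova is banned unless followed by "-io".
    "github.com/openova without -io": re.compile(r"github\.com/openova(?!-io)"),
}


def scan(text: str) -> list[tuple[int, str]]:
    """Return (line number, label) for each banned term found, in line order."""
    findings = []
    for number, line in enumerate(text.splitlines(), start=1):
        for label, pattern in BANNED.items():
            if pattern.search(line):
                findings.append((number, label))
    return findings
```

Hits in the files that define the banned terms (for example the
workspace-controller rename entry) are expected and would need an
allow-list.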
Refs #37
README + CLAUDE.md (iter 9):
- README's "Build a Blueprint" section was contradicting itself: said
"A Blueprint is a Git repo" while elsewhere we'd locked in the
monorepo decision. Rewritten: Blueprint = a folder under
platform/<name>/ or products/<name>/ in this monorepo. CI publishes
per-folder OCI artifacts.
- CLAUDE.md "Repo structure": replaced the brief tree with a more
honest one that distinguishes target structure from current
placeholders (core/apps/ is target console+projector+...; current
has only legacy bootstrap/ and manager/ .gitkeep dirs). Annotated
each products/<name>/ folder with current state (axon = real code;
others = README only; catalyst = bootstrap/ui scaffold).
- CLAUDE.md banned-terms entry "Workspace": now covers component
names too (was only Catalyst scope), matching GLOSSARY's expanded
banned-term entry.
PLATFORM-TECH-STACK (iter 10) — substantive reorganization:
The §1 categorization established three buckets:
(a) Catalyst control plane (per-Sovereign, on the management cluster)
(b) Per-host-cluster infrastructure (every host cluster)
(c) Application Blueprints (a la carte)
But §2 "Catalyst control plane components" was mixing buckets (a)
and (b): it listed flux, crossplane, cert-manager, kyverno, harbor,
external-secrets, reloader, vpa, keda, k8gb, coraza, falco, trivy,
sigstore, syft-grype, minio, velero, failover-controller all under
"Catalyst control plane" — but those are per-host-cluster
infrastructure per §1, and §1 itself said Crossplane "Never
user-facing" / per-host-cluster.
Reorganized §2 + §3:
- §2 now contains ONLY the Catalyst control plane:
2.1 User-facing surfaces (console, marketplace, admin)
2.2 Catalyst backend services (projector, catalog-svc, provisioning,
environment-controller, blueprint-controller, billing)
2.3 Per-Sovereign supporting services (keycloak, openbao, spire-
server, nats-jetstream, gitea, observability)
- New §3 Per-host-cluster infrastructure with subsections for
networking, GitOps+IaC, security+policy, scaling+ops, storage+
registry, resilience.
- Application Blueprints renumbered §3 → §4. Added missing
opensearch row to §4.1 (was previously misplaced in observability).
- Composite Blueprints (Products) §4 → §5.
- Multi-Region §5 → §6. Resource estimates §6 → §7. Cluster
deployment §7 → §8. User choice §8 → §9. SIEM §9 → §10. License §10 → §11.
Cross-doc references to PLATFORM-TECH-STACK §1 / §2 (in NAMING,
ARCHITECTURE, IMPLEMENTATION-STATUS) all still resolve correctly
under the new numbering.
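
A minimal sketch of the renumbering above, in case a later pass needs to
re-check cross-doc references mechanically. The helper names and the
reference pattern are hypothetical; only the section shifts come from
this commit. Because the old §2 was split into the new §2 and §3, an old
§2 reference to a per-host-cluster component still needs a manual look.

```python
import re

# Old -> new top-level section numbers in PLATFORM-TECH-STACK (iter 10).
# Sections 1 and 2 keep their numbers, so they are absent from the map.
RENUMBERED = {3: 4, 4: 5, 5: 6, 6: 7, 7: 8, 8: 9, 9: 10, 10: 11}

# Assumed reference style: "PLATFORM-TECH-STACK §N" or "PLATFORM-TECH-STACK.md §N".
REF = re.compile(r"(PLATFORM-TECH-STACK(?:\.md)? §)(\d+)")


def new_section(old: int) -> int:
    """Return the post-reorg number for an old top-level section."""
    return RENUMBERED.get(old, old)


def rewrite_refs(text: str) -> str:
    """Shift every matched reference once, in a single pass over the text."""
    return REF.sub(lambda m: f"{m.group(1)}{new_section(int(m.group(2)))}", text)
```

A single `re.sub` pass matters here: applying the shifts one by one
(§3 → §4, then §4 → §5, ...) would move the same reference twice.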
SRE (iter 11):
- §2.4 split-brain table: "MongoDB" → "FerretDB" (MongoDB was
retired in favor of FerretDB-on-CNPG per project-memory).
- §2.5 data replication: clarified each row's layer (Application
Blueprint vs per-host-cluster vs Catalyst control plane) instead
of misclassifying MinIO/Harbor as Application Blueprints. Added
OpenSearch row.
- §3.1 Flagger and §3.2 Flipt: explicitly marked "Status: design,
not yet a deployed Blueprint" since they're "components to watch"
in TECHNOLOGY-FORECAST, not in the current PLATFORM-TECH-STACK §3
inventory.
BUSINESS-STRATEGY + TECHNOLOGY-FORECAST (iter 12):
- Final scan: clean. No tenant/operator-team/Catalyst-IDP/Lifecycle
Manager/Synapse(product) violations remaining.
Refs #37
SECURITY (iter 6):
- "Environment repo" → "Environment Gitea repo" in §3 secrets diagram.
- "ChangePolicy enforces approvals" → "EnvironmentPolicy enforces
approvals" in §9 SOC2 row (ChangePolicy was a fictional CRD —
EnvironmentPolicy is the real one defined in ARCHITECTURE §8).
- "Catalyst's compliance-controller surfaces evidence" → "evidence
surfaced via Catalyst console audit views and SIEM exports"
(compliance-controller wasn't defined elsewhere; this avoids
inventing new components in compliance prose).
SOVEREIGN-PROVISIONING (iter 7):
- "vault-stored" → "stored in OpenBao on the provisioner"
(Vault was replaced by OpenBao; "vault-stored" was generic English
but read as a contradiction).
BLUEPRINT-AUTHORING (iter 8):
- OCI artifact naming locked: `ghcr.io/openova-io/bp-<name>:<semver>`
where `<name>` is the folder name. The `bp-` prefix lives in the
OCI artifact name (self-identifying), not the folder name.
Fixed in §1, §10, §11, §13 — and propagated to README.md so the
pattern is consistent across the repo.
- Crossplane Composition example: `compositeTypeRef.apiVersion`
changed from `bp-wordpress.openova.io/v1alpha1` (per-Blueprint
group, ugly) to `compose.openova.io/v1alpha1` (shared XRD group
across all Blueprints).
- §11 CI pipeline final step: "publish blueprint.yaml as the
manifest" → "as the OCI manifest's metadata layer" (clearer about
what it does in the OCI sense).
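
For illustration, the locked naming rule could be expressed as below.
This is a sketch, not the CI implementation: `artifact_ref` and the
simplified semver check are assumptions; only the registry path and the
`bp-` placement come from this commit.

```python
import re
from pathlib import PurePosixPath

REGISTRY = "ghcr.io/openova-io"

# Simplified semver check (assumption): MAJOR.MINOR.PATCH with an optional
# pre-release suffix. Build metadata ("+...") is left out on purpose.
SEMVER = re.compile(r"^\d+\.\d+\.\d+(?:-[0-9A-Za-z.-]+)?$")


def artifact_ref(folder: str, version: str) -> str:
    """Map a Blueprint folder (platform/<name> or products/<name>) to its OCI ref."""
    name = PurePosixPath(folder).name
    if name.startswith("bp-"):
        # The prefix lives in the artifact name, not the folder name.
        raise ValueError(f"folder name must not carry the bp- prefix: {name!r}")
    if not SEMVER.match(version):
        raise ValueError(f"not a semver tag: {version!r}")
    return f"{REGISTRY}/bp-{name}:{version}"
```

For example, `artifact_ref("platform/neo4j", "1.2.3")` yields
`ghcr.io/openova-io/bp-neo4j:1.2.3`.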
Refs #37
GLOSSARY.md line-by-line audit. Nine corrections.
1. workspace-controller → environment-controller everywhere. The
controller reconciles the Environment CRD; "workspace" is banned as
a Catalyst scope, so it cannot be in a component name either. Fixed
in: GLOSSARY, ARCHITECTURE, PLATFORM-TECH-STACK, NAMING-CONVENTION,
SOVEREIGN-PROVISIONING, IMPLEMENTATION-STATUS, core/README,
BUSINESS-STRATEGY. Banned-term entry in GLOSSARY now explicitly
covers component names too.
2. "workspace repos" (per-Environment Gitea repos) → "Environment
Gitea repos" in GLOSSARY, PLATFORM-TECH-STACK.
3. JWT claim {workspace, org, role} → {environment, org, role} in
ARCHITECTURE projector diagram.
4. OpenOva definition refined: was "Never used to name a product",
which contradicted "OpenOva Catalyst", "OpenOva Cortex". Now: brand
prefix in product names; bare "OpenOva" = the company; bare
"Catalyst" = the platform.
5. Catalyst definition completed: was missing provisioning, billing,
gitea, observability — now lists all 14 control-plane components,
pointing at the table below.
6. Catalyst components table: added `provisioning` (validates
configSchema, commits to Environment Gitea); reordered to match
ARCHITECTURE §3 grouping; clarified each component's source-of-truth
(catalog-svc reads monorepo + Gitea, blueprint-controller watches
monorepo + Gitea, etc.).
7. Environment definition: refers to NAMING §2.4 for env_type values;
removed inline list that didn't match canonical ordering. Added
concrete examples (acme-prod, acme-dev, bankdhofar-uat).
8. Application example: dropped "RocketChat" which appeared nowhere
else; replaced with generic "running deployment" plus the
established WordPress / Postgres examples.
9. sovereign-admin description: was "runs Crossplane" — Crossplane is
platform plumbing not user-facing. Now: "manages the underlying
clusters via Crossplane (which is platform plumbing, not a
user-facing surface)".
Banned-term coverage:
- "Workspace" entry now covers BOTH the Catalyst scope AND component
naming (workspace-controller → environment-controller).
Refs #37
First validation iteration. Three concrete corrections.
1. Add docs/IMPLEMENTATION-STATUS.md as the bridge between target
architecture and current code state. Status legend (✅ / 🚧 / 📐 / ⏸)
applied per-component. Catalyst control plane = mostly 📐. Component
READMEs = 🚧 (README only, no Blueprint manifests yet). products/axon
= ✅ (only product with real code). core/ = 📐 (just .gitkeep).
2. Status banner added to ARCHITECTURE, SECURITY, SOVEREIGN-PROVISIONING,
BLUEPRINT-AUTHORING, PERSONAS-AND-JOURNEYS, PLATFORM-TECH-STACK, SRE
pointing readers at IMPLEMENTATION-STATUS.md before they treat any
described feature as built. GLOSSARY also references it.
3. Architectural decision (Option A — monorepo canonical):
- Each platform/<name>/ and products/<name>/ folder is the source of
ONE Blueprint, published as ghcr.io/openova-io/<name>:<semver> by
CI fan-out from the monorepo root.
- BLUEPRINT-AUTHORING.md §1, §2, §13 rewritten to match.
- README.md "what's in this repo" rewritten to clarify monorepo +
OCI-fan-out shape; no longer claims every directory is a Blueprint
in a way that contradicts BLUEPRINT-AUTHORING.
Wrong-org fixes (3 github.com links, 3 ghcr.io refs):
- docs/PERSONAS-AND-JOURNEYS.md:13 github.com/openova → openova-io
- docs/BLUEPRINT-AUTHORING.md:13 github.com/openova → openova-io
- docs/BLUEPRINT-AUTHORING.md:404 github.com/openova → openova-io
- docs/BLUEPRINT-AUTHORING.md ghcr.io/openova/* (3 refs) → openova-io
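
A sketch of how the same drift could be caught in a later pass. The
function name is hypothetical; the pattern only encodes the two hosts and
the org rename listed above.

```python
import re

# Match github.com/openova or ghcr.io/openova only when the org name ends
# there, so the correct "openova-io" (and unrelated longer names) is skipped.
WRONG_ORG = re.compile(r"\b(github\.com|ghcr\.io)/openova(?![\w-])")


def fix_org(text: str) -> str:
    """Rewrite wrong-org references to the openova-io org."""
    return WRONG_ORG.sub(r"\1/openova-io", text)
```

The negative lookahead is what keeps the rewrite idempotent: running it
twice never produces `openova-io-io`.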
API group consistency:
- All references unified to catalyst.openova.io/v1alpha1
(was mixed v1 / v1alpha1; v1alpha1 is correct since the CRDs are
design-stage with no implementation).
core/README.md updated to honestly describe the directory tree as
"target structure with .gitkeep placeholders" rather than implying
the apps/console, apps/projector, etc. binaries already exist.
The legacy apps/bootstrap and apps/manager directories are
acknowledged as transitional placeholders that will be removed when
the new apps/ layout is scaffolded.
CLAUDE.md and .claude/project-memory.md updated to put
IMPLEMENTATION-STATUS.md second in the read-first ordering.
Refs #37
Targeted updates to BUSINESS-STRATEGY.md §5.1 and §9.2 plus
TECHNOLOGY-FORECAST §removed-components.
- BUSINESS-STRATEGY.md §5.1: OpenOva Catalyst row repositioned. It is
the platform itself (the self-sufficient Kubernetes-native control
plane that turns any cluster into a Sovereign), not a sub-product
bundling bootstrap+IDP+lifecycle manager. Other OpenOva products
(Cortex, Fingate, Fabric, Relay, Specter, Axon) run ON Catalyst as
composite Blueprints.
- BUSINESS-STRATEGY.md §9.2: capability matrix "Developer portal" cell
updated from "Catalyst IDP" to "Catalyst console" — IDP function is
one of the console's responsibilities, not a separate product.
- TECHNOLOGY-FORECAST.md §removed-components: Backstage row updated to
describe replacement as "Catalyst console (the platform's own
developer-facing UI)" rather than the now-retired "Catalyst IDP"
sub-product.
Strategy narrative, market segmentation, pricing model, and migration
playbook are unchanged — they stand on their own.
Refs #37
Two related rewrites that put the control plane / application Blueprint
distinction front and center.
PLATFORM-TECH-STACK.md
- §1: explicit three-way component categorization — Catalyst control
plane (one per Sovereign), per-host-cluster infrastructure (every
cluster), Application Blueprints (inside per-Org vclusters).
- §2: Catalyst control plane components listed by responsibility —
user-facing surfaces, backend services, identity, secrets, event
spine, GitOps, networking, security, scaling, storage,
observability, resilience.
- §3: Application Blueprints (the a-la-carte catalog). Valkey and
Strimzi are explicitly called out as Application Blueprints,
NOT control-plane components (control plane uses NATS JetStream).
- §4: composite Blueprints (Cortex, Axon, Fingate, Fabric, Relay)
repositioned as Applications running ON Catalyst, not as parallel
products.
- §5: multi-region diagram showing independent OpenBao Raft per
region, NATS leaf nodes, Crossplane on mgt.
- §6: resource estimates updated for control plane (~12 GB +
per-Org Keycloak in SME tier).
- §10: license posture table — every control-plane component carries
a redistribution-safe license (no BSL).
SRE.md
- §2: multi-region principles updated; explicit "no stretched
clusters" applies to OpenBao, JetStream, etcd, every quorum-
based component.
- §2.5: data replication patterns now scoped to Application
Blueprints (the things a customer installs), separate from
control-plane patterns documented in SECURITY.md and
ARCHITECTURE.md.
- §4: alert-to-action mapping segmented by Catalyst control plane
vs per-product (Cortex, Fingate); new alerts: OpenBaoSealed,
JetstreamLagHigh.
- §7-§13: terminology aligned to Catalyst (console instead of IDP);
runbooks now Runbook CRD-backed; incident severities updated.
- §13.2-13.3: Catalyst-specific incidents (workspace-controller,
OpenBao seal, projector lag) plus AI Hub incidents under
bp-cortex installation.
Refs #37
Repositions the public repo's identity. OpenOva is the company; Catalyst
is the platform. Sovereign is a deployed Catalyst. The historical
positioning (OpenOva = platform, Catalyst = bootstrap+IDP+lifecycle
sub-product) is retired. Catalyst now subsumes bootstrap, lifecycle, and
IDP responsibilities into one control plane.
- README.md: Catalyst-first front door. Sovereign concept, repo
structure, stack at a glance, cloud provider matrix, getting-started
paths (managed via marketplace.openova.io vs self-host via
catalyst-provisioner).
- CLAUDE.md: Codebase guide for Claude. Banned-term table, commit
conventions (hatiyildiz default for public repo), the no-fourth-surface
rule, per-component README rule of thumb.
- .claude/project-memory.md: Reduced to an index + decision log; full
architecture moved to docs/. Stack decisions locked (NATS JetStream,
OpenBao, SPIFFE/SPIRE, per-Org Keycloak SME / per-Sovereign corporate,
Crossplane only IaC, no Terraform/Pulumi user-facing surface).
- core/README.md: Catalyst control-plane Go application. Drops the
bootstrap-vs-manager split (both fold under "Catalyst control plane").
Lists each component deployable from this codebase: console,
marketplace, admin, projector, catalog-svc, provisioning,
workspace-controller, blueprint-controller, billing. CRD list updated:
Sovereign / Organization / Environment / Application / Blueprint /
EnvironmentPolicy / SecretPolicy / Runbook.
Refs #37
The naming convention pre-dates vcluster and Catalyst's user-facing
Environment object. Three additions, one rename:
- §2.4: {env} dimension renamed to {env_type} to disambiguate from the
Catalyst Environment object (which is the user-facing scope, not a
dimension).
- §2.5: new Organization dimension (slug, lowercase, hyphenated). Used
for vcluster identity and any Organization-scoped resource.
- §4.7: new vcluster naming layer. Pattern is just {org} within the
parent host cluster (Don't Repeat the Parent — Principle 1.2). Globally-
qualified form is {prov}-{reg}-{bb}-{env_type}-{org} for cross-cluster
references and kubeconfig contexts.
- §11: Catalyst Environment defined as the user-facing {org}-{env_type}
scope. One Environment is realized by N vclusters across regions × bb
filtered by Application Placement. Each Environment has its own Gitea
repo and JetStream Account.
Tags updated: openova.io/environment → openova.io/env-type for
disambiguation; new openova.io/organization, openova.io/vcluster,
openova.io/environment (for Catalyst scope), openova.io/sovereign tags.
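
The naming layers and tags above can be sketched as follows. The class,
the sample dimension values ("hz", "fsn", "rtz"), the sovereign name and
the value chosen for the openova.io/vcluster tag are assumptions for
illustration; the patterns and tag keys are the ones from this commit.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Placement:
    """One vcluster placement, described by the NAMING dimensions."""

    prov: str      # provider code (sample values are hypothetical)
    reg: str       # region code
    bb: str        # building-block code
    env_type: str  # renamed from {env} in §2.4
    org: str       # Organization slug, lowercase and hyphenated (§2.5)

    def vcluster_name(self) -> str:
        # §4.7: inside the parent host cluster the name is just {org}.
        return self.org

    def qualified_vcluster_name(self) -> str:
        # §4.7: globally-qualified form for cross-cluster references.
        return f"{self.prov}-{self.reg}-{self.bb}-{self.env_type}-{self.org}"

    def environment(self) -> str:
        # §11: the user-facing Catalyst Environment scope.
        return f"{self.org}-{self.env_type}"

    def tags(self, sovereign: str) -> dict[str, str]:
        return {
            "openova.io/env-type": self.env_type,
            "openova.io/organization": self.org,
            # Assumption: the tag carries the globally-qualified name.
            "openova.io/vcluster": self.qualified_vcluster_name(),
            "openova.io/environment": self.environment(),
            "openova.io/sovereign": sovereign,
        }
```

With `Placement("hz", "fsn", "rtz", "prod", "acme")`, the Environment is
`acme-prod` and the local vcluster name is `acme`.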
DNS pattern §5 split into two, supporting white-label Sovereigns where
the Application DNS uses the customer's own domain:
- control-plane: component.{location-code}.{sovereign-domain}
- Application: app.{environment}.{sovereign-or-org-domain}
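
A sketch of the two DNS forms, assuming hypothetical helper names. The
empty-segment check is there because the recurring drift is a dropped
{location-code}; the Application hostname in the usage note is made up.

```python
def _join(*parts: str) -> str:
    """Join DNS segments, refusing empty or dot-padded ones."""
    for part in parts:
        if not part or part.startswith(".") or part.endswith("."):
            raise ValueError(f"empty or malformed DNS segment: {part!r}")
    return ".".join(parts)


def control_plane_fqdn(component: str, location_code: str, sovereign_domain: str) -> str:
    # Control-plane form: component.{location-code}.{sovereign-domain}
    return _join(component, location_code, sovereign_domain)


def application_fqdn(app: str, environment: str, domain: str) -> str:
    # Application form: app.{environment}.{sovereign-or-org-domain}
    return _join(app, environment, domain)
```

`control_plane_fqdn("gitea", "hfmp", "openova.io")` gives
`gitea.hfmp.openova.io`; a white-label Application might resolve as
`application_fqdn("wordpress", "acme-prod", "example.com")`.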
Refs #37
Client sends `thinking: true` to enable reasoning tokens. Default remains
disabled for instant streaming.
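
Roughly, the opt-in maps onto the upstream request like this. It is a
sketch only: the function name, the top-level position of `thinking` in
the client request, and where the mapping runs are assumptions; the
`enable_thinking` key under `chat_template_kwargs` is the one used in the
earlier fix.

```python
from typing import Any


def build_upstream_payload(client_request: dict[str, Any]) -> dict[str, Any]:
    """Translate the client's `thinking` flag into chat_template_kwargs."""
    # Drop the client-only flag so it is not forwarded upstream.
    payload = {k: v for k, v in client_request.items() if k != "thinking"}

    # Reasoning stays off unless the client explicitly sends `thinking: true`.
    enable = client_request.get("thinking") is True

    # Preserve any other template kwargs the client already supplied.
    kwargs = dict(payload.get("chat_template_kwargs") or {})
    kwargs["enable_thinking"] = enable
    payload["chat_template_kwargs"] = kwargs
    return payload
```

Comparing with `is True` keeps truthy strings such as `"false"` from
switching reasoning on by accident.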
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Qwen3-coder generates hundreds of `reasoning` tokens before `content`
tokens, causing 10+ second perceived delay. The reasoning tokens stream
through Axon but the ChatWidget only renders `delta.content`, so users
see a long pause then a burst. Passing `enable_thinking: false` via
chat_template_kwargs skips the reasoning phase entirely.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
3-turn conversations passed at ~9120 chars but 4-turn failed at ~10640.
WAF anomaly threshold is between those values. Lowered all limits to keep
multi-turn conversations well under the threshold.
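
One way to keep a conversation under such a budget is to drop the oldest
turns first. This is a sketch, not the change made here: `trim_history`
and the 8000-character default are hypothetical (the actual lowered
limits are not restated in this message), and it counts only message
content, not the JSON envelope.

```python
def trim_history(messages: list[dict[str, str]], max_chars: int = 8000) -> list[dict[str, str]]:
    """Keep the most recent messages whose combined content fits max_chars."""
    kept: list[dict[str, str]] = []
    total = 0
    # Walk newest-first so the latest turns survive.
    for msg in reversed(messages):
        size = len(msg["content"])
        # The newest message is always kept, even if it alone is over budget.
        if kept and total + size > max_chars:
            break
        kept.append(msg)
        total += size
    kept.reverse()  # restore chronological order
    return kept
```

A single oversized message still passes through untouched, so a
per-message cap would be needed alongside this.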
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>