Commit Graph

35 Commits

Author SHA1 Message Date
hatiyildiz
b2173ae13c docs(pass-60): valkey REPLICAOF bash example carry-over; NAMING fourth-cycle stable
FIRST drift in the new cycle. 6-consecutive-clean streak (54-59) ends
at Pass 60. However, drift is Pass-35 carry-over, not new architectural
drift — same "incomplete in-file fix" pattern as Pass 31 (openbao
L108 vs L127).

platform/valkey/README.md L79 had:
  REPLICAOF primary-valkey.region1.svc.cluster.local 6379

Pass 35 fixed L147 (StatefulSet --replicaof argument) to canonical
valkey.<env>.<sovereign-domain> per NAMING §5.2 but the bash command
example at L79 retained the older non-canonical form.

Fixed L79 to valkey.<env>.<sovereign-domain> matching L147.

Methodology lesson #18: Pass-N sweep grep patterns can miss carry-over
drift that doesn't match the sweep's specific shape. Pass 35 grep
targeted <domain> placeholders; L79 used a fully-qualified hostname
with no placeholder, evading the sweep.
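
One way to picture lesson #18 is a sweep that runs more than one pattern shape per line. The sketch below is illustrative only: the function name `find_drift` and the second regex are assumptions, not the grep actually used in Pass 35 or Pass 60.

```python
import re

# Shape the earlier sweep targeted: bare <domain> placeholders.
PLACEHOLDER = re.compile(r"<domain>")
# Hypothetical second shape: hostnames with a hard-coded "regionN" segment,
# e.g. primary-valkey.region1.svc.cluster.local (no placeholder at all).
HARDCODED_REGION = re.compile(r"\b[\w-]+\.region\d+\.[\w.<>-]+")

PATTERNS = (("placeholder", PLACEHOLDER), ("hardcoded-region", HARDCODED_REGION))


def find_drift(text):
    """Return (line_number, pattern_name) for every pattern that hits a line."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for name, pattern in PATTERNS:
            if pattern.search(line):
                hits.append((lineno, name))
    return hits
```

A placeholder-only sweep would report nothing for the L79 line, while the second pattern flags it.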

NAMING-CONVENTION fourth-cycle deep re-read confirmed stable across
§1-§11. §4.1 "hfrp" location-code example is for rtz cluster (vs hfmp
for mgt) — both valid for different cluster types, not drift. §11
already settled across Pass 37, 42, 50.

valkey README banner explicitly establishes "NOT a Catalyst
control-plane component" (Pass 26 framing) — exemplary canonical.

Convergence: Pass 54-59 = 6 consecutive cleans (nirvana approach met).
Pass 60 carry-over fix resets streak but architectural integrity holds.
The new cycle audit is doing its job — surfacing carry-over drift the
old cycle's specific-shape sweeps missed.
2026-04-28 01:28:00 +02:00
hatiyildiz
9c3d370107 docs(pass-51): flink Strimzi namespace drift; SECURITY clean
platform/flink/README.md L137 + L166 used strimzi-kafka-bootstrap.messaging.svc
but canonical Catalyst namespace per strimzi README (L100/146/181/191) and
debezium (L135) is `databases`. Same Helm-default-vs-Catalyst-convention drift
as Pass 41 minio (minio-system → storage). Pass 51 sweep confirmed no other
component uses "messaging" as a Catalyst namespace — only generic English
usage and K8s API group messaging.knative.dev/v1.

Fixed both instances to strimzi-kafka-bootstrap.databases.svc:9093. Port
9093 (TLS) kept — port choice (9092 vs 9093) is a separate architectural
question deferred.

SECURITY.md re-scan with all current methodology lessons:
- §1-§5: clean. Independent-Raft-per-region principle intact.
- §6 Keycloak topology: clean.
- §7 Rotation policy: SecretPolicy uses canonical catalyst.openova.io/v1alpha1.
- §8 Path of a secret: clean.
- §9 Compliance posture: borderline OpenSearch SIEM wording re-evaluated;
  acceptable in context.
- §10 Threat model: clean.

Methodology note: Helm-default-namespace drift now found across 3 instances
(Pass 41 minio, Pass 51 flink). Add cross-component namespace verification
to standard checks.
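
A cross-component namespace check could look roughly like the sketch below. The two canonical namespaces come from the messages above (`databases`, `storage`); the lookup table, regex, and function name are assumptions made for illustration.

```python
import re

# Canonical namespaces as stated in the commit messages; the mapping
# structure itself is hypothetical.
CANONICAL_NAMESPACE = {
    "strimzi-kafka-bootstrap": "databases",
    "minio": "storage",
}

# Matches in-cluster references of the form <service>.<namespace>.svc
SVC_REF = re.compile(r"\b([a-z0-9-]+)\.([a-z0-9-]+)\.svc\b")


def namespace_drift(text):
    """Return (service, found_namespace, expected_namespace) for mismatches."""
    drift = []
    for service, namespace in SVC_REF.findall(text):
        expected = CANONICAL_NAMESPACE.get(service)
        if expected is not None and namespace != expected:
            drift.append((service, namespace, expected))
    return drift
```

Services missing from the table are ignored rather than flagged, so the check only reports disagreements with a declared canonical namespace.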

Drift found. Consecutive-clean count resets from 2 (49→50) to 0.
2026-04-28 00:31:25 +02:00
hatiyildiz
67aab8f6c1 docs(pass-48): crossplane OpenTofu/XRD group drift; PERSONAS clean
platform/crossplane/README.md had three real drift items:

1. §"Terraform vs Crossplane" — Catalyst's canonical bootstrap IaC is
   OpenTofu (PTS §3.2 + SOVEREIGN-PROVISIONING §3), not Terraform.
   Renamed section to "OpenTofu vs Crossplane", added intro paragraph
   clarifying the OSS-fork rationale, updated table rows + Decision.

2. XRD CompositeResourceDefinition example used name: xdatabases.openova.io
   and group: openova.io. Per BLUEPRINT-AUTHORING §8 (Pass 42 verified
   canonical), Crossplane XRDs use compose.openova.io group — separate
   from Catalyst CRDs (catalyst.openova.io). Fixed to
   xdatabases.compose.openova.io / group: compose.openova.io with inline
   pointer to BLUEPRINT-AUTHORING §8.

3. Composition compositeTypeRef.apiVersion was openova.io/v1alpha1, fixed
   to compose.openova.io/v1alpha1. Also corrected Composition metadata.name
   to database.hcloud.compose.openova.io for naming consistency.

Pass 1's API group unification was Catalyst-CRDs-only; Pass 42 verified
the separate Crossplane group; Pass 48 catches a downstream consequence
where the crossplane README defaulted to bare `openova.io` matching
neither canonical form.

PERSONAS-AND-JOURNEYS §1-§7 deep re-scan: clean. Pass 22, 33, 39 fixes
all intact. Three-pass-touched doc reads consistently. Stable.

Banner already correctly enforces "platform plumbing, never user-facing"
per ARCHITECTURE §7.4 / GLOSSARY.
2026-04-28 00:10:48 +02:00
hatiyildiz
2a1d6f5d3f docs(pass-41): SOVEREIGN-PROVISIONING §4 + minio namespace drift across 3 components
SOVEREIGN-PROVISIONING.md §4 (Phase 1 Hand-off) "self-sufficient" list
had 6 items vs PLATFORM-TECH-STACK §2.3's 6 control-plane supporting
services. List was missing SPIRE (5-min rotating SVIDs — critical to
SECURITY model) and observability (Grafana stack — Catalyst's
self-monitoring). Same drift category as Pass 40: summary list drifted
independently from canonical reference. Added both, plus enumerated the
§2.1+§2.2 services in the "Catalyst control plane" bullet.

Mid-pass sweep finding: kserve L217 used minio.minio-system.svc but
canonical minio README declares namespace: storage (L70). Two other
components also used minio-system: milvus L78, harbor L145. Fixed all
three to align with canonical `storage` namespace per PLATFORM-TECH-STACK
§3.5. Drift likely came from Helm-chart upstream defaults.

platform/kserve substantively clean apart from namespace fix.

Pass 41 lesson: union-equality check applies to ALL summary passages in
canonical docs. When a passage enumerates items derived from a canonical
source list, count both and verify equality.
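
The union-equality check can be sketched as a plain set comparison. The function name and return shape are assumptions, and item names other than SPIRE and observability would be placeholders.

```python
def union_equality(summary_items, canonical_items):
    """Compare a summary list against the canonical list it was derived from."""
    summary, canonical = set(summary_items), set(canonical_items)
    return {
        "equal": summary == canonical,
        # Items the canonical source has but the summary dropped.
        "missing_from_summary": sorted(canonical - summary),
        # Items the summary mentions that the canonical source does not.
        "extra_in_summary": sorted(summary - canonical),
    }
```

Counting alone is not enough when one item is dropped and another added, which is why the sketch reports both differences.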
2026-04-27 23:21:19 +02:00
hatiyildiz
5744307027 docs(pass-38): surviving "fuse" namespace in temporal; SECURITY + grafana clean
Acceptance greps with Pass 37's new literal-domain check and case-insensitive
banned-term sweep found one surviving instance: platform/temporal/README.md
L272 Worker Deployment had `namespace: fuse`. Pass 26 renamed fuse → fabric;
Pass 32+35 fixed temporal's image ref and DNS but the namespace YAML key
was missed (eye tracks surrounding structure, skims past `namespace:` value).
Renamed to `fabric`.

docs/SECURITY.md: clean (deep re-scan §6-§10 per Pass 23 lesson). All
sections consistent with canonical model and Pass 7's independent-Raft fix.
§9 OpenSearch SIEM wording acceptable as "default destination when SIEM
is enabled" rather than "default-installed component" — deferred for
optional tightening pass.

platform/grafana/README.md: clean. Banner, tiered storage, and OTel
instrumentation example all consistent with canonical conventions.

Lesson: case-insensitive banned-term grep is non-negotiable. Future
passes should always run \bfuse\b and similar legacy-product-name greps
regardless of surfaced category.
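
A minimal sketch of such a sweep, assuming the banned list is just `fuse` and `tenant` (the real list lives in GLOSSARY and may be longer). Hits are candidates for review, not automatic errors, since some uses are deliberately preserved.

```python
import re

# Case-insensitive, word-bounded; "tenants?" also covers the lowercase plural.
BANNED = re.compile(r"\b(fuse|tenants?)\b", re.IGNORECASE)


def banned_term_hits(text):
    """Return (line_number, matched_text) for every banned-term occurrence."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for match in BANNED.finditer(line):
            hits.append((lineno, match.group(0)))
    return hits
```

Because `_` counts as a word character, `${TENANT_ID}` is not matched by `\btenant\b`, while `${TENANT}` is.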
2026-04-27 22:59:17 +02:00
hatiyildiz
76e68e6182 docs(pass-36): flux deep-scrutiny + sweep gap-fill (Pass 35 head -10 cutoff)
Pass 35's sweep grep had a `head -10` cutoff that produced a false-clean
signal. Pass 36 ran the same grep without truncation and found 6 surviving
drift instances:

platform/flux/README.md (5 fixes):
- Mermaid diagram: Tenant[Tenant Repos] -> Organization[Organization Repos].
- GitRepository url gitea.<domain> -> gitea.<location-code>.<sovereign-domain>.
- Bootstrap command --url=https://gitea.<domain>/... -> canonical form.
- Key commands `flux reconcile kustomization tenants` -> `organizations`
  (Pass 34 was uppercase-only and missed lowercase plural).
- Gitea Actions example flux-webhook.<domain> -> location-code form.

platform/kyverno/README.md (1 fix):
- Mermaid subgraph "Tenant Workload" -> "Organization Workload"
  (the priority class names tenant-high/tenant-default remain — those
  are deployed K8s PriorityClass objects requiring recreate-not-rename
  per Pass 9's deferred-migration note).

Methodology lesson: convenience shortcuts in validation produce false-clean
signals. From Pass 37 forward: drift sweeps use full grep output (no
truncation) and case-insensitive banned-term searches.

Validation log Pass 36 entry includes detail on each preserved
"multi-tenant" generic adjective use that survived (acceptable feature
descriptions, not Catalyst entity references).
2026-04-27 22:49:05 +02:00
hatiyildiz
bc9b90d989 docs(pass-35): completion sweep for surviving DNS placeholders (8 components)
Started as gitea + relay atomic check. The gitea fix surfaced surviving
<domain> placeholders across 8 other component READMEs that prior sweeps
(Pass 29: canonical docs, Pass 32: image registries) hadn't covered.

Catalyst control-plane DNS fixes (-> {component}.<location-code>.<sovereign-domain>):
- gitea: GITEA_INSTANCE_URL.
- external-secrets: openbao ClusterSecretStore + gitea Flux GitRepository.

Application DNS fixes (-> {app}.<env>.<sovereign-domain>):
- temporal: had two drift items in one line — temporal.fuse.<domain>
  (old "fuse" product name + wrong placeholder shape). Pass 32 fixed
  the image ref on the same file but missed this. Now fully de-drifted.
- valkey: --replicaof valkey.region1.<domain> (non-canonical region1
  segment — Catalyst encodes regions in location-code).
- strimzi: kafka-kafka-bootstrap.region1.<domain>:9092 — same.
- cnpg: postgres.region1.<domain> cross-region replica host — same.
- stunner: STUN/TURN realm — kept canonical Application form for
  consistency even though STUN realms are nominally opaque.
- k8gb: Gslb ingress host app.gslb.<domain> -> app.gslb.<sovereign-domain>.
  Other illustrative k8gb refs (dnsZone, nslookup examples) preserved
  as they describe behavior generically.

products/relay/README.md: clean.

Preserved as correctly-generic: external-dns illustrative refs,
cert-manager <domain> (customer-supplied cert names), stalwart <domain>
(customer email-receiving domain).

Validation log Pass 35 entry: third end-to-end DNS sweep iteration
(29 -> 32 -> 35). Future passes should grep for bare <domain> early to
catch new instances introduced during edits.
2026-04-27 22:46:16 +02:00
hatiyildiz
70fea3ab8f docs(pass-34): banned-term TENANT sweep + keycloak hostname drift
GLOSSARY's banned term "tenant" survived in Configuration tables and Flux
postBuild substitutions across product READMEs as ${TENANT} (uppercase
ENV var). Prior banned-term greps searched lowercase `tenant` so the
ALL-CAPS form slipped through.

Product README fixes:
- products/cortex: TENANT/DOMAIN → ORGANIZATION/SOVEREIGN_DOMAIN, plus
  two DNS placeholder fixes for llm-gateway and chat URLs (same shape
  Pass 25/31 fixed elsewhere).
- products/fingate: 6 instances (Flux substitution, Configuration table,
  4 URL templates) renamed. URL shape api.openbanking.<org>.<sov-dom>
  flagged as 4-segment FQDN that doesn't match NAMING §5.1 or §5.2 —
  deferred to a deeper architectural pass.
- products/fabric: Configuration table row renamed.

Component README:
- platform/keycloak: shared-sovereign hostname auth.<sovereign-domain>
  and per-organization auth.<org>.<sovereign-domain> both missing
  <location-code> per NAMING §5.1. Fixed.

platform/librechat ${TENANT_ID} preserved — that's Microsoft Azure AD
tenant-ID (external technology, exempted by GLOSSARY).

Validation log Pass 34 entry includes meta-note: always run a global
grep for the surfaced drift category before closing a pass, to avoid
the asymmetric-drift problem Pass 25 warned against.
2026-04-27 22:42:50 +02:00
hatiyildiz
4043e1d51c docs(pass-32): registry-DNS sweep — harbor.<domain> across 9 component READMEs
Pass 25's deferred sweep, executed. Image refs of the form
harbor.<domain>/... (and one registry.<domain>/... in temporal) collapse
the location-code segment. Per NAMING §5.1, Catalyst per-host-cluster
Harbor DNS is harbor.{location-code}.{sovereign-domain} (e.g.
harbor.hfmp.openova.io).

Fixed (11 instances, 9 files):
- anthropic-adapter, bge (×2), debezium, harbor (×2 — ingress + Kyverno
  policy), knative (×2 — serving + traffic-split), llm-gateway, strimzi,
  trivy — all standardized to harbor.<location-code>.<sovereign-domain>.
- temporal had two drift items in one line: registry.<domain> (off-spec
  placeholder — Catalyst's only per-host-cluster registry is Harbor) AND
  legacy "fuse" namespace (renamed to bp-fabric per BUSINESS-STRATEGY
  §16.2 / Pass 26). Rewritten to fabric/order-worker.

Out of scope (deliberate): :latest tag hygiene, and whether Application
Blueprint READMEs should reference ghcr.io/openova-io/bp-<name>:<semver>
vs the Sovereign Harbor mirror. Stalwart customer-email-domain <domain>
placeholders preserved (correct semantics). external-dns illustrative
gslb/api/svc.<domain> preserved (upstream-doc generic).

With Pass 29 (canonical-doc DNS) + Pass 31 (carry-over fixes) + Pass 32
(image registry), the recurring DNS-placeholder collapse drift category
is addressed end-to-end.

Validation log Pass 32 entry added.
2026-04-27 22:36:39 +02:00
hatiyildiz
3993f5fc31 docs(pass-31): openbao + librechat DNS-placeholder carry-over fixes
platform/openbao/README.md ingress hosts (line 108) had `bao.<domain>` while
the same file's ClusterSecretStore example (line 127) used the canonical
`bao.<location-code>.<sovereign-domain>` form. Pass 7's active-active fix
addressed the body but missed the ingress placeholder. Aligned with the
canonical form.

platform/librechat/README.md OAuth callback (line 154) had
`chat.ai-hub.<domain>/oauth/openid/callback` — same Application-endpoint
shape Pass 25 fixed in llm-gateway. Pass 22 marked the file clean and Pass
29 fixed the Keycloak issuer line but didn't re-sweep. Per NAMING §5.2
Application endpoints are `{app}.{environment}.{sovereign-domain}`. Fixed.

docs/GLOSSARY.md verified clean — single-source-of-truth has held across
the loop (Pass 6/7/14/20/22/26/27 all consistent with current GLOSSARY).

Validation log Pass 31 entry includes meta-note: third file (librechat)
that needed re-opening after a "clean" mark — banner scans miss YAML-block
drift. Future passes should default to a full placeholder-shape grep on
every file touched.
2026-04-27 22:34:10 +02:00
hatiyildiz
4793cab8b6 docs(pass-29): DNS-placeholder sweep across canonical docs
The recurring drift: Catalyst control-plane DNS placeholders that omit the
<location-code> segment, producing forms like gitea.<sovereign>,
gitea.<sovereign>.<domain>, gitea.<sovereign-domain>, keycloak.<domain>.
Per NAMING §5.1 the canonical form is
{component}.{location-code}.{sovereign-domain} (e.g. gitea.hfmp.openova.io).
The shorter forms aren't just abbreviations — they collapse the multi-region
location dimension and re-drift every time a reader reads them as obvious
shorthand.
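
As a rough sketch, the control-plane form can be checked by comparing the placeholder tail of each hostname against the canonical tail. The regex and function name are assumptions; Application endpoints use a different canonical shape (`{app}.{environment}.{sovereign-domain}`) and would need their own tail.

```python
import re

# A component name followed by one or more <placeholder> segments.
PLACEHOLDER_HOST = re.compile(r"\b([a-z0-9-]+)((?:\.<[a-z-]+>)+)")
CANONICAL_TAIL = ".<location-code>.<sovereign-domain>"


def non_canonical_hosts(text):
    """Return placeholder hostnames whose tail is not the canonical one."""
    return [
        match.group(0)
        for match in PLACEHOLDER_HOST.finditer(text)
        if match.group(2) != CANONICAL_TAIL
    ]
```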

Fixes:
- CLAUDE.md "Customer Sync" — both gitea.<sovereign>/catalog/... lines.
- docs/SOVEREIGN-PROVISIONING.md §3 DNS-records bullet (3 lines) + §5
  Day-1 login line.
- docs/ARCHITECTURE.md §4 write-path Gitea label.
- docs/BLUEPRINT-AUTHORING.md §6.4 private-Blueprint Studio target.
- platform/librechat/README.md Keycloak issuer (Pass 22 marked clean and
  missed this — banner scans miss YAML-block drift).

platform/nemo-guardrails/README.md verified clean.

Final grep confirms only canonical forms remain. Validation log Pass 29
entry added with the recurring-drift-pattern note for future passes.
2026-04-27 22:30:41 +02:00
hatiyildiz
2c886daa52 docs(pass-25): llm-gateway DNS placeholders + IMPLEMENTATION-STATUS clean
platform/llm-gateway/README.md had three malformed DNS placeholders:
- KEYCLOAK_URL collapsed location-code + sovereign-domain into <domain> and
  used Application namespace `ai-hub` as a Keycloak realm name. Per NAMING §7
  and SECURITY §7, Keycloak realms are per-Org in SME-style or per-Sovereign
  in corporate-style — never per-Application-namespace. Fixed to
  `keycloak.<location-code>.<sovereign-domain>/realms/<org>`.
- ANTHROPIC_BASE_URL and `claude config set api_base` examples used
  `llm-gateway.ai-hub.<domain>/v1` — but NAMING §5.2 establishes
  Application endpoints as `{app}.{environment}.{sovereign-domain}`.
  Fixed to `llm-gateway.<env>.<sovereign-domain>/v1`.

docs/IMPLEMENTATION-STATUS.md confirmed clean: CRD list, surfaces, and
control-plane component list all match canonical docs.

Sweep concern logged for `harbor.<domain>` / `:latest` image patterns
appearing across many platform READMEs — to be addressed in a dedicated
sweep pass rather than asymmetrically here.

Validation log Pass 25 entry added.
2026-04-27 22:22:32 +02:00
hatiyildiz
5f028d1b6a docs(pass-20): SOVEREIGN-PROVISIONING placement YAML + Kyverno label drift
Pass 20 — drift-detection on SOVEREIGN-PROVISIONING + platform/kyverno.
Two real findings.

SOVEREIGN-PROVISIONING.md §8:
- "Existing Applications with `placement: active-active: false,
  single-region` do not migrate automatically" — invalid YAML
  mixing a boolean with an enum. The canonical placement model
  (per GLOSSARY) has `placement.mode: single-region | active-
  active | active-hotstandby`, no boolean toggle.
- Rewrote: "Existing Applications with `placement.mode: single-
  region` ... user explicitly switches Placement to active-active
  (or active-hotstandby) and adds the new region to
  placement.regions".
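
The enum-only placement model can be illustrated with a small validator. This is a sketch under assumptions: only `mode` and `regions` are taken from the text above, and the function name and error wording are hypothetical.

```python
VALID_MODES = {"single-region", "active-active", "active-hotstandby"}


def validate_placement(placement):
    """Return a list of problems; an empty list means the placement is valid."""
    errors = []
    mode = placement.get("mode")
    if mode not in VALID_MODES:
        errors.append(f"placement.mode must be one of {sorted(VALID_MODES)}, got {mode!r}")
    # The canonical model has no boolean toggles such as `active-active: false`.
    for key, value in placement.items():
        if isinstance(value, bool):
            errors.append(f"placement.{key} is a boolean toggle; use placement.mode")
    return errors
```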

platform/kyverno/README.md:
- Policy V5 (minimum-replicas-production) targeted namespaces
  labeled `openova.io/env: production` — out-of-spec label name
  AND value. NAMING-CONVENTION §6 establishes `openova.io/env-type:
  prod` (hyphen-form, short value).
- Fixed to `openova.io/env-type: prod`.

Both findings show the same pattern: schema-level details that
survive grep-based banned-term checks but contradict the canonical
spec when read in body.

VALIDATION-LOG: Pass 20 entry added.

Refs #37
2026-04-27 22:06:24 +02:00
hatiyildiz
b467dc3f3b docs(pass-18): NAMING DR-as-env_type misexample + Keycloak deployment topology
Pass 18 — drift-detection on NAMING-CONVENTION + platform/keycloak.
Two real findings.

NAMING-CONVENTION §11.1:
- The example list of Catalyst Environments included `bankdhofar-dr`
  — but `dr` is NOT a valid env_type. Canonical values per §2.4 are
  prod / stg / uat / dev / poc. DR is a Placement mode
  (active-active / active-hotstandby across regions inside the
  *-prod Environment), not a separate Environment.
- Replaced `bankdhofar-dr` with `bankdhofar-uat` and added an
  explicit "DR is a Placement, not an Env Type" note.
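
A sketch of the env_type rule, assuming Environment names end in `-<env_type>` as the `bankdhofar-uat` example suggests; the function name is hypothetical.

```python
VALID_ENV_TYPES = ("prod", "stg", "uat", "dev", "poc")


def env_type_of(environment_name):
    """Return the env_type suffix if it is canonical, otherwise None."""
    prefix, separator, suffix = environment_name.rpartition("-")
    if not separator or not prefix:
        return None  # no "<org>-" prefix present
    return suffix if suffix in VALID_ENV_TYPES else None
```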

platform/keycloak/README.md:
- Keycloak Deployment YAML example used `namespace: open-banking`
  with 2 replicas — Fingate-specific narrative that contradicted
  the per-Org / per-Sovereign topology stated in the banner.
  Rewrote with two side-by-side examples:
  * shared-sovereign (3 HA replicas, catalyst-keycloak namespace,
    CNPG-backed)
  * per-organization (1 replica in <org> namespace, optional
    embedded DB for smallest SME tier)
- HA section was a single set of claims (2+ replicas, CNPG, Infinispan)
  that only matched corporate. Now branches on topology — corporate
  gets HA + Infinispan, SME gets single replica with restart-on-
  deploy as acceptable for tier SLAs.

Same kind of drift Pass 17 caught in Harbor: banner says one thing,
body still describes the older model. Both fixed.

VALIDATION-LOG: Pass 18 entry added.

Refs #37
2026-04-27 22:00:42 +02:00
hatiyildiz
eff264b077 docs(pass-17): ARCHITECTURE OAM table pipe-fix + Harbor README de-drift
Pass 17 — drift-detection sweep on ARCHITECTURE + harbor. Two real
findings.

ARCHITECTURE §13 (OAM table):
- `| Trait | Blueprint overlay (`overlays/small|medium|large`) |`
  has pipe chars inside backticks inside a Markdown table cell —
  a known GFM rendering hazard. Replaced with comma-separated
  examples.
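
This hazard can be found mechanically. The sketch below flags table rows whose inline code spans contain an unescaped pipe; it is a heuristic with an assumed function name, not a full Markdown parser.

```python
import re

# Inline code span delimited by single backticks.
CODE_SPAN = re.compile(r"`[^`]*`")


def risky_table_rows(markdown):
    """Return line numbers of table rows with a bare '|' inside a code span."""
    rows = []
    for lineno, line in enumerate(markdown.splitlines(), start=1):
        if not line.lstrip().startswith("|"):
            continue  # not a table row
        spans = CODE_SPAN.findall(line)
        # Escaped pipes (\|) are ignored before checking.
        if any("|" in span.replace("\\|", "") for span in spans):
            rows.append(lineno)
    return rows
```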

platform/harbor/README.md:
- The banner added in Pass 9 said "every host cluster runs a
  Harbor instance" but the body still described an older
  "Harbor Primary / Harbor Replica" cross-region replication
  topology. Same shape of architectural drift Pass 7 caught in
  OpenBao/ESO/Gitea/Flux — banner-add doesn't rewrite the body.
- Three sections rewritten:
  * Overview mermaid: now shows upstream-OCI → multiple
    independent per-cluster Harbors with local Trivy scan + local
    Pod pulls.
  * "Multi-Region Replication" → "Per-host-cluster mirroring (NOT
    primary-replica)". Single source of truth = upstream OCI
    (ghcr.io/openova-io/* for Catalyst+Blueprints, customer CI for
    application images), not a "primary Harbor".
  * Example replication policy: was a `dest_registry` cross-region
    push policy → now a pull-mirror policy from ghcr.io with
    scheduled-cron trigger.
- "Why Mandatory" table reframed in per-host-cluster terms.

VALIDATION-LOG: Pass 17 entry added with the specific drift-detection
lesson — banner-addition passes don't catch body-level drift; need
explicit body re-reads.

Refs #37
2026-04-27 21:58:53 +02:00
hatiyildiz
b6a374df26 docs(pass-15): final banner sweep — 52/52 platform components covered, convergence achieved
Pass 15 swept all 52 platform/*/README.md files for the role-in-
Catalyst banner. 3 still lacked one (cnpg, flux, strimzi) and got
banners added:

- cnpg (§4.1): production Postgres; underlying engine for FerretDB +
  Gitea metadata.
- flux (§3.2): per-vcluster Flux + host-level Flux for Catalyst
  itself; pulls from single per-Sovereign Gitea.
- strimzi (§4.1): Application-tier event streaming; NOT the Catalyst
  control-plane spine (which uses NATS JetStream). Same upstream-
  tech-different-tier disambiguation pattern as Valkey.

CONVERGENCE: 52 / 52 platform components have role-in-Catalyst
banners. All cross-refs resolve. No banned terms. No architectural
drift detected on this pass.

VALIDATION-LOG: Pass 15 entry + "Convergence achieved (initial
banner sweep)" marker added. The validation loop continues per
the standing instruction — but subsequent passes will be brief
drift-detection sweeps rather than systematic rewrites.

Refs #37
2026-04-27 21:53:27 +02:00
hatiyildiz
9b3211fdee docs(pass-14): banners on workflow / analytics / metering / chaos / valkey (7 components)
Seven more Application Blueprint banners landed:

- temporal (§4.3): durable workflow orchestration; bp-fabric.
- flink (§4.3): stream + batch processing; bp-fabric.
- debezium (§4.2): CDC into Strimzi/Kafka; bp-fabric pipeline source.
- iceberg (§4.4): open table format on MinIO + archival S3.
- openmeter (§4.8): API metering for bp-fingate.
- litmus (§4.9): chaos engineering required by DORA / NIS2.
- valkey (§4.1): banner explicitly states NOT a Catalyst control-
  plane component — control plane uses NATS JetStream KV per
  ARCHITECTURE §5 / GLOSSARY event-spine. Valkey is Application-tier
  caching only. This is the disambiguation that PLATFORM-TECH-STACK
  §1 establishes ("same upstream technology can serve in multiple
  categories") — pinned in the per-component README so it can't be
  misread.

VALIDATION-LOG: Pass 14 entry added.

Refs #37
2026-04-27 21:52:03 +02:00
hatiyildiz
b021aaa57e docs(pass-13): role-in-Catalyst banners on 4 Communication Application Blueprints
All 4 communication components (composing under bp-relay) got role-
in-Catalyst banners pointing at PLATFORM-TECH-STACK §4.5:

- stalwart: JMAP/IMAP/SMTP self-hosted email.
- livekit: WebRTC SFU for video/audio/data; pairs with STUNner.
- stunner: K8s-native TURN/STUN for WebRTC NAT traversal.
- matrix: Matrix protocol via Synapse server. Banner explicitly
  disambiguates "Synapse" as the chat-server implementation, NOT
  the deprecated OpenOva product noun (retired in favor of bp-axon).

All 4 are explicitly Application Blueprints, NOT Catalyst control
plane.

VALIDATION-LOG: Pass 13 entry added.

Refs #37
2026-04-27 21:50:05 +02:00
hatiyildiz
9d95043ccc docs(pass-12): role-in-Catalyst banners on 11 AI/ML Application Blueprints
All AI/ML component READMEs got banners pointing at PLATFORM-TECH-
STACK §4.6 (AI/ML) or §4.7 (AI safety + observability), and noting
composition under bp-cortex (composite AI Hub Blueprint):

- knative: serverless for KServe-managed inference.
- kserve: K8s-native model serving for vLLM, BGE, custom.
- vllm: default LLM inference runtime.
- milvus: vector database for RAG retrieval.
- neo4j: knowledge-graph-augmented retrieval alongside Milvus.
- librechat: default chat surface, fronts LLM Gateway via Guardrails.
- bge: embedding generation + reranking.
- llm-gateway: outbound LLM routing (Claude, GPT-4, vLLM, Axon).
- anthropic-adapter: OpenAI-SDK → Anthropic translation.
- nemo-guardrails: AI safety firewall.
- langfuse: LLM observability (latency, tokens, cost, eval).

All 11 are explicitly Application Blueprints — NOT Catalyst control
plane. Catalyst's own observability stack (Grafana/OTel) covers
infrastructure; LangFuse covers AI-specific dimensions
(prompt/response/eval).

VALIDATION-LOG: Pass 12 entry added.

Refs #37
2026-04-27 21:47:45 +02:00
hatiyildiz
e9514b410d docs(pass-11b): retry banners on failover-controller/trivy/clickhouse/ferretdb (Edit needed Read first)
2026-04-27 21:45:56 +02:00
hatiyildiz
ae540269c4 docs(pass-11): banners on 7 more components + MinIO ILM label disambiguation
7 more component READMEs got role-in-Catalyst banners:

Per-host-cluster infrastructure:
- minio (§3.5): S3 fast-tier; tiers cold to cloud archival.
- velero (§3.5): K8s backup to archival S3 (NOT MinIO — that's
  fast-tier; backups land in cloud archival).
- failover-controller (§3.6): lease-based split-brain protection
  layered on k8gb; pointers to SRE §2.4 (witness pattern) +
  SECURITY §5.2 (OpenBao DR promotion).
- trivy (§3.3): CI + registry + runtime scan chain.

Application Blueprints (NOT control plane):
- opensearch (§4.1): explicitly framed as Application Blueprint —
  installed when an Org wants SIEM / full-text search / log analytics.
- clickhouse (§4.1): used by bp-fabric and SIEM cold-storage tier.
- ferretdb (§4.1): replication piggybacks on underlying CNPG.

MinIO ILM disambiguation:
- The Mermaid diagram had `ILM[Lifecycle Manager]` — confusable with
  the rejected Catalyst sub-product (per banned-terms list).
  Relabeled to `ILM[Information Lifecycle Manager - MinIO ILM]` to
  make clear it's MinIO's own feature, not the deprecated Catalyst
  Lifecycle Manager noun.

VALIDATION-LOG: Pass 11 entry added.

Refs #37
2026-04-27 21:45:28 +02:00
hatiyildiz
5834daec14 docs(pass-10): banners on 7 more components + opentofu active-active drift fix
7 more component READMEs got role-in-Catalyst banners:

- vpa, keda, reloader → per-host-cluster scaling/ops layer (§3.4).
  Reloader specifically calls out its role in Catalyst's secret-
  rotation flow (rolling deploy on K8s Secret hash change).
- external-dns → per-host-cluster DNS-sync (§3.1); pairs with k8gb
  for the GSLB zone separation.
- coraza → DMZ-block WAF on every host cluster (§3.1).
- crossplane → per-Sovereign on the management cluster (§3.2);
  banner explicitly emphasizes the agreed "never a user-facing
  surface" rule (Users don't write Compositions in Application
  configs; Blueprint authors and advanced contributors do). Cross-
  references the no-fourth-surface clause in ARCHITECTURE §4/§7
  and the Crossplane Composition section in BLUEPRINT-AUTHORING §8.
- opentofu → repositioned as Phase-0-only, runs on `catalyst-
  provisioner` only, NOT installed on host clusters at runtime.

opentofu drift fixes (uncovered by line-by-line read):
- Section 5 line 182: "Bootstrap Wizard prompts for cloud credentials"
  → "Catalyst Bootstrap (Phase 0) prompts for cloud credentials"
  (banned term).
- Same section line 186: "ESO PushSecrets sync to both regional
  OpenBao instances" — the active-active drift Pass 7 corrected
  elsewhere, still here. Replaced with "writes go to the primary
  OpenBao region only; replicas pick up via async perf replication".

VALIDATION-LOG: Pass 10 entry added.

Refs #37
2026-04-27 21:43:45 +02:00
hatiyildiz
a52bda30cb docs(pass-9b): retry banners on harbor / falco / sigstore / syft-grype
Pass 9's commit ea81c38 only landed banners on grafana + kyverno —
the harbor / falco / sigstore / syft-grype edits failed because the
Edit tool requires a Read pass per file before write. Each file has now
been read and the banners applied:

- harbor: per-host-cluster registry, pointer to PLATFORM-TECH-STACK §3.5.
- falco: per-host-cluster runtime security, pointer to §3.3 + SRE §10
  (SIEM/SOAR pipeline).
- sigstore: cosign signing chain on every Blueprint OCI artifact,
  Kyverno admission verifies signatures.
- syft-grype: CI-side SBOM + runtime CVE matching.

Pass 9 now complete.

Refs #37
2026-04-27 21:41:22 +02:00
hatiyildiz
ea81c38e15 docs(pass-9): role-in-Catalyst banners on grafana / harbor / falco / kyverno / sigstore / syft-grype
Pass 9 — six more component READMEs got Catalyst-role banners
matching the rule of thumb in CLAUDE.md (every platform/<x>/README.md
should state its role in Catalyst).

- grafana: observability stack on every host cluster; Catalyst's
  own self-monitoring + Application telemetry flows here.
- harbor: per-host-cluster container registry for Catalyst images,
  mirrored Blueprint OCI artifacts, customer images.
- falco: runtime security on every host cluster; feeds SIEM/SOAR.
- kyverno: policy engine on every host cluster; enforces Catalyst
  policy contracts (cosign on Blueprints, default-deny NetworkPolicies
  on Organization namespaces, priority-class injection).
- sigstore: cosign-signed Blueprint OCI artifacts + admission
  verification chain on every host cluster.
- syft-grype: SBOM generation in CI per Blueprint + runtime CVE scans.

Plus Kyverno priority-class clarification: prose around `tenant-high`
/ `tenant-default` / `tenant-batch` priority class names now reads
"Organization workloads" instead of "tenant workloads", with an
explicit note that the priority class artifact names themselves stay
as-is until a separate migration ticket renames them in deployed
clusters (renaming PriorityClass objects requires recreate, not
in-place rename).

VALIDATION-LOG: Pass 9 entry added.

Refs #37
2026-04-27 21:40:51 +02:00
hatiyildiz
14ed84de41 docs(pass-8): role-in-Catalyst banners + dead-link fix in component READMEs
Pass 8 — line-by-line read of platform/cnpg, platform/strimzi,
platform/k8gb, platform/keycloak, platform/cert-manager, platform/cilium.

CNPG and Strimzi: read in full and confirmed clean — they correctly
position themselves as Application Blueprints and don't drift from
the canonical model. CNPG's `<org>-postgres-dr` cluster name
(Application-tier database role) is acceptable per NAMING-CONVENTION
§1.3 (which only forbids primary/dr in K8s host-cluster names, not
in Application-internal CRD names).

Four READMEs updated:

k8gb:
- Header reframed: per-host-cluster infrastructure pointer to
  PLATFORM-TECH-STACK §3.1 and SRE §2.4 split-brain protection.
- Removed dead link to ../failover-controller/docs/ADR-FAILOVER-
  CONTROLLER.md (the failover-controller folder has no docs/);
  replaced with link to that component's README + SRE §2.4.

keycloak:
- Header reframed from "FAPI Authorization Server for Open Banking"
  (narrow) to "User identity for Catalyst Sovereigns" (broad).
  Keycloak handles ALL user identity in Catalyst, not just FAPI.
- Added per-Org / per-Sovereign topology callout matching SECURITY
  §6. Clarified that "Multi-tenant TPP" refers to PSD2 Third Party
  Providers, not Catalyst's Organization-level multi-tenancy.
- FAPI features kept since Keycloak still serves Fingate as the
  FAPI Authorization Server.

cert-manager:
- Header reframed as per-host-cluster infrastructure with pointer
  to PLATFORM-TECH-STACK §3.3.

cilium:
- Header reframed as per-host-cluster infrastructure with pointer
  to PLATFORM-TECH-STACK §3.1, including the install-first note
  (CNI must come before any other workload during Phase 0).

VALIDATION-LOG: Pass 8 entry added.

Refs #37
2026-04-27 21:39:03 +02:00
hatiyildiz
a5ffa1a716 docs(pass-7): align Gitea + Flux multi-region story; fix broken mermaid id
Continuing Pass 7 cleanup after the OpenBao/ESO rewrite (42aeb62).

Gitea README:
- Was describing "Bidirectional mirroring for multi-region" with two
  Gitea instances mirroring repos cross-region. Wrong: Catalyst's
  agreed model has one Gitea per Sovereign on the management cluster
  (PLATFORM-TECH-STACK §2.3). Replaced the multi-region mirror
  diagram with a single-Gitea + intra-cluster HA topology and added
  a "Why not cross-region bidirectional mirror" explainer
  (write-conflict semantics would break EnvironmentPolicy enforcement).
- Status banner: notes the canonical references.
- Backup section: removed "Repository mirror for redundancy"
  (replaced with Velero scheduled backups).

Flux README:
- "Multi-Region GitOps" section was showing one Gitea per region
  with bidirectional mirror. Replaced with one Gitea per Sovereign
  topology. Per-vcluster Flux pulls from this single Gitea.

Mermaid syntax bug:
- Earlier mass replace_all of "Catalyst IDP" → "Catalyst console"
  had left an invalid mermaid node identifier
  `Catalyst console[Catalyst console]` (mermaid forbids spaces in
  node IDs). Fixed to `Console[Catalyst console]`. Would have
  rendered as a broken diagram on GitHub.
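
A check along these lines might catch the same class of bug before it reaches a rendered README. This is only a sketch: the helper name `invalid_node_ids` and the regular expression are assumptions made for illustration, and it only inspects lines that consist of a single `ID[label]` node definition.

```python
import re

# Hypothetical pattern: matches a line that is just `ID[label]`.
# Edge definitions and other node shapes are deliberately ignored.
NODE_DEF = re.compile(r"^\s*(?P<id>[^\[\]\-\>]+?)\[(?P<label>[^\]]*)\]\s*$")


def invalid_node_ids(mermaid_source: str) -> list[str]:
    """Return node IDs that contain a space, which mermaid rejects."""
    bad = []
    for line in mermaid_source.splitlines():
        match = NODE_DEF.match(line)
        if match is None:
            continue
        node_id = match.group("id").strip()
        if " " in node_id:
            bad.append(node_id)
    return bad


diagram = """flowchart LR
    Catalyst console[Catalyst console]
    Console[Catalyst console]
"""
print(invalid_node_ids(diagram))
```

Only the first node definition is reported; the second uses the space-free `Console` ID with the same label, which is the shape of the fix described above.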

VALIDATION-LOG: Pass 7 entry added documenting the OpenBao/ESO
active-active rewrite (the most consequential drift fix in any pass).

Refs #37
2026-04-27 21:36:20 +02:00
hatiyildiz
42aeb629bb docs(pass-7): rewrite OpenBao + ESO READMEs to match agreed multi-region semantics
Pass 7 — line-by-line read of platform/openbao/README.md and
platform/external-secrets/README.md found a major architectural drift:
both files described an OLD active-active bidirectional sync model
that contradicts docs/SECURITY.md §5 (the canonical reference).

The active-active design was rejected during the architecture session
because it would have been a stretched cluster — a single region's
network blip would block writes everywhere. The agreed model is:

- Independent Raft cluster per region (intra-region quorum only).
- Single-primary writes; replicas accept reads only.
- Async Performance Replication primary → replicas (lag <1s typical).
- Explicit DR promotion (sovereign-admin or failover-controller).

Fixes:

platform/openbao/README.md:
- Overview: removed "active-active deployments" / "either region can
  update secrets". Replaced with "independent Raft cluster per region",
  "asynchronous Performance Replication".
- Architecture diagram: replaced bidirectional-push diagram with the
  primary→replicas async perf replication topology that matches
  SECURITY.md §5.
- ClusterSecretStores: simplified from "two stores (local+remote)" to
  "one local store"; reads always pull locally.
- Renamed "PushSecret (Bidirectional)" → "Writes go to the primary
  region" with a single-target PushSecret pointing at bao-primary.
- Added DR promotion section pointing at SECURITY.md §5.2.
- Status banner: notes that the canonical multi-region reference is
  SECURITY.md.

platform/external-secrets/README.md:
- Header line: repositioned as per-host-cluster infrastructure with
  pointer to PLATFORM-TECH-STACK §3.3.
- Removed broken link to non-existent ../openbao/docs/ADR-OPENBAO.md
  (replaced with link to ../openbao/README.md).
- "Multi-region sync | Push to both OpenBao instances simultaneously"
  → "Multi-region reads | Async perf replication".
- "PushSecret to Multiple OpenBao Instances" example was writing to
  two ClusterSecretStores in parallel — replaced with single-target
  primary write.
- "Multi-region sync via single PushSecret" in Consequences →
  "Cross-region availability via Performance Replication".
- Mermaid sequence diagram: "Bootstrap Wizard" actor → "Catalyst
  Bootstrap (Phase 0)"; "Terraform" → "OpenTofu"; ESO connection
  description "via K8s auth" → "via SPIFFE SVID (workload identity)".

These were the most consequential drift fixes found in any pass —
two READMEs were documenting an architecture explicitly rejected by
the agreed model.

Refs #37
2026-04-27 21:34:09 +02:00
hatiyildiz
d6a51b8a7a docs(pass-2): final entity-noun sweep — external-secrets sequence diagram
Pass 2 — fresh-eyes sweep across the entire docs tree. One residual
entity-noun usage found:

- platform/external-secrets/README.md:75 (in a Mermaid sequence
  diagram): "Note over Wizard: Operator saves unseal keys offline"
  — "Operator" used as person/entity. Renamed to "sovereign-admin"
  to match the role from GLOSSARY.md.

All other banned-term sweeps clean:
- No tenant (architectural) anywhere.
- No Catalyst IDP anywhere.
- No Synapse-as-product anywhere (only the legitimate
  "Matrix/Synapse server" usages).
- No workspace-controller (only the banned-term entries that define
  the rename).
- No capital-W Workspace as Catalyst scope.
- No github.com/openova (without -io).
- All cross-doc Markdown links resolve.
- All §X references resolve to the new section numbering after
  PLATFORM-TECH-STACK reorg.
- API group catalyst.openova.io/v1alpha1 consistent across 6 references.
- OCI artifact prefix `bp-` consistent across README, CLAUDE,
  BLUEPRINT-AUTHORING, IMPLEMENTATION-STATUS.

Other "Operator" mentions intentionally retained (legitimate
technical usage):
- "External Secrets Operator (ESO)", "Trivy Operator" — K8s
  Operator pattern (controllers), explicitly allowed by GLOSSARY.
- "Operator compatibility" in BUSINESS-STRATEGY's OpenShift migration
  table — refers to compatibility with K8s Operators (the technology),
  not as an entity/role.

Refs #37
2026-04-27 21:18:55 +02:00
hatiyildiz
119a1e53a0 docs(components): terminology pass across platform and product READMEs
Bring per-component READMEs in line with the canonical glossary
(docs/GLOSSARY.md). Substantive architectural content unchanged —
this is a terminology + reference correctness pass.

Placeholder rename: <tenant> → <org> in YAML / IaC examples across
- platform/cnpg/README.md           (Cluster + Pooler + ScheduledBackup)
- platform/debezium/README.md       (PostgreSQL connector + topic patterns)
- platform/external-secrets/README.md (ExternalSecret / SecretStore)
- platform/grafana/README.md        (Instrumentation namespace)
- platform/k8gb/README.md           (Gslb + namespace + kubectl examples)
- platform/keda/README.md           (ScaledObject + Kafka triggers + Prometheus)
- platform/opentofu/README.md       (server resource example)
- platform/velero/README.md         (BackupStorageLocation buckets)
- platform/vpa/README.md            (VerticalPodAutoscaler examples)
- platform/flux/README.md           (kustomization name + tenants/ → organizations/)

"Catalyst IDP" → "Catalyst console":
- platform/crossplane/README.md     (integration section retitled and
                                      rewritten — Crossplane is platform
                                      plumbing, not user-facing)
- platform/gitea/README.md          (architecture diagram + integration table)
- platform/kyverno/README.md        (rollout tracking surface)
- products/fingate/README.md        (TPP onboarding portal)

"Bootstrap wizard" → "Catalyst bootstrap":
- platform/openbao/README.md        (bootstrap procedure rewritten —
                                      independent Raft per region clarified;
                                      cross-references docs/SECURITY.md §5)
- platform/opentofu/README.md       (Quick Start)

Kyverno labels & prose:
- openova.io/tenant → openova.io/organization (label rename for
  consistency; deployed clusters will add the new label as a co-label
  during the migration window)
- "tenant labels" / "tenant namespace" prose updated to
  "Organization labels" / "Organization-labeled namespace"
- Priority class names (tenant-high, tenant-default, tenant-batch)
  retained as deployed artifact names — rename pending in a
  separate migration ticket
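
The co-label step of the rename can be pictured as below. This is a sketch of the label logic only, not the Kyverno policy itself; the function name `add_co_label` is hypothetical, and copying the old label's value unchanged is an assumption.

```python
OLD_KEY = "openova.io/tenant"
NEW_KEY = "openova.io/organization"


def add_co_label(labels: dict[str, str]) -> dict[str, str]:
    """Return a copy of `labels` with the new key added alongside the old one.

    The old key is kept so selectors that still match on it keep working
    during the migration window. An existing new-key value is never overwritten.
    """
    updated = dict(labels)
    if OLD_KEY in updated and NEW_KEY not in updated:
        updated[NEW_KEY] = updated[OLD_KEY]  # assumption: same value on both keys
    return updated


print(add_co_label({"openova.io/tenant": "acme", "app": "api"}))
```

Keeping both keys makes the change additive, so the old label can be dropped in a later, separate step.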

No banned-term hits remain in component READMEs (verified by grepping
for each entry in the docs/GLOSSARY.md banned-terms table).
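
A sweep of this kind could be scripted roughly as follows. The term list is an illustrative subset taken from the renames in this commit message, not the actual table in docs/GLOSSARY.md; the function name `find_banned_terms` and the case-insensitive matching are both assumptions.

```python
import re

# Illustrative subset only; the authoritative list is the banned-terms table.
BANNED_TERMS = ["Catalyst IDP", "Bootstrap wizard", "<tenant>"]


def find_banned_terms(text: str, terms=BANNED_TERMS) -> list[tuple[int, str]]:
    """Return (line_number, term) for every banned term found in `text`."""
    hits = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for term in terms:
            # re.escape keeps characters such as '<' and '>' literal.
            if re.search(re.escape(term), line, flags=re.IGNORECASE):
                hits.append((line_no, term))
    return hits


sample = "Use the Catalyst console\nnamespace: <tenant>-db\nRun the bootstrap wizard"
print(find_banned_terms(sample))
```

An empty result for every README would correspond to the "no banned-term hits" claim; a real sweep would also need the glossary's allowed exceptions, which this sketch does not model.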

Refs #37
2026-04-27 20:06:51 +02:00
talent-mesh
435f49738d feat: restructure platform to 52 components and 9 products
Technology forecast and strategic review restructure:
- Remove 13 components (backstage, mongodb, activemq, vitess, airflow, camel, dapr, superset, searxng, langserve, trino, lago, rabbitmq)
- Add 10 components (sigstore, syft-grype, nemo-guardrails, langfuse, reloader, matrix, ferretdb, litmus, livekit, coraza)
- Rename product: Synapse → Axon (SaaS LLM Gateway)
- Merge products: Titan + Fuse → Fabric (Data & Integration)
- New product: Relay (Communication)
- Replace Backstage with Catalyst IDP
- Replace MongoDB with FerretDB (MongoDB wire protocol on CNPG)
- Add supply chain security (Sigstore/Cosign, Syft+Grype)
- Add AI safety and observability (NeMo Guardrails, LangFuse)
- Add technology forecast 2027-2030 document
- Full verification pass: zero stale references across all docs

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-26 21:00:19 +00:00
talent-mesh
10245dff98 feat: ecosystem expansion to 55 components with license compliance
- Replace BSL-licensed components with open-source alternatives:
  Terraform→OpenTofu (MPL 2.0), Vault→OpenBao (MPL 2.0),
  Redpanda→Strimzi/Kafka (Apache 2.0), n8n→Airflow (Apache 2.0)
- Add 14 new platform components: activemq, camel, clickhouse, dapr,
  debezium, falco, flink, iceberg, opensearch, rabbitmq, superset,
  temporal, trino, vitess
- Rename meta-platforms/ to products/ with new product names:
  Cortex (AI Hub), Fingate (Open Banking), Titan (Data Lakehouse),
  Fuse (Microservices Integration)
- Update all documentation, READMEs, and cross-references

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-11 18:15:11 +00:00
talent-mesh
bb53df55bb docs: comprehensive Kyverno policy matrix for resilience and zero-trust
Cover 44 policies across generate (VPA, PDB, NetworkPolicy, ResourceQuota,
LimitRange), mutate (topology spread, anti-affinity, security context,
seccomp, Harbor image rewrite, priority class), and validate (resource
requests, health probes, min replicas, pod security restricted profile,
image supply chain, network zero-trust, RBAC hardening).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-09 05:29:05 +00:00
talent-mesh
c9d04a53b4 refactor: flatten platform/ structure (41 components)
Remove hierarchical grouping (networking/, security/, etc.) and use flat
structure for all 41 platform components.

Changes:
- All components now directly under platform/ (no subfolders)
- AI Hub components moved from meta-platforms/ai-hub/components/ to platform/
- Open Banking components (lago, openmeter) moved to platform/
- meta-platforms/ now only contains README files that reference platform/
- Open Banking custom services remain in meta-platforms/open-banking/services/

Structure:
- platform/ (41 components, flat)
- meta-platforms/ai-hub/ (README only, references platform/)
- meta-platforms/open-banking/ (README + 6 custom services)

All documentation links updated.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-08 15:19:48 +00:00
talent-mesh
49f8bbc84d refactor: move harbor to registry/, kyverno to policy/
- Harbor moved from storage/ to registry/ (artifact management, not storage)
- Kyverno moved from security/ to policy/ (policy engine for validation,
  mutation, generation - broader than just security)

Updated structure:
- platform/registry/harbor/
- platform/policy/kyverno/

All documentation links updated accordingly.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-08 11:53:21 +00:00
talent-mesh
535710289c feat: create OpenOva monorepo structure
Consolidate all component repos into a single monorepo:

- core/: Bootstrap + Lifecycle Manager application
- platform/: Individual component blueprints organized by category
  - networking/ (cilium, k8gb, external-dns, stunner)
  - security/ (cert-manager, external-secrets, vault, kyverno, trivy)
  - observability/ (grafana stack)
  - storage/ (minio, harbor, velero)
  - scaling/ (keda, vpa)
  - failover/ (failover-controller)
  - gitops/ (flux, gitea)
  - idp/ (backstage)
  - data/ (cnpg, mongodb, valkey, redpanda)
  - communication/ (stalwart)
  - iac/ (terraform, crossplane)
  - identity/ (keycloak)
- meta-platforms/: Bundled vertical solutions
  - ai-hub/ (enterprise AI platform)
  - open-banking/ (PSD2/FAPI fintech sandbox)
- docs/: Platform documentation (PLATFORM-TECH-STACK.md, SRE.md)

All internal links updated to use relative paths within monorepo.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-08 10:53:18 +00:00