Commit Graph

284 Commits

Author SHA1 Message Date
hatiyildiz
b00ec8f4df docs(pass-30): core/README catalyst-provisioner scope confusion + neo4j clean
core/README.md "User journeys" table had: "Sovereign bootstrap | Phase 0
done by catalyst-provisioner; this codebase contains the OpenTofu modules
under apps/provisioning/opentofu/..." — conflating two distinct services.

Per SOVEREIGN-PROVISIONING.md §2, catalyst-provisioner is a separate
Blueprint (bp-catalyst-provisioner) — explicitly "not part of any
Sovereign at runtime" — and lives outside core/. The core/apps/provisioning/
service is for runtime Application provisioning (validate configSchema,
compose manifests, commit to Environment's Gitea repo), an entirely
different concern from Phase 0 Sovereign bootstrap. Rewritten to call out
the separation.

platform/neo4j/README.md: clean.

Recurring shorthand note: ws.<env>.> JetStream subjects in core/README +
ARCHITECTURE (5 instances) treated as documented shorthand — precise form
per NAMING §11.2 is ws.{org}-{env_type}.>. Tightening deferred.

Validation log Pass 30 entry added.
2026-04-27 22:32:22 +02:00
hatiyildiz
4793cab8b6 docs(pass-29): DNS-placeholder sweep across canonical docs
The recurring drift: Catalyst control-plane DNS placeholders that omit the
<location-code> segment, producing forms like gitea.<sovereign>,
gitea.<sovereign>.<domain>, gitea.<sovereign-domain>, keycloak.<domain>.
Per NAMING §5.1 the canonical form is
{component}.{location-code}.{sovereign-domain} (e.g. gitea.hfmp.openova.io).
The shorter forms aren't just abbreviations — they collapse the multi-region
location dimension and re-drift every time a reader reads them as obvious
shorthand.
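
A rough sketch of how this drift could be detected mechanically, assuming the canonical placeholder is written exactly as `<location-code>.<sovereign-domain>`. The component list and the function name `find_dns_drift` are illustrative assumptions, not part of the repository:

```python
import re

# Assumed list of control-plane components to check; extend as needed.
COMPONENTS = ("gitea", "keycloak")
# Placeholder spelling of the canonical form from NAMING §5.1.
CANONICAL_SUFFIX = ".<location-code>.<sovereign-domain>"

# A component name followed by one or more ".<placeholder>" segments.
_PLACEHOLDER = re.compile(
    r"\b(" + "|".join(COMPONENTS) + r")((?:\.<[^<>\s]+>)+)"
)


def find_dns_drift(text: str) -> list[str]:
    """Return placeholder hosts that are not in the canonical three-part form."""
    drift = []
    for match in _PLACEHOLDER.finditer(text):
        component, suffix = match.groups()
        if suffix != CANONICAL_SUFFIX:
            drift.append(component + suffix)
    return drift
```

Concrete hostnames such as gitea.hfmp.openova.io are not matched by this sketch; it only looks at angle-bracket placeholders.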

Fixes:
- CLAUDE.md "Customer Sync" — both gitea.<sovereign>/catalog/... lines.
- docs/SOVEREIGN-PROVISIONING.md §3 DNS-records bullet (3 lines) + §5
  Day-1 login line.
- docs/ARCHITECTURE.md §4 write-path Gitea label.
- docs/BLUEPRINT-AUTHORING.md §6.4 private-Blueprint Studio target.
- platform/librechat/README.md Keycloak issuer (Pass 22 marked clean and
  missed this — banner scans miss YAML-block drift).

platform/nemo-guardrails/README.md verified clean.

Final grep confirms only canonical forms remain. Validation log Pass 29
entry added with the recurring-drift-pattern note for future passes.
2026-04-27 22:30:41 +02:00
hatiyildiz
bbf1d58910 docs(pass-28): README + minio drift sweep — clean
Top-level README.md and platform/minio/README.md scanned against canonical
docs (GLOSSARY, ARCHITECTURE, NAMING, SECURITY, PLATFORM-TECH-STACK, SRE,
SOVEREIGN-PROVISIONING). No drift found.

Cross-checks recorded in the validation log entry:
- README's Keycloak/OpenBao/NATS phrasing matches Pass 6/7/27 reconciliations.
- README's bp-catalyst-provisioner reference matches SOVEREIGN-PROVISIONING §2.
- minio's bidirectional bucket replication is consistent with SRE §6 and is
  NOT the OpenBao active-active drift category (object storage replication
  is fine; the SECURITY §5 single-writer-per-region rule applies specifically
  to secrets-bearing Raft clusters).

Validation log Pass 28 (clean) entry added.
2026-04-27 22:28:33 +02:00
hatiyildiz
ec6e68a360 docs(pass-27): TECHNOLOGY-FORECAST mandatory/à-la-carte vs PLATFORM-TECH-STACK
opensearch was listed under "Mandatory Components" but per PLATFORM-TECH-STACK
§4.4 + §10 it is an Application Blueprint — customers install it (alongside
ClickHouse + bp-specter) only when they want the SIEM pipeline. Conversely
keycloak was under "A La Carte Components" but §2.1 places it inside the
Catalyst control plane (per-Org realms in SME, per-Sovereign realm in
corporate — present on every Sovereign).

Swapped the two entries and added a classification-basis banner above the
Mandatory section explicitly pointing at PLATFORM-TECH-STACK §2/§3/§4 so the
forecast's Mandatory/A-la-carte axis lines up with the architectural
categorization in canonical docs.

platform/milvus/README.md: clean.

Validation log Pass 27 entry added.
2026-04-27 22:27:09 +02:00
hatiyildiz
1a95866532 docs(pass-26): BUSINESS-STRATEGY OpenBao active-active drift + Catalyst conflation
§8.4 (CISO value prop) still described "OpenBao per-cluster with ESO PushSecrets
for cross-cluster secret sync" — the active-active model SECURITY §5 rejected
and Pass 7 corrected in component READMEs. Replaced with per-region independent
Raft + async Performance Replication; ESO scoped to in-region. Added the SPIFFE/
SPIRE 5-minute SVID line that fits the CISO frame.

§5.1 (Product Family) had two entries — "OpenOva (the core platform)" and
"OpenOva Catalyst (the platform)" — describing the same thing under two names.
Per GLOSSARY: OpenOva is the company, Catalyst is the platform. Removed the
duplicate "OpenOva" row, expanded the Catalyst row to absorb its content, and
added a Company/Platform/Sovereign vocabulary banner above the table.

§5.2 (Architecture Relationship diagram) had OPENOVA at the top as the platform.
Replaced with CATALYST + a footer clarifying each child is a composite Blueprint.

platform/matrix/README.md: clean.

Validation log Pass 26 entry added.
2026-04-27 22:24:50 +02:00
hatiyildiz
2c886daa52 docs(pass-25): llm-gateway DNS placeholders + IMPLEMENTATION-STATUS clean
platform/llm-gateway/README.md had three malformed DNS placeholders:
- KEYCLOAK_URL collapsed location-code + sovereign-domain into <domain> and
  used Application namespace `ai-hub` as a Keycloak realm name. Per NAMING §7
  and SECURITY §7, Keycloak realms are per-Org in SME-style or per-Sovereign
  in corporate-style — never per-Application-namespace. Fixed to
  `keycloak.<location-code>.<sovereign-domain>/realms/<org>`.
- ANTHROPIC_BASE_URL and `claude config set api_base` examples used
  `llm-gateway.ai-hub.<domain>/v1` — but NAMING §5.2 establishes
  Application endpoints as `{app}.{environment}.{sovereign-domain}`.
  Fixed to `llm-gateway.<env>.<sovereign-domain>/v1`.
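
The two DNS shapes involved here can be sketched as small builders. The function names are hypothetical, and the sample values (`hfmp`, `openova.io`, `core-banking`) are borrowed from examples elsewhere in this log purely for illustration:

```python
def control_plane_host(component: str, location_code: str, sovereign_domain: str) -> str:
    """Catalyst control-plane DNS: {component}.{location-code}.{sovereign-domain}."""
    return f"{component}.{location_code}.{sovereign_domain}"


def application_host(app: str, environment: str, sovereign_domain: str) -> str:
    """Application endpoint DNS: {app}.{environment}.{sovereign-domain}."""
    return f"{app}.{environment}.{sovereign_domain}"


def keycloak_realm_url(location_code: str, sovereign_domain: str, org: str) -> str:
    """Realm path keyed by Organization, never by Application namespace."""
    host = control_plane_host("keycloak", location_code, sovereign_domain)
    return f"{host}/realms/{org}"
```

For example, `keycloak_realm_url("hfmp", "openova.io", "core-banking")` gives `keycloak.hfmp.openova.io/realms/core-banking`.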

docs/IMPLEMENTATION-STATUS.md confirmed clean: CRD list, surfaces, and
control-plane component list all match canonical docs.

Sweep concern logged for `harbor.<domain>` / `:latest` image patterns
appearing across many platform READMEs — to be addressed in a dedicated
sweep pass rather than asymmetrically here.

Validation log Pass 25 entry added.
2026-04-27 22:22:32 +02:00
hatiyildiz
329a36b54d docs(pass-24): SRE Alertmanager webhook URL form + livekit clean
SRE.md §12 (Alertmanager configuration) webhook URLs at lines 442/451 used
`gitea.<sovereign>.<domain>/...` — the two-segment placeholder is malformed
against NAMING §5.1 which establishes Catalyst control-plane DNS as
`{component}.{location-code}.{sovereign-domain}` (e.g. `gitea.hfmp.openova.io`).
Fixed both webhook URLs to `gitea.<location-code>.<sovereign-domain>/...`.

platform/livekit/README.md: clean — banner correct, integration tables
consistent with bp-cortex voice path.

Validation log Pass 24 entry added.
2026-04-27 22:20:17 +02:00
hatiyildiz
c98b7f32be docs(pass-23): PLATFORM-TECH-STACK §7 categorization split + §10 fictional bp-siem fix
Pass 23 — drift-detection on PLATFORM-TECH-STACK §6-§11 (less-
scrutinized in earlier passes) + platform/litmus.

§7.1 Resource estimates:
- Crossplane was listed under "Catalyst control plane" — but
  Crossplane is per-host-cluster infrastructure per §3.2. Same
  categorization slip pattern as the §3 topology fix in Pass 6.
- Split into:
  * §7.1 (Catalyst-specific only): +SPIRE server row that was
    missing; subtotal corrected to ~11.3 GB. Removed Crossplane.
  * New §7.4 (Per-host-cluster overhead): explicit breakdown for
    Cilium / Flux / Crossplane / cert-manager / ESO / Kyverno /
    Trivy / Falco / Harbor / MinIO / Velero / small operators.
    Subtotal ~8.8 GB per host cluster.
- §7.2 heading renamed "Per-Organization vcluster (workload
  regions)" for clarity.

§10 SIEM/SOAR:
- "This pipeline is itself a composite Blueprint (bp-siem)" — but
  bp-siem doesn't exist in §5's composite Blueprint inventory.
  The SIEM pipeline is a COMPOSITION of existing Application
  Blueprints (Strimzi + OpenSearch + ClickHouse + bp-specter on
  top of per-host-cluster Falco/Trivy/Kyverno), not a single
  packaged composite.
- Reworded to make the actual composition explicit. Audit-log
  fallback now correctly points at the Grafana stack
  (per-Sovereign observability) rather than implying SIEM is
  required for any audit retention.

platform/litmus/README.md: clean. Banner correct, integration
table consistent (Grafana, Kyverno, Gitea Actions, failover-
controller integrations all match the agreed model).

VALIDATION-LOG: Pass 23 entry added.

Refs #37
2026-04-27 22:15:40 +02:00
hatiyildiz
4e46559e25 docs(pass-22): PERSONAS Environment name fix — drop Sovereign prefix
Pass 22 — drift-detection on PERSONAS-AND-JOURNEYS + platform/librechat.
One real fix.

PERSONAS-AND-JOURNEYS.md §6.3 Environment view example:
- "Environment: bankdhofar-corp-banking-prod" — three-segment form
  implying Sovereign-Org-EnvType. But NAMING-CONVENTION §11.1
  establishes `{org}-{env_type}` — the Sovereign name is NOT in
  the Environment name. The Sovereign is determined by which
  Catalyst console you're logged into.
- This same doc's §4.2 (Layla narrative) explicitly says
  "Their internal Organizations are `core-banking`, `digital-
  channels`, `analytics`, `corporate-it`" — so the Org is
  `core-banking`, and the Environment in that Org for production
  is `core-banking-prod`.
- Fixed example to `core-banking-prod`.

platform/librechat/README.md: clean. The example
`namespace: ai-hub` is a customer-chosen Application namespace
(illustrative; the actual namespace would be the Cortex Application
name, customer-chosen).

VALIDATION-LOG: Pass 22 entry added.

Refs #37
2026-04-27 22:12:01 +02:00
hatiyildiz
a1f3076888 docs(pass-21): BLUEPRINT-AUTHORING §11 CI pipeline aligned with §2 monorepo fan-out
Pass 21 — drift-detection on BLUEPRINT-AUTHORING + platform/langfuse.
One real fix.

BLUEPRINT-AUTHORING.md §11 (CI pipeline):
- Old version showed `on: push # branch: main # tags: vX.Y.Z` — the
  per-Blueprint-repo CI shape that was explicitly rejected when we
  locked Option A (monorepo canonical) in Pass 1.
- §2 already establishes monorepo + path-matrix tag form
  `platform/<name>/v1.2.3` / `products/<name>/v1.2.3`. §11 should
  have matched §2 from the start; this slipped through previous
  passes.
- Rewrote §11: single root-level CI, on.pull_request.paths triggers
  validate, on.push.tags: platform/*/v* | products/*/v* triggers
  build-and-sign with tag-parse → folder-detect → fan-out publish.
  Includes worked example: tagging `platform/wordpress/v1.3.0`
  builds `platform/wordpress/` and publishes
  ghcr.io/openova-io/bp-wordpress:1.3.0.
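
The tag-parse and folder-detect steps could look roughly like the sketch below. The function name `plan_publish` is hypothetical, and applying the same `bp-<name>` image naming to `products/` tags is an assumption; only the `platform/wordpress` case is confirmed by the worked example above:

```python
import re

# Path-matrix tag form: platform/<name>/vX.Y.Z or products/<name>/vX.Y.Z.
_TAG = re.compile(r"^(platform|products)/([a-z0-9][a-z0-9-]*)/v(\d+\.\d+\.\d+)$")


def plan_publish(tag: str) -> dict[str, str]:
    """Map a release tag to the folder to build and the OCI reference to publish."""
    match = _TAG.match(tag)
    if match is None:
        raise ValueError(f"not a path-matrix release tag: {tag!r}")
    root, name, version = match.groups()
    return {
        "folder": f"{root}/{name}/",
        # Assumption: every Blueprint publishes as bp-<name> under this registry.
        "image": f"ghcr.io/openova-io/bp-{name}:{version}",
    }
```

A bare `v1.3.0` tag, the shape used by the rejected per-Blueprint-repo pipeline, raises `ValueError` here instead of triggering a build.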

platform/langfuse/README.md: clean. Banner correct. "Used by:
OpenOva Cortex" is acceptable commercial phrasing alongside the
technical bp-cortex reference.

VALIDATION-LOG: Pass 21 entry added.

Refs #37
2026-04-27 22:09:13 +02:00
hatiyildiz
5f028d1b6a docs(pass-20): SOVEREIGN-PROVISIONING placement YAML + Kyverno label drift
Pass 20 — drift-detection on SOVEREIGN-PROVISIONING + platform/kyverno.
Two real findings.

SOVEREIGN-PROVISIONING.md §8:
- "Existing Applications with `placement: active-active: false,
  single-region` do not migrate automatically" — invalid YAML
  mixing a boolean with an enum. The canonical placement model
  (per GLOSSARY) has `placement.mode: single-region | active-
  active | active-hotstandby`, no boolean toggle.
- Rewrote: "Existing Applications with `placement.mode: single-
  region` ... user explicitly switches Placement to active-active
  (or active-hotstandby) and adds the new region to
  placement.regions".
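
A minimal validation sketch for the enum-only placement model. The field names `mode` and `regions` come from the text above; the function name and the region-count rule (one region for single-region, at least two otherwise) are assumptions made for illustration:

```python
PLACEMENT_MODES = ("single-region", "active-active", "active-hotstandby")


def validate_placement(placement: dict) -> None:
    """Raise ValueError if the placement block does not fit the enum-only model."""
    mode = placement.get("mode")
    if mode not in PLACEMENT_MODES:
        raise ValueError(f"placement.mode must be one of {PLACEMENT_MODES}, got {mode!r}")
    regions = placement.get("regions", [])
    # Assumed rule: multi-region modes need at least two regions.
    if mode == "single-region" and len(regions) != 1:
        raise ValueError("single-region placement expects exactly one region")
    if mode != "single-region" and len(regions) < 2:
        raise ValueError(f"{mode} placement expects at least two regions")
```

The old boolean form, for example `{"active-active": False}`, carries no `mode` key and is rejected.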

platform/kyverno/README.md:
- Policy V5 (minimum-replicas-production) targeted namespaces
  labeled `openova.io/env: production` — out-of-spec label name
  AND value. NAMING-CONVENTION §6 establishes `openova.io/env-type:
  prod` (hyphen-form, short value).
- Fixed to `openova.io/env-type: prod`.

Both findings show the same pattern: schema-level details that
survive grep-based banned-term checks but contradict the canonical
spec when read in body.

VALIDATION-LOG: Pass 20 entry added.

Refs #37
2026-04-27 22:06:24 +02:00
hatiyildiz
c83968877e docs(pass-19): SECURITY + kserve drift sweep — clean
2026-04-27 22:03:48 +02:00
hatiyildiz
b467dc3f3b docs(pass-18): NAMING DR-as-env_type misexample + Keycloak deployment topology
Pass 18 — drift-detection on NAMING-CONVENTION + platform/keycloak.
Two real findings.

NAMING-CONVENTION §11.1:
- The example list of Catalyst Environments included `bankdhofar-dr`
  — but `dr` is NOT a valid env_type. Canonical values per §2.4 are
  prod / stg / uat / dev / poc. DR is a Placement mode
  (active-active / active-hotstandby across regions inside the
  *-prod Environment), not a separate Environment.
- Replaced `bankdhofar-dr` with `bankdhofar-uat` and added an
  explicit "DR is a Placement, not an Env Type" note.
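
The naming rule can be sketched as a small parser. The env_type set is the canonical list quoted above; the function name is hypothetical:

```python
# Canonical env_type values per NAMING-CONVENTION §2.4.
ENV_TYPES = frozenset({"prod", "stg", "uat", "dev", "poc"})


def parse_environment(name: str) -> tuple[str, str]:
    """Split an Environment name of the form {org}-{env_type}."""
    # The Org itself may contain hyphens, so split on the last one only.
    org, sep, env_type = name.rpartition("-")
    if not sep or not org:
        raise ValueError(f"expected {{org}}-{{env_type}}, got {name!r}")
    if env_type not in ENV_TYPES:
        raise ValueError(f"{env_type!r} is not an env_type; DR is a Placement mode")
    return org, env_type
```

`parse_environment("bankdhofar-uat")` returns `("bankdhofar", "uat")`, while `bankdhofar-dr` is rejected. A purely syntactic check like this cannot tell whether a leading segment is a stray Sovereign prefix; that still needs a lookup against the real Organization list.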

platform/keycloak/README.md:
- Keycloak Deployment YAML example used `namespace: open-banking`
  with 2 replicas — Fingate-specific narrative that contradicted
  the per-Org / per-Sovereign topology stated in the banner.
  Rewrote with two side-by-side examples:
  * shared-sovereign (3 HA replicas, catalyst-keycloak namespace,
    CNPG-backed)
  * per-organization (1 replica in <org> namespace, optional
    embedded DB for smallest SME tier)
- HA section was a single set of claims (2+ replicas, CNPG, Infinispan)
  that only matched corporate. Now branches on topology — corporate
  gets HA + Infinispan, SME gets single replica with restart-on-
  deploy as acceptable for tier SLAs.

Same kind of drift Pass 17 caught in Harbor: banner says one thing,
body still describes the older model. Both fixed.

VALIDATION-LOG: Pass 18 entry added.

Refs #37
2026-04-27 22:00:42 +02:00
hatiyildiz
eff264b077 docs(pass-17): ARCHITECTURE OAM table pipe-fix + Harbor README de-drift
Pass 17 — drift-detection sweep on ARCHITECTURE + harbor. Two real
findings.

ARCHITECTURE §13 (OAM table):
- `| Trait | Blueprint overlay (`overlays/small|medium|large`) |`
  has pipe chars inside backticks inside a Markdown table cell —
  a known GFM rendering hazard. Replaced with comma-separated
  examples.
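
A heuristic check for this hazard, assuming table rows start with `|` and code spans use single backticks. The function name is illustrative, and this is not a full GFM parser:

```python
import re

# Single-backtick code spans only; nested or multi-backtick spans are not handled.
_CODE_SPAN = re.compile(r"`[^`]*`")


def rows_with_piped_code(markdown: str) -> list[int]:
    """Return 1-based line numbers of table rows with an unescaped pipe in a code span."""
    hits = []
    for lineno, line in enumerate(markdown.splitlines(), start=1):
        if not line.lstrip().startswith("|"):
            continue  # not a table row under this sketch's assumption
        spans = _CODE_SPAN.findall(line)
        # Escaped pipes (\|) are safe in GFM table cells, so ignore them.
        if any("|" in span.replace(r"\|", "") for span in spans):
            hits.append(lineno)
    return hits
```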

platform/harbor/README.md:
- The banner added in Pass 9 said "every host cluster runs a
  Harbor instance" but the body still described an older
  "Harbor Primary / Harbor Replica" cross-region replication
  topology. Same shape of architectural drift Pass 7 caught in
  OpenBao/ESO/Gitea/Flux — banner-add doesn't rewrite the body.
- Three sections rewritten:
  * Overview mermaid: now shows upstream-OCI → multiple
    independent per-cluster Harbors with local Trivy scan + local
    Pod pulls.
  * "Multi-Region Replication" → "Per-host-cluster mirroring (NOT
    primary-replica)". Single source of truth = upstream OCI
    (ghcr.io/openova-io/* for Catalyst+Blueprints, customer CI for
    application images), not a "primary Harbor".
  * Example replication policy: was a `dest_registry` cross-region
    push policy → now a pull-mirror policy from ghcr.io with
    scheduled-cron trigger.
- "Why Mandatory" table reframed in per-host-cluster terms.

VALIDATION-LOG: Pass 17 entry added with the specific drift-detection
lesson — banner-addition passes don't catch body-level drift; need
explicit body re-reads.

Refs #37
2026-04-27 21:58:53 +02:00
hatiyildiz
71537d6a9d docs(pass-16): drift-detection sweep — clean (post-convergence routine)
2026-04-27 21:55:19 +02:00
hatiyildiz
b6a374df26 docs(pass-15): final banner sweep — 52/52 platform components covered, convergence achieved
Pass 15 swept all 52 platform/*/README.md files for the role-in-
Catalyst banner. 3 still lacked one (cnpg, flux, strimzi) and got
banners added:

- cnpg (§4.1): production Postgres; underlying engine for FerretDB +
  Gitea metadata.
- flux (§3.2): per-vcluster Flux + host-level Flux for Catalyst
  itself; pulls from single per-Sovereign Gitea.
- strimzi (§4.1): Application-tier event streaming; NOT the Catalyst
  control-plane spine (which uses NATS JetStream). Same upstream-
  tech-different-tier disambiguation pattern as Valkey.

CONVERGENCE: 52 / 52 platform components have role-in-Catalyst
banners. All cross-refs resolve. No banned terms. No architectural
drift detected on this pass.

VALIDATION-LOG: Pass 15 entry + "Convergence achieved (initial
banner sweep)" marker added. The validation loop continues per
the standing instruction — but subsequent passes will be brief
drift-detection sweeps rather than systematic rewrites.

Refs #37
2026-04-27 21:53:27 +02:00
hatiyildiz
9b3211fdee docs(pass-14): banners on workflow / analytics / metering / chaos / valkey (7 components)
Seven more Application Blueprint banners landed:

- temporal (§4.3): durable workflow orchestration; bp-fabric.
- flink (§4.3): stream + batch processing; bp-fabric.
- debezium (§4.2): CDC into Strimzi/Kafka; bp-fabric pipeline source.
- iceberg (§4.4): open table format on MinIO + archival S3.
- openmeter (§4.8): API metering for bp-fingate.
- litmus (§4.9): chaos engineering required by DORA / NIS2.
- valkey (§4.1): banner explicitly states NOT a Catalyst control-
  plane component — control plane uses NATS JetStream KV per
  ARCHITECTURE §5 / GLOSSARY event-spine. Valkey is Application-tier
  caching only. This is the disambiguation that PLATFORM-TECH-STACK
  §1 establishes ("same upstream technology can serve in multiple
  categories") — pinned in the per-component README so it can't be
  misread.

VALIDATION-LOG: Pass 14 entry added.

Refs #37
2026-04-27 21:52:03 +02:00
hatiyildiz
b021aaa57e docs(pass-13): role-in-Catalyst banners on 4 Communication Application Blueprints
All 4 communication components (composing under bp-relay) got role-
in-Catalyst banners pointing at PLATFORM-TECH-STACK §4.5:

- stalwart: JMAP/IMAP/SMTP self-hosted email.
- livekit: WebRTC SFU for video/audio/data; pairs with STUNner.
- stunner: K8s-native TURN/STUN for WebRTC NAT traversal.
- matrix: Matrix protocol via Synapse server. Banner explicitly
  disambiguates "Synapse" as the chat-server implementation, NOT
  the deprecated OpenOva product noun (retired in favor of bp-axon).

All 4 are explicitly Application Blueprints, NOT Catalyst control
plane.

VALIDATION-LOG: Pass 13 entry added.

Refs #37
2026-04-27 21:50:05 +02:00
hatiyildiz
9d95043ccc docs(pass-12): role-in-Catalyst banners on 11 AI/ML Application Blueprints
All AI/ML component READMEs got banners pointing at PLATFORM-TECH-
STACK §4.6 (AI/ML) or §4.7 (AI safety + observability), and noting
composition under bp-cortex (composite AI Hub Blueprint):

- knative: serverless for KServe-managed inference.
- kserve: K8s-native model serving for vLLM, BGE, custom.
- vllm: default LLM inference runtime.
- milvus: vector database for RAG retrieval.
- neo4j: knowledge-graph-augmented retrieval alongside Milvus.
- librechat: default chat surface, fronts LLM Gateway via Guardrails.
- bge: embedding generation + reranking.
- llm-gateway: outbound LLM routing (Claude, GPT-4, vLLM, Axon).
- anthropic-adapter: OpenAI-SDK → Anthropic translation.
- nemo-guardrails: AI safety firewall.
- langfuse: LLM observability (latency, tokens, cost, eval).

All 11 are explicitly Application Blueprints — NOT Catalyst control
plane. Catalyst's own observability stack (Grafana/OTel) covers
infrastructure; LangFuse covers AI-specific dimensions
(prompt/response/eval).

VALIDATION-LOG: Pass 12 entry added.

Refs #37
2026-04-27 21:47:45 +02:00
hatiyildiz
e9514b410d docs(pass-11b): retry banners on failover-controller/trivy/clickhouse/ferretdb (Edit needed Read first)
2026-04-27 21:45:56 +02:00
hatiyildiz
ae540269c4 docs(pass-11): banners on 7 more components + MinIO ILM label disambiguation
7 more component READMEs got role-in-Catalyst banners:

Per-host-cluster infrastructure:
- minio (§3.5): S3 fast-tier; tiers cold to cloud archival.
- velero (§3.5): K8s backup to archival S3 (NOT MinIO — that's
  fast-tier; backups land in cloud archival).
- failover-controller (§3.6): lease-based split-brain protection
  layered on k8gb; pointers to SRE §2.4 (witness pattern) +
  SECURITY §5.2 (OpenBao DR promotion).
- trivy (§3.3): CI + registry + runtime scan chain.

Application Blueprints (NOT control plane):
- opensearch (§4.1): explicitly framed as Application Blueprint —
  installed when an Org wants SIEM / full-text search / log analytics.
- clickhouse (§4.1): used by bp-fabric and SIEM cold-storage tier.
- ferretdb (§4.1): replication piggybacks on underlying CNPG.

MinIO ILM disambiguation:
- The Mermaid diagram had `ILM[Lifecycle Manager]` — confusable with
  the rejected Catalyst sub-product (per banned-terms list).
  Relabeled to `ILM[Information Lifecycle Manager - MinIO ILM]` to
  make clear it's MinIO's own feature, not the deprecated Catalyst
  Lifecycle Manager noun.

VALIDATION-LOG: Pass 11 entry added.

Refs #37
2026-04-27 21:45:28 +02:00
hatiyildiz
5834daec14 docs(pass-10): banners on 7 more components + opentofu active-active drift fix
7 more component READMEs got role-in-Catalyst banners:

- vpa, keda, reloader → per-host-cluster scaling/ops layer (§3.4).
  Reloader specifically calls out its role in Catalyst's secret-
  rotation flow (rolling deploy on K8s Secret hash change).
- external-dns → per-host-cluster DNS-sync (§3.1); pairs with k8gb
  for the GSLB zone separation.
- coraza → DMZ-block WAF on every host cluster (§3.1).
- crossplane → per-Sovereign on the management cluster (§3.2);
  banner explicitly emphasizes the agreed "never a user-facing
  surface" rule (Users don't write Compositions in Application
  configs; Blueprint authors and advanced contributors do). Cross-
  references the no-fourth-surface clause in ARCHITECTURE §4/§7
  and the Crossplane Composition section in BLUEPRINT-AUTHORING §8.
- opentofu → repositioned as Phase-0-only, runs on `catalyst-
  provisioner` only, NOT installed on host clusters at runtime.

opentofu drift fixes (uncovered by line-by-line read):
- Section 5 line 182: "Bootstrap Wizard prompts for cloud credentials"
  → "Catalyst Bootstrap (Phase 0) prompts for cloud credentials"
  (banned term).
- Same section line 186: "ESO PushSecrets sync to both regional
  OpenBao instances" — the active-active drift Pass 7 corrected
  elsewhere, still here. Replaced with "writes go to the primary
  OpenBao region only; replicas pick up via async perf replication".

VALIDATION-LOG: Pass 10 entry added.

Refs #37
2026-04-27 21:43:45 +02:00
hatiyildiz
a52bda30cb docs(pass-9b): retry banners on harbor / falco / sigstore / syft-grype
Pass 9's commit ea81c38 only landed banners on grafana + kyverno —
the harbor / falco / sigstore / syft-grype edits failed because the
Edit tool requires a Read pass per file before write. Each file has
now been read and the banners applied:

- harbor: per-host-cluster registry, pointer to PLATFORM-TECH-STACK §3.5.
- falco: per-host-cluster runtime security, pointer to §3.3 + SRE §10
  (SIEM/SOAR pipeline).
- sigstore: cosign signing chain on every Blueprint OCI artifact,
  Kyverno admission verifies signatures.
- syft-grype: CI-side SBOM + runtime CVE matching.

Pass 9 now complete.

Refs #37
2026-04-27 21:41:22 +02:00
hatiyildiz
ea81c38e15 docs(pass-9): role-in-Catalyst banners on grafana / harbor / falco / kyverno / sigstore / syft-grype
Pass 9 — six more component READMEs got Catalyst-role banners
matching the rule of thumb in CLAUDE.md (every platform/<x>/README.md
should state its role in Catalyst).

- grafana: observability stack on every host cluster; Catalyst's
  own self-monitoring + Application telemetry flows here.
- harbor: per-host-cluster container registry for Catalyst images,
  mirrored Blueprint OCI artifacts, customer images.
- falco: runtime security on every host cluster; feeds SIEM/SOAR.
- kyverno: policy engine on every host cluster; enforces Catalyst
  policy contracts (cosign on Blueprints, default-deny NetworkPolicies
  on Organization namespaces, priority-class injection).
- sigstore: cosign-signed Blueprint OCI artifacts + admission
  verification chain on every host cluster.
- syft-grype: SBOM generation in CI per Blueprint + runtime CVE scans.

Plus Kyverno priority-class clarification: prose around `tenant-high`
/ `tenant-default` / `tenant-batch` priority class names now reads
"Organization workloads" instead of "tenant workloads", with an
explicit note that the priority class artifact names themselves stay
as-is until a separate migration ticket renames them in deployed
clusters (renaming PriorityClass objects requires recreate, not
in-place rename).

VALIDATION-LOG: Pass 9 entry added.

Refs #37
2026-04-27 21:40:51 +02:00
hatiyildiz
14ed84de41 docs(pass-8): role-in-Catalyst banners + dead-link fix in component READMEs
Pass 8 — line-by-line read of platform/cnpg, platform/strimzi,
platform/k8gb, platform/keycloak, platform/cert-manager, platform/cilium.

CNPG and Strimzi: read in full and confirmed clean — they correctly
position themselves as Application Blueprints and don't drift from
the canonical model. CNPG's `<org>-postgres-dr` cluster name
(Application-tier database role) is acceptable per NAMING-CONVENTION
§1.3 (which only forbids primary/dr in K8s host-cluster names, not
in Application-internal CRD names).

Four READMEs updated:

k8gb:
- Header reframed: per-host-cluster infrastructure pointer to
  PLATFORM-TECH-STACK §3.1 and SRE §2.4 split-brain protection.
- Removed dead link to ../failover-controller/docs/ADR-FAILOVER-
  CONTROLLER.md (the failover-controller folder has no docs/);
  replaced with link to that component's README + SRE §2.4.

keycloak:
- Header reframed from "FAPI Authorization Server for Open Banking"
  (narrow) to "User identity for Catalyst Sovereigns" (broad).
  Keycloak handles ALL user identity in Catalyst, not just FAPI.
- Added per-Org / per-Sovereign topology callout matching SECURITY
  §6. Clarified that "Multi-tenant TPP" refers to PSD2 Third Party
  Providers, not Catalyst's Organization-level multi-tenancy.
- FAPI features kept since Keycloak still serves Fingate as the
  FAPI Authorization Server.

cert-manager:
- Header reframed as per-host-cluster infrastructure with pointer
  to PLATFORM-TECH-STACK §3.3.

cilium:
- Header reframed as per-host-cluster infrastructure with pointer
  to PLATFORM-TECH-STACK §3.1, including the install-first note
  (CNI must come before any other workload during Phase 0).

VALIDATION-LOG: Pass 8 entry added.

Refs #37
2026-04-27 21:39:03 +02:00
hatiyildiz
a5ffa1a716 docs(pass-7): align Gitea + Flux multi-region story; fix broken mermaid id
Continuing Pass 7 cleanup after the OpenBao/ESO rewrite (42aeb62).

Gitea README:
- Was describing "Bidirectional mirroring for multi-region" with two
  Gitea instances mirroring repos cross-region. Wrong: Catalyst's
  agreed model has one Gitea per Sovereign on the management cluster
  (PLATFORM-TECH-STACK §2.3). Replaced the multi-region mirror
  diagram with a single-Gitea + intra-cluster HA topology and added
  a "Why not cross-region bidirectional mirror" explainer (write-
  conflict semantics would break EnvironmentPolicy enforcement).
- Status banner: notes the canonical references.
- Backup section: removed "Repository mirror for redundancy"
  (replaced with Velero scheduled backups).

Flux README:
- "Multi-Region GitOps" section was showing one Gitea per region
  with bidirectional mirror. Replaced with one Gitea per Sovereign
  topology. Per-vcluster Flux pulls from this single Gitea.

Mermaid syntax bug:
- Earlier mass replace_all of "Catalyst IDP" → "Catalyst console"
  had left an invalid mermaid node identifier
  `Catalyst console[Catalyst console]` (mermaid forbids spaces in
  node IDs). Fixed to `Console[Catalyst console]`. Would have
  rendered as a broken diagram on GitHub.
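
A rough lint for this class of bug, assuming simple flowchart lines built from `ID[label]` nodes and common arrow forms. The function name and the set of arrows handled are assumptions; real Mermaid syntax is considerably richer:

```python
import re

# Common flowchart arrows, optionally followed by a |edge label|.
_ARROW = re.compile(r"\s*(?:-->|---|-\.->|==>)\s*(?:\|[^|]*\|\s*)?")
# A node written as ID[label].
_NODE = re.compile(r"^(?P<id>[^\[\]]+)\[(?P<label>[^\]]*)\]$")


def bad_node_ids(mermaid: str) -> list[str]:
    """Return node IDs that contain whitespace."""
    bad = []
    for line in mermaid.splitlines():
        for segment in _ARROW.split(line.strip()):
            match = _NODE.match(segment.strip())
            if match and re.search(r"\s", match.group("id").strip()):
                bad.append(match.group("id").strip())
    return bad
```

Run against the broken line, this reports `Catalyst console`; the fixed `Console[Catalyst console]` form passes.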

VALIDATION-LOG: Pass 7 entry added documenting the OpenBao/ESO
active-active rewrite (the most consequential drift fix in any pass).

Refs #37
2026-04-27 21:36:20 +02:00
hatiyildiz
42aeb629bb docs(pass-7): rewrite OpenBao + ESO READMEs to match agreed multi-region semantics
Pass 7 — line-by-line read of platform/openbao/README.md and
platform/external-secrets/README.md found a major architectural drift:
both files described an OLD active-active bidirectional sync model
that contradicts docs/SECURITY.md §5 (the canonical reference).

The active-active design was rejected during the architecture session
because it would have been a stretched cluster — a single region's
network blip would block writes everywhere. The agreed model is:

- Independent Raft cluster per region (intra-region quorum only).
- Single-primary writes; replicas accept reads only.
- Async Performance Replication primary → replicas (lag <1s typical).
- Explicit DR promotion (sovereign-admin or failover-controller).

Fixes:

platform/openbao/README.md:
- Overview: removed "active-active deployments" / "either region can
  update secrets". Replaced with "independent Raft cluster per region",
  "asynchronous Performance Replication".
- Architecture diagram: replaced bidirectional-push diagram with the
  primary→replicas async perf replication topology that matches
  SECURITY.md §5.
- ClusterSecretStores: simplified from "two stores (local+remote)" to
  "one local store"; reads always pull locally.
- Renamed "PushSecret (Bidirectional)" → "Writes go to the primary
  region" with a single-target PushSecret pointing at bao-primary.
- Added DR promotion section pointing at SECURITY.md §5.2.
- Status banner: notes that the canonical multi-region reference is
  SECURITY.md.

platform/external-secrets/README.md:
- Header line: repositioned as per-host-cluster infrastructure with
  pointer to PLATFORM-TECH-STACK §3.3.
- Removed broken link to non-existent ../openbao/docs/ADR-OPENBAO.md
  (replaced with link to ../openbao/README.md).
- "Multi-region sync | Push to both OpenBao instances simultaneously"
  → "Multi-region reads | Async perf replication".
- "PushSecret to Multiple OpenBao Instances" example was writing to
  two ClusterSecretStores in parallel — replaced with single-target
  primary write.
- "Multi-region sync via single PushSecret" in Consequences →
  "Cross-region availability via Performance Replication".
- Mermaid sequence diagram: "Bootstrap Wizard" actor → "Catalyst
  Bootstrap (Phase 0)"; "Terraform" → "OpenTofu"; ESO connection
  description "via K8s auth" → "via SPIFFE SVID (workload identity)".

These were the most consequential drift fixes found in any pass —
two READMEs were documenting an architecture explicitly rejected by
the agreed model.

Refs #37
2026-04-27 21:34:09 +02:00
hatiyildiz
8072b012b9 docs: record Pass 6 entry in VALIDATION-LOG
2026-04-27 21:30:59 +02:00
hatiyildiz
fec0c342a8 docs(pass-6): reconcile topology diagram + unify JetStream Account scoping
Pass 6 — fresh-eyes line-by-line read of ARCHITECTURE.md. Found two
internal contradictions that earlier passes missed.

ARCHITECTURE §3 (topology diagram) listed Crossplane, Flux, Harbor,
and grafana-stack INSIDE the Catalyst control plane block. But §11
(Catalyst-on-Catalyst) explicitly says these are per-host-cluster
infrastructure, NOT Catalyst control-plane components. PLATFORM-TECH-
STACK §3 also classifies them as per-host-cluster.

Fixed: §3 topology diagram now shows only true Catalyst control-plane
components (console, marketplace, admin, catalog-svc, projector,
provisioning, environment-controller, blueprint-controller, billing,
gitea, nats-jetstream, openbao, keycloak, spire-server, observability)
and adds a separate line for "Plus per-host-cluster infrastructure"
that defers to PLATFORM-TECH-STACK §3 for the full list (Cilium, Flux,
Crossplane, cert-manager, ESO, Kyverno, Harbor, Reloader, Trivy, Falco,
Sigstore, Syft+Grype, VPA, KEDA, External-DNS, k8gb, Coraza, MinIO,
Velero, failover-controller). Also added the previously-missing
`provisioning` row.

JetStream Account scoping was contradictory:
- ARCHITECTURE §5 said "Per-Org account: ws.{org}-{env_type}.>" —
  reads ambiguously: is the Account per-Org or per-Env?
- NAMING-CONVENTION §11.2 said "One JetStream Account scoped to
  ws.{org}-{env_type}.>" — implied per-Environment.
- GLOSSARY + PLATFORM-TECH-STACK + SECURITY all say per-Organization.

Reconciled to the per-Org-Account-with-per-Env-subjects model:
- Account isolation: ONE NATS Account per Organization.
- Subjects within the Account use prefix `ws.{org}-{env_type}.>` for
  per-Environment partitioning.

This is the cleanest isolation model: Accounts are NATS' strongest
isolation boundary (per-Org); subjects partition further within each
Account (per-Env).
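
A minimal sketch of the reconciled model, assuming subjects are plain dot-separated strings. The helper names are hypothetical and not part of any Catalyst codebase; the matcher only mirrors the standard NATS wildcard rules (`*` matches one token, `>` matches one or more trailing tokens) to show how one per-Org Account can be partitioned per Environment.

```python
def env_subject_prefix(org: str, env_type: str) -> str:
    """Wildcard subject for one Environment inside the Org's Account."""
    return f"ws.{org}-{env_type}.>"


def subject_matches(pattern: str, subject: str) -> bool:
    """Match a subject against a NATS-style pattern.

    `*` matches exactly one token; `>` matches one or more trailing tokens.
    """
    p_tokens = pattern.split(".")
    s_tokens = subject.split(".")
    for i, tok in enumerate(p_tokens):
        if tok == ">":
            # `>` must be the last token and needs at least one token left.
            return i == len(p_tokens) - 1 and len(s_tokens) > i
        if i >= len(s_tokens):
            return False
        if tok != "*" and tok != s_tokens[i]:
            return False
    return len(p_tokens) == len(s_tokens)


prod = env_subject_prefix("acme", "prod")  # "ws.acme-prod.>"
```

Within the acme Account, a consumer bound to `prod` would see `ws.acme-prod.app.created` but not `ws.acme-dev.app.created`; isolation between Organizations still comes from the Account boundary, not from the subject prefix.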

Refs #37
2026-04-27 21:30:03 +02:00
hatiyildiz
7298a7ddca docs(pass-5c): add VALIDATION-LOG.md — trail of multi-pass integrity work
Concluding the validation loop with a process artifact. The new file
records:

- Why the validation existed (post-rewrite trust verification).
- Each pass's scope and concrete fixes (16 iterations across Pass 1
  + sweeps in Passes 2/3/4/5).
- The acceptance criteria as runnable grep commands so any future
  contributor can re-verify.
- Authorship convention (hatiyildiz, per-commit identity flags).
- Re-validation cadence (after rewrites, after new banned terms,
  after component renames, quarterly drift check).

Linked from README.md docs table.

This file is meant as a playbook for the next validation, not a
status snapshot — for status, IMPLEMENTATION-STATUS.md remains
canonical.

Refs #37
2026-04-27 21:27:40 +02:00
hatiyildiz
ba048d2fd7 docs(pass-5b): scrub remaining "instance" usages where "Application" is meant
Two user-facing residuals where the banned product term "instance"
slipped through:

- docs/ARCHITECTURE.md §9: example console dialog "Use existing
  instance or create a dedicated one?" → "Use an existing Postgres
  Application or create a new dedicated one?". This is a UI prompt
  text — must use the user-facing noun "Application", not "instance".

- docs/NAMING-CONVENTION.md §6.2 tag comment: "Application instance
  name" → "Application name within the Environment". The CRD might
  internally still use the noun Instance for class-vs-instance
  semantics, but in tag annotations and user-visible context the
  Application IS the instance.

Other "instance" occurrences confirmed legitimate (Postgres instance
as Crossplane resource type, Flux instance as software deployment,
EC2/Hetzner instance as cloud-provider terminology) and retained.

Final cross-reference check: all Markdown links across all canonical
docs resolve. No residual banned terms.

Refs #37
2026-04-27 21:26:27 +02:00
hatiyildiz
79c59a27a2 docs(pass-5): reconcile Phase-0 install order, IMPLEMENTATION-STATUS section numbering
Pass-5A — fresh-eyes deep read found two structural drifts.

ARCHITECTURE §10 Phase-0 install order:
- Old order: cert-manager → Cilium → Flux → ... → Catalyst control plane.
- SOVEREIGN-PROVISIONING §3 has the correct order: Cilium first
  (CNI must be in place before pods can network), THEN cert-manager.
- ARCHITECTURE updated to match: Cilium → cert-manager → Flux →
  Crossplane → Sealed Secrets → SPIRE → JetStream → OpenBao →
  Keycloak → Gitea → Catalyst control plane (11 items, matching
  the SOVEREIGN-PROVISIONING list which had Keycloak and Gitea
  spelled out separately).
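
The ordering rule can be checked mechanically. This is a sketch under stated assumptions: the lowercase identifiers are illustrative spellings of the 11 items above, and the only constraint encoded is the one this commit states (Cilium before cert-manager).

```python
PHASE0_ORDER = [
    "cilium", "cert-manager", "flux", "crossplane", "sealed-secrets",
    "spire", "jetstream", "openbao", "keycloak", "gitea",
    "catalyst-control-plane",
]

# (earlier, later) pairs; only the CNI-first rule is taken from the docs.
CONSTRAINTS = [("cilium", "cert-manager")]


def order_violations(order, constraints=CONSTRAINTS):
    """Return every (earlier, later) pair that the given order breaks."""
    position = {name: i for i, name in enumerate(order)}
    return [
        (earlier, later)
        for earlier, later in constraints
        if position[earlier] > position[later]
    ]
```

Running `order_violations` on the old ARCHITECTURE order (cert-manager listed first) reports the Cilium pair, while the corrected order reports nothing.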

IMPLEMENTATION-STATUS section numbering:
- Old: §1 → §2 → §2bis → §3 → §4 → §5 → §6 → §7 → §8.
  The "§2bis" was a workaround for inserting per-host-cluster
  infrastructure without renumbering, and it read awkwardly.
- New: §1 → §2 → §3 → §4 → §5 → §6 → §7 → §8 → §9. Clean numbering.

Refs #37
2026-04-27 21:25:07 +02:00
hatiyildiz
d1a2ed73a3 docs(pass-4): align ARCHITECTURE phase numbering with SOVEREIGN-PROVISIONING
ARCHITECTURE §10 listed 3 provisioning phases (Phase 0 / 1 / 2) and
labeled Phase 2 as "Self-sufficient". SOVEREIGN-PROVISIONING.md uses
4 phases (Phase 0 Bootstrap / Phase 1 Hand-off / Phase 2 Day-1 setup
/ Phase 3 Steady-state). The same phase number meant different things
in the two docs.

Aligned ARCHITECTURE to the 4-phase numbering. SOVEREIGN-PROVISIONING
is now explicitly the canonical reference for phase semantics.

Refs #37
2026-04-27 21:22:07 +02:00
hatiyildiz
f4e99bb882 docs(pass-3): normalize muscatpharmacy Org-slug example consistency
PERSONAS-AND-JOURNEYS and SECURITY were using two competing slugs
for the same example Organization:
- "muscat-pharmacy" (with hyphen) — used as Org name + Environment
  name in the Ahmed journey narrative.
- "muscatpharmacy" (no hyphen) — used as the vcluster name in the
  same paragraph, and used everywhere else (NAMING-CONVENTION
  examples, ARCHITECTURE topology diagram, SECURITY SPIFFE ID).

NAMING §2.5 allows both spellings (Org slug regex permits hyphens).
But within a single example the spelling must be stable, otherwise
readers see a contradiction between Org and vcluster names.

Normalized to single-token "muscatpharmacy" throughout (matches the
predominant usage and produces simpler URLs / paths).

Result: all docs now show the same example Org consistently —
muscatpharmacy as Org, muscatpharmacy as vcluster, muscatpharmacy-prod
as Environment, gitea.omantel.openova.io/muscatpharmacy/muscatpharmacy-prod
as Environment Gitea repo.
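
To illustrate why a single slug keeps the example stable, here is a small sketch that derives every name from one Org slug. The function and its parameters are hypothetical; the derived shapes follow the result listed above.

```python
def example_names(org: str, env_type: str, gitea_host: str) -> dict:
    """Derive the example's names from one Org slug so they cannot diverge."""
    environment = f"{org}-{env_type}"
    return {
        "org": org,
        "vcluster": org,  # vcluster is named after the Org
        "environment": environment,
        "gitea_repo": f"{gitea_host}/{org}/{environment}",
    }


names = example_names("muscatpharmacy", "prod", "gitea.omantel.openova.io")
```

Because the vcluster, Environment, and repo path are all computed from the same `org` value, a hyphenated and an unhyphenated spelling can never appear side by side.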

Refs #37
2026-04-27 21:20:52 +02:00
hatiyildiz
b810002b16 docs(pass-3): align IMPLEMENTATION-STATUS with PLATFORM-TECH-STACK §2/§3 split
After the PLATFORM-TECH-STACK reorganization (§2 = Catalyst control
plane, §3 = per-host-cluster infrastructure), IMPLEMENTATION-STATUS
§2 was still mixing the two — listing cilium, k8gb, kyverno, falco,
etc. under "Catalyst control plane components" alongside console,
projector, etc.

Split into:
- §2 (renumbered subsections 2.1, 2.2): Catalyst control plane only —
  the per-Sovereign components that make a cluster a Sovereign.
- §2bis: Per-host-cluster infrastructure — the substrate every host
  cluster needs (Cilium, Flux, Crossplane, cert-manager, ESO, Kyverno,
  Trivy, Falco, Sigstore, Syft+Grype, VPA, KEDA, Reloader, MinIO,
  Velero, Harbor, failover-controller).

Status flags retained per component (📐 design / 🚧 README only /
✅ implemented / ⏸ deferred). All per-host-cluster components currently
🚧 (READMEs exist; none yet packaged as deployable Blueprints).

This brings IMPLEMENTATION-STATUS into 1:1 correspondence with the
PLATFORM-TECH-STACK §2 / §3 / §4 categorization that other docs
reference.

Refs #37
2026-04-27 21:19:57 +02:00
hatiyildiz
d6a51b8a7a docs(pass-2): final entity-noun sweep — external-secrets sequence diagram
Pass 2 — fresh-eyes sweep across the entire docs tree. One residual
entity-noun usage found:

- platform/external-secrets/README.md:75 (in a Mermaid sequence
  diagram): "Note over Wizard: Operator saves unseal keys offline"
  — "Operator" used as person/entity. Renamed to "sovereign-admin"
  to match the role from GLOSSARY.md.

All other banned-term sweeps clean:
- No tenant (architectural) anywhere.
- No Catalyst IDP anywhere.
- No Synapse-as-product anywhere (only the legitimate
  "Matrix/Synapse server" usages).
- No workspace-controller (only the banned-term entries that define
  the rename).
- No capital-W Workspace as Catalyst scope.
- No github.com/openova (without -io).
- All cross-doc Markdown links resolve.
- All §X references resolve to the new section numbering after
  PLATFORM-TECH-STACK reorg.
- API group catalyst.openova.io/v1alpha1 consistent across 6 references.
- OCI artifact prefix `bp-` consistent across README, CLAUDE,
  BLUEPRINT-AUTHORING, IMPLEMENTATION-STATUS.

Other "Operator" mentions intentionally retained (legitimate
technical usage):
- "External Secrets Operator (ESO)", "Trivy Operator" — K8s
  Operator pattern (controllers), explicitly allowed by GLOSSARY.
- "Operator compatibility" in BUSINESS-STRATEGY's OpenShift migration
  table — refers to compatibility with K8s Operators (the technology),
  not as an entity/role.
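
A sweep like this can be approximated with a few regular expressions. The sketch below is an assumption about how such a check might look, not the grep commands recorded in VALIDATION-LOG; the pattern list is a small illustrative subset of the banned terms.

```python
import re

# (pattern, label) pairs; an illustrative subset of the banned-term table.
BANNED = [
    (re.compile(r"\b[Oo]perator[ -](team|role|account|user|admin)\b"),
     "operator-as-entity"),
    (re.compile(r"github\.com/openova(?!-io)"), "wrong GitHub org"),
    (re.compile(r"\bCatalyst IDP\b"), "Catalyst IDP"),
    (re.compile(r"\bworkspace-controller\b", re.IGNORECASE),
     "workspace-controller"),
]


def scan(text: str):
    """Return (line_number, label) for every banned-term hit."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for pattern, label in BANNED:
            if pattern.search(line):
                hits.append((lineno, label))
    return hits


sample = "\n".join([
    "External Secrets Operator (ESO) syncs secrets.",
    "Ask the Operator team.",
    "See github.com/openova/core",
    "See github.com/openova-io/core",
    "Workspace-controller stuck",
])
```

The first sample line is not flagged because the operator pattern only fires when "Operator" is followed by an entity noun, which keeps the K8s Operator pattern usages legitimate.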

Refs #37
2026-04-27 21:18:55 +02:00
hatiyildiz
15905cee6f docs(iter-9-12): repo structure clarity, PLATFORM-TECH-STACK reorg, SRE alignment
README + CLAUDE.md (iter 9):
- README's "Build a Blueprint" section was contradicting itself: said
  "A Blueprint is a Git repo" while elsewhere we'd locked in the
  monorepo decision. Rewritten: Blueprint = a folder under
  platform/<name>/ or products/<name>/ in this monorepo. CI publishes
  per-folder OCI artifacts.
- CLAUDE.md "Repo structure": replaced the brief tree with a more
  honest one that distinguishes target structure from current
  placeholders (core/apps/ is target console+projector+...; current
  has only legacy bootstrap/ and manager/ .gitkeep dirs). Annotated
  each products/<name>/ folder with current state (axon = real code;
  others = README only; catalyst = bootstrap/ui scaffold).
- CLAUDE.md banned-terms entry "Workspace": now covers component
  names too (was only Catalyst scope), matching GLOSSARY's expanded
  banned-term entry.

PLATFORM-TECH-STACK (iter 10) — substantive reorganization:

The §1 categorization established three buckets:
  (a) Catalyst control plane (per-Sovereign on mgt)
  (b) Per-host-cluster infrastructure (every host cluster)
  (c) Application Blueprints (a la carte)

But §2 "Catalyst control plane components" was mixing buckets (a)
and (b): it listed flux, crossplane, cert-manager, kyverno, harbor,
external-secrets, reloader, vpa, keda, k8gb, coraza, falco, trivy,
sigstore, syft-grype, minio, velero, failover-controller all under
"Catalyst control plane" — but those are per-host-cluster
infrastructure per §1, and §1 itself said Crossplane "Never
user-facing" / per-host-cluster.

Reorganized §2 + §3:
- §2 now contains ONLY the Catalyst control plane:
    2.1 User-facing surfaces (console, marketplace, admin)
    2.2 Catalyst backend services (projector, catalog-svc, provisioning,
        environment-controller, blueprint-controller, billing)
    2.3 Per-Sovereign supporting services (keycloak, openbao, spire-
        server, nats-jetstream, gitea, observability)
- New §3 Per-host-cluster infrastructure with subsections for
  networking, GitOps+IaC, security+policy, scaling+ops, storage+
  registry, resilience.
- Application Blueprints renumbered §3 → §4. Added missing
  opensearch row to §4.1 (was previously misplaced in observability).
- Composite Blueprints (Products) §4 → §5.
- Multi-Region §5 → §6. Resource estimates §6 → §7. Cluster
  deployment §7 → §8. User choice §8 → §9. SIEM §9 → §10. License §10 → §11.

Cross-doc references to PLATFORM-TECH-STACK §1 / §2 (in NAMING,
ARCHITECTURE, IMPLEMENTATION-STATUS) all still resolve correctly
under the new numbering.

SRE (iter 11):
- §2.4 split-brain table: "MongoDB" → "FerretDB" (MongoDB was
  retired in favor of FerretDB-on-CNPG per project-memory).
- §2.5 data replication: clarified each row's layer (Application
  Blueprint vs per-host-cluster vs Catalyst control plane) instead
  of misclassifying MinIO/Harbor as Application Blueprints. Added
  OpenSearch row.
- §3.1 Flagger and §3.2 Flipt: explicitly marked "Status: design,
  not yet a deployed Blueprint" since they're "components to watch"
  in TECHNOLOGY-FORECAST, not in the current PLATFORM-TECH-STACK §3
  inventory.

BUSINESS-STRATEGY + TECHNOLOGY-FORECAST (iter 12):
- Final scan: clean. No tenant/operator-team/Catalyst-IDP/Lifecycle
  Manager/Synapse(product) violations remaining.

Refs #37
2026-04-27 21:17:15 +02:00
hatiyildiz
8d351d7001 docs(iter-6-8): security/provisioning/blueprint corrections + OCI artifact naming
SECURITY (iter 6):
- "Environment repo" → "Environment Gitea repo" in §3 secrets diagram.
- "ChangePolicy enforces approvals" → "EnvironmentPolicy enforces
  approvals" in §9 SOC2 row (ChangePolicy was a fictional CRD —
  EnvironmentPolicy is the real one defined in ARCHITECTURE §8).
- "Catalyst's compliance-controller surfaces evidence" → "evidence
  surfaced via Catalyst console audit views and SIEM exports"
  (compliance-controller wasn't defined elsewhere; this avoids
  inventing new components in compliance prose).

SOVEREIGN-PROVISIONING (iter 7):
- "vault-stored" → "stored in OpenBao on the provisioner"
  (Vault was replaced by OpenBao; "vault-stored" was generic English
  but read as a contradiction).

BLUEPRINT-AUTHORING (iter 8):
- OCI artifact naming locked: `ghcr.io/openova-io/bp-<name>:<semver>`
  where `<name>` is the folder name. The `bp-` prefix lives in the
  OCI artifact name (self-identifying), not the folder name.
  Fixed in §1, §10, §11, §13 — and propagated to README.md so the
  pattern is consistent across the repo.
- Crossplane Composition example: `compositeTypeRef.apiVersion`
  changed from `bp-wordpress.openova.io/v1alpha1` (per-Blueprint
  group, ugly) to `compose.openova.io/v1alpha1` (shared XRD group
  across all Blueprints).
- §11 CI pipeline final step: "publish blueprint.yaml as the
  manifest" → "as the OCI manifest's metadata layer" (clearer about
  what it does in the OCI sense).
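
The naming rule can be sketched as a small helper. The function name is hypothetical and the version check is a simplified semver pattern (no build metadata), used here only to illustrate where the `bp-` prefix is applied.

```python
import re

# Simplified semver: MAJOR.MINOR.PATCH with an optional pre-release suffix.
SEMVER = re.compile(r"^\d+\.\d+\.\d+(-[0-9A-Za-z.-]+)?$")


def artifact_ref(folder_name: str, version: str,
                 registry: str = "ghcr.io/openova-io") -> str:
    """Build the OCI reference for a Blueprint folder.

    The folder keeps its plain name; `bp-` is added only in the artifact name.
    """
    if not SEMVER.match(version):
        raise ValueError(f"not a semver version: {version!r}")
    return f"{registry}/bp-{folder_name}:{version}"
```

For example, a folder named `wordpress` at version `1.2.3` would map to `ghcr.io/openova-io/bp-wordpress:1.2.3`.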

Refs #37
2026-04-27 21:12:14 +02:00
hatiyildiz
80b91709e1 docs(iter-3-5): purge operator-as-entity, fix Workspace-controller capital, JetStream KV references
ARCHITECTURE (iter 3):
- Removed catalystctl from the §4 write-side diagram (it's read-only;
  presenting it as a write input contradicted §7.4).
- "Both tabs read the same Valkey snapshot" → "JetStream KV snapshot"
  in §5 (Valkey is no longer in the control plane).
- §7.4: catalystctl reframed as "may exist as small read-only debug
  CLI" rather than implying it ships today.
- §11 dependency list: added bp-catalyst-provisioning; removed
  bp-catalyst-crossplane (Crossplane is per-host-cluster infra, not a
  Catalyst control-plane component); added clarifying note.
- §12 CRD list: added SecretPolicy + Runbook (were already in
  IMPLEMENTATION-STATUS but missing from the principles table).
- §2 SME-style description: "SaaS Operator team (Omantel staff)" →
  "SaaS provider's cloud team" (Operator banned as entity).

NAMING-CONVENTION (iter 4):
- §5.1 heading "operator domain" → "Sovereign domain".
- §7 multi-region diagram: replaced piecemeal Catalyst component list
  with a deferral to PLATFORM-TECH-STACK §2; added SPIRE server;
  fixed "per-Org workspaces" → "per-Environment Gitea repos"; added
  per-host-cluster infrastructure callout.

SECURITY (iter 6 — partial; fold into this commit):
- "operator-approved" → "sovereign-admin-approved" for DR promotion.
- Realm name "catalyst-operator" → "catalyst-admin" (entity-noun
  scrubbed from the realm naming itself).

SOVEREIGN-PROVISIONING (iter 7 — partial):
- "single operator's laptop" → "single person's laptop" (avoid
  "operator" as entity).
- "the next operator" → "the next Sovereign provisioning request,
  regardless of who initiates it".
- "catalyst-operator realm" → "catalyst-admin realm" (×2).
- Capital-W "Workspace-controller" residuals (3) → "Environment-
  controller" (replace_all is case-sensitive; previous iter caught
  lowercase only).
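
The case-sensitivity gap noted above can be avoided with a case-insensitive rename that preserves the leading capital. This is an illustrative sketch, not the tooling used for these commits.

```python
import re


def rename_controller(text: str) -> str:
    """Rename workspace-controller in any capitalization, keeping the case
    of the first letter."""
    def repl(match: re.Match) -> str:
        word, suffix = match.group(1), match.group(2)
        replacement = "Environment" if word[0].isupper() else "environment"
        return replacement + suffix

    return re.sub(r"\b(workspace)(-controller)\b", repl, text,
                  flags=re.IGNORECASE)
```

A single pass then covers both "Workspace-controller" and "workspace-controller" without a second sweep.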

PERSONAS (iter 5):
- P3 "within a Sovereign Operator team" → "within a Sovereign's
  operations team".
- Two capital-W "Workspace-controller" residuals fixed.

SRE (iter 11 — partial):
- §13.2 "Workspace-controller stuck" runbook entry →
  "Environment-controller stuck".

Banned-term sweep result post-fix: no `Operator team|role|account|
user|admin` anywhere; no capital-W Workspace as Catalyst scope;
no Valkey-as-control-plane refs.

Refs #37
2026-04-27 21:09:31 +02:00
hatiyildiz
27325edb32 docs(iter-2): glossary alignment — rename workspace-controller, fix definitions
GLOSSARY.md line-by-line audit. Nine corrections.

1. workspace-controller → environment-controller everywhere. The
   controller reconciles the Environment CRD; "workspace" is banned as
   a Catalyst scope, so it cannot be in a component name either. Fixed
   in: GLOSSARY, ARCHITECTURE, PLATFORM-TECH-STACK, NAMING-CONVENTION,
   SOVEREIGN-PROVISIONING, IMPLEMENTATION-STATUS, core/README,
   BUSINESS-STRATEGY. Banned-term entry in GLOSSARY now explicitly
   covers component names too.

2. "workspace repos" (per-Environment Gitea repos) → "Environment
   Gitea repos" in GLOSSARY, PLATFORM-TECH-STACK.

3. JWT claim {workspace, org, role} → {environment, org, role} in
   ARCHITECTURE projector diagram.

4. OpenOva definition refined: was "Never used to name a product",
   which contradicted "OpenOva Catalyst", "OpenOva Cortex". Now: brand
   prefix in product names; bare "OpenOva" = the company; bare
   "Catalyst" = the platform.

5. Catalyst definition completed: was missing provisioning, billing,
   gitea, observability — now lists all 14 control-plane components,
   pointing at the table below.

6. Catalyst components table: added `provisioning` (validates
   configSchema, commits to Environment Gitea); reordered to match
   ARCHITECTURE §3 grouping; clarified each component's source-of-truth
   (catalog-svc reads monorepo + Gitea, blueprint-controller watches
   monorepo + Gitea, etc.).

7. Environment definition: refers to NAMING §2.4 for env_type values;
   removed inline list that didn't match canonical ordering. Added
   concrete examples (acme-prod, acme-dev, bankdhofar-uat).

8. Application example: dropped "RocketChat" which appeared nowhere
   else; replaced with generic "running deployment" plus the
   established WordPress / Postgres examples.

9. sovereign-admin description: was "runs Crossplane" — Crossplane is
   platform plumbing not user-facing. Now: "manages the underlying
   clusters via Crossplane (which is platform plumbing, not a
   user-facing surface)".

Banned-term coverage:
- "Workspace" entry now covers BOTH the Catalyst scope AND component
  naming (workspace-controller → environment-controller).

Refs #37
2026-04-27 21:06:09 +02:00
hatiyildiz
2c4902b409 docs(iter-1): add IMPLEMENTATION-STATUS, fix wrong-org refs, reconcile monorepo
First validation iteration. Three concrete corrections.

1. Add docs/IMPLEMENTATION-STATUS.md as the bridge between target
   architecture and current code state. Status legend (✅ / 🚧 / 📐 / ⏸)
   applied per-component. Catalyst control plane = mostly 📐. Component
   READMEs = 🚧 (README only, no Blueprint manifests yet). products/axon
   = ✅ (only product with real code). core/ = 📐 (just .gitkeep).

2. Status banner added to ARCHITECTURE, SECURITY, SOVEREIGN-PROVISIONING,
   BLUEPRINT-AUTHORING, PERSONAS-AND-JOURNEYS, PLATFORM-TECH-STACK, SRE
   pointing readers at IMPLEMENTATION-STATUS.md before they treat any
   described feature as built. GLOSSARY also references it.

3. Architectural decision (Option A — monorepo canonical):
   - Each platform/<name>/ and products/<name>/ folder is the source of
     ONE Blueprint, published as ghcr.io/openova-io/<name>:<semver> by
     CI fan-out from the monorepo root.
   - BLUEPRINT-AUTHORING.md §1, §2, §13 rewritten to match.
   - README.md "what's in this repo" rewritten to clarify monorepo +
     OCI-fan-out shape; no longer claims every directory is a Blueprint
     in a way that contradicts BLUEPRINT-AUTHORING.

Wrong-org fixes (3 github.com links + 3 ghcr.io refs):
   - docs/PERSONAS-AND-JOURNEYS.md:13   github.com/openova → openova-io
   - docs/BLUEPRINT-AUTHORING.md:13     github.com/openova → openova-io
   - docs/BLUEPRINT-AUTHORING.md:404    github.com/openova → openova-io
   - docs/BLUEPRINT-AUTHORING.md ghcr.io/openova/* (3 refs) → openova-io

API group consistency:
   - All references unified to catalyst.openova.io/v1alpha1
     (was mixed v1 / v1alpha1; v1alpha1 is correct since the CRDs are
     design-stage with no implementation).

core/README.md updated to honestly describe the directory tree as
"target structure with .gitkeep placeholders" rather than implying
the apps/console, apps/projector, etc. binaries already exist.
The legacy apps/bootstrap and apps/manager directories are
acknowledged as transitional placeholders that will be removed when
the new apps/ layout is scaffolded.

CLAUDE.md and .claude/project-memory.md updated to put
IMPLEMENTATION-STATUS.md second in the read-first ordering.

Refs #37
2026-04-27 20:43:31 +02:00
hatiyildiz
119a1e53a0 docs(components): terminology pass across platform and product READMEs
Bring per-component READMEs in line with the canonical glossary
(docs/GLOSSARY.md). Substantive architectural content unchanged —
this is a terminology + reference correctness pass.

Placeholder rename: <tenant> → <org> in YAML / IaC examples across
- platform/cnpg/README.md           (Cluster + Pooler + ScheduledBackup)
- platform/debezium/README.md       (PostgreSQL connector + topic patterns)
- platform/external-secrets/README.md (ExternalSecret / SecretStore)
- platform/grafana/README.md        (Instrumentation namespace)
- platform/k8gb/README.md           (Gslb + namespace + kubectl examples)
- platform/keda/README.md           (ScaledObject + Kafka triggers + Prometheus)
- platform/opentofu/README.md       (server resource example)
- platform/velero/README.md         (BackupStorageLocation buckets)
- platform/vpa/README.md            (VerticalPodAutoscaler examples)
- platform/flux/README.md           (kustomization name + tenants/ → organizations/)

"Catalyst IDP" → "Catalyst console":
- platform/crossplane/README.md     (integration section retitled and
                                      rewritten — Crossplane is platform
                                      plumbing, not user-facing)
- platform/gitea/README.md          (architecture diagram + integration table)
- platform/kyverno/README.md        (rollout tracking surface)
- products/fingate/README.md        (TPP onboarding portal)

"Bootstrap wizard" → "Catalyst bootstrap":
- platform/openbao/README.md        (bootstrap procedure rewritten —
                                      independent Raft per region clarified;
                                      cross-references docs/SECURITY.md §5)
- platform/opentofu/README.md       (Quick Start)

Kyverno labels & prose:
- openova.io/tenant → openova.io/organization (label rename for
  consistency; deployed clusters will add new label as a co-label
  during migration window)
- "tenant labels" / "tenant namespace" prose updated to
  "Organization labels" / "Organization-labeled namespace"
- Priority class names (tenant-high, tenant-default, tenant-batch)
  retained as deployed artifact names — rename pending in a
  separate migration ticket
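
The co-label migration window can be pictured as follows. This is a sketch assuming labels are handled as a plain dict; the helper is hypothetical and does not describe an actual Kyverno policy.

```python
OLD_LABEL = "openova.io/tenant"
NEW_LABEL = "openova.io/organization"


def with_co_label(labels: dict) -> dict:
    """Add the new label next to the old one without removing the old one."""
    updated = dict(labels)
    if OLD_LABEL in updated and NEW_LABEL not in updated:
        updated[NEW_LABEL] = updated[OLD_LABEL]
    return updated
```

Keeping both labels during the window means selectors written against either key continue to match until the old label is retired.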

No banned-term hits remain in component READMEs (verified by grep
in docs/GLOSSARY.md banned-terms table).

Refs #37
2026-04-27 20:06:51 +02:00
hatiyildiz
b857f46706 docs(strategy,forecast): terminology pass — Catalyst as platform, console not IDP
Targeted updates to BUSINESS-STRATEGY.md §5.1 and §9.2 plus
TECHNOLOGY-FORECAST §removed-components.

- BUSINESS-STRATEGY.md §5.1: OpenOva Catalyst row repositioned. It is
  the platform itself (the self-sufficient Kubernetes-native control
  plane that turns any cluster into a Sovereign), not a sub-product
  bundling bootstrap+IDP+lifecycle manager. Other OpenOva products
  (Cortex, Fingate, Fabric, Relay, Specter, Axon) run ON Catalyst as
  composite Blueprints.

- BUSINESS-STRATEGY.md §9.2: capability matrix "Developer portal" cell
  updated from "Catalyst IDP" to "Catalyst console" — IDP function is
  one of the console's responsibilities, not a separate product.

- TECHNOLOGY-FORECAST.md §removed-components: Backstage row updated to
  describe replacement as "Catalyst console (the platform's own
  developer-facing UI)" rather than the now-retired "Catalyst IDP"
  sub-product.

Strategy narrative, market segmentation, pricing model, and migration
playbook are unchanged — they stand on their own.

Refs #37
2026-04-27 20:06:31 +02:00
hatiyildiz
4b3a6884f5 docs(stack,sre): align tech stack and SRE handbook with Catalyst control plane
Two related rewrites that put the control plane / application Blueprint
distinction front and center.

PLATFORM-TECH-STACK.md
  - §1: explicit three-way component categorization — Catalyst control
    plane (one per Sovereign), per-host-cluster infrastructure (every
    cluster), Application Blueprints (inside per-Org vclusters).
  - §2: Catalyst control plane components listed by responsibility —
    user-facing surfaces, backend services, identity, secrets, event
    spine, GitOps, networking, security, scaling, storage,
    observability, resilience.
  - §3: Application Blueprints (the a-la-carte catalog). Valkey and
    Strimzi are explicitly called out as Application Blueprints,
    NOT control-plane components (control plane uses NATS JetStream).
  - §4: composite Blueprints (Cortex, Axon, Fingate, Fabric, Relay)
    repositioned as Applications running ON Catalyst, not as parallel
    products.
  - §5: multi-region diagram showing independent OpenBao Raft per
    region, NATS leaf nodes, Crossplane on mgt.
  - §6: resource estimates updated for control plane (~12 GB +
    per-Org Keycloak in SME tier).
  - §10: license posture table — every control-plane component carries
    a redistribution-safe license (no BSL).

SRE.md
  - §2: multi-region principles updated; explicit "no stretched
    clusters" applies to OpenBao, JetStream, etcd, every quorum-
    based component.
  - §2.5: data replication patterns now scoped to Application
    Blueprints (the things a customer installs), separate from
    control-plane patterns documented in SECURITY.md and
    ARCHITECTURE.md.
  - §4: alert-to-action mapping segmented by Catalyst control plane
    vs per-product (Cortex, Fingate); new alerts: OpenBaoSealed,
    JetstreamLagHigh.
  - §7-§13: terminology aligned to Catalyst (console instead of IDP);
    runbooks now Runbook CRD-backed; incident severities updated.
  - §13.2-13.3: Catalyst-specific incidents (workspace-controller,
    OpenBao seal, projector lag) plus AI Hub incidents under
    bp-cortex installation.

Refs #37
2026-04-27 20:06:20 +02:00
hatiyildiz
039a724f31 docs: rewrite repository foundation around Catalyst as the platform
Repositions the public repo's identity. OpenOva is the company; Catalyst
is the platform. Sovereign is a deployed Catalyst. The historical
positioning (OpenOva = platform, Catalyst = bootstrap+IDP+lifecycle
sub-product) is retired. Catalyst now subsumes bootstrap, lifecycle, and
IDP responsibilities into one control plane.

- README.md             Catalyst-first front door. Sovereign concept,
                        repo structure, stack at a glance, cloud
                        provider matrix, getting-started paths
                        (managed via marketplace.openova.io vs
                        self-host via catalyst-provisioner).

- CLAUDE.md             Codebase guide for Claude. Banned-term table,
                        commit conventions (hatiyildiz default for
                        public repo), the no-fourth-surface rule,
                        per-component README rule of thumb.

- .claude/project-memory.md   Reduced to an index + decision log;
                        full architecture moved to docs/. Stack
                        decisions locked (NATS JetStream, OpenBao,
                        SPIFFE/SPIRE, per-Org Keycloak SME / per-
                        Sovereign corporate, Crossplane only IaC,
                        no Terraform/Pulumi user-facing surface).

- core/README.md        Catalyst control-plane Go application. Drops
                        the bootstrap-vs-manager split (both fold under
                        "Catalyst control plane"). Lists each component
                        deployable from this codebase: console,
                        marketplace, admin, projector, catalog-svc,
                        provisioning, workspace-controller, blueprint-
                        controller, billing. CRD list updated:
                        Sovereign / Organization / Environment /
                        Application / Blueprint / EnvironmentPolicy /
                        SecretPolicy / Runbook.

Refs #37
2026-04-27 20:05:58 +02:00
hatiyildiz
217c882916 docs(naming): rename {env}→{env_type}, add Organization + vcluster + Catalyst Environment layers
The naming convention pre-dates vcluster and Catalyst's user-facing
Environment object. Three additions, one rename:

- §2.4: {env} dimension renamed to {env_type} to disambiguate from the
  Catalyst Environment object (which is the user-facing scope, not a
  dimension).

- §2.5: new Organization dimension (slug, lowercase, hyphenated). Used
  for vcluster identity and any Organization-scoped resource.

- §4.7: new vcluster naming layer. Pattern is just {org} within the
  parent host cluster (Don't Repeat the Parent — Principle 1.2). Globally-
  qualified form is {prov}-{reg}-{bb}-{env_type}-{org} for cross-cluster
  references and kubeconfig contexts.

- §11: Catalyst Environment defined as the user-facing {org}-{env_type}
  scope. One Environment is realized by N vclusters across regions × bb
  filtered by Application Placement. Each Environment has its own Gitea
  repo and JetStream Account.
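
The two vcluster name forms from §4.7 can be sketched as below. The function names are hypothetical, and the provider, region, and building-block values in the usage note are placeholders rather than codes taken from NAMING-CONVENTION.

```python
def vcluster_name(org: str) -> str:
    """Name inside the parent host cluster: just the Org slug."""
    return org


def vcluster_global_name(prov: str, reg: str, bb: str,
                         env_type: str, org: str) -> str:
    """Globally-qualified form for cross-cluster references and
    kubeconfig contexts."""
    return "-".join([prov, reg, bb, env_type, org])
```

With placeholder values, `vcluster_global_name("hz", "fsn", "rtz", "prod", "acme")` yields `hz-fsn-rtz-prod-acme`, while the same vcluster is simply `acme` inside its host cluster.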

Tags updated: openova.io/environment → openova.io/env-type for
disambiguation; new openova.io/organization, openova.io/vcluster,
openova.io/environment (for Catalyst scope), openova.io/sovereign tags.

DNS pattern §5 split into two: control-plane (component.{location-code}.
{sovereign-domain}) and Application (app.{environment}.{sovereign-or-org-
domain}) — supporting white-label Sovereigns where the Application DNS
uses the customer's own domain.
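
A short sketch of the two DNS patterns, assuming each part is supplied separately. The helpers are hypothetical; the Application example domain is a placeholder.

```python
def control_plane_fqdn(component: str, location_code: str,
                       sovereign_domain: str) -> str:
    """component.{location-code}.{sovereign-domain}"""
    for label in (component, location_code):
        # Each of these must be exactly one DNS label.
        if not label or "." in label:
            raise ValueError(f"expected a single DNS label, got {label!r}")
    return f"{component}.{location_code}.{sovereign_domain}"


def application_fqdn(app: str, environment: str, domain: str) -> str:
    """app.{environment}.{sovereign-or-org-domain}"""
    return f"{app}.{environment}.{domain}"
```

Requiring a non-empty location code makes the shortened `gitea.<sovereign-domain>` form impossible to produce by accident; `control_plane_fqdn("gitea", "hfmp", "openova.io")` gives `gitea.hfmp.openova.io`.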

Refs #37
2026-04-27 20:05:42 +02:00
hatiyildiz
d51a3fba4d docs: add canonical Catalyst documentation set
Six new docs that establish the unified Catalyst model — Sovereign as
deployed instance, Organization as multi-tenancy unit, Environment as
{org}-{env_type} scope, Application as user-facing handle, Blueprint as
unified module+template successor.

- docs/GLOSSARY.md           single source of truth for terminology;
                             every other doc defers to it; banned terms
                             (tenant, operator-as-entity, module, template,
                             Backstage, etc.) listed with replacements.

- docs/ARCHITECTURE.md       overall Catalyst architecture: control plane
                             vs application Blueprints, write path
                             (Git → Flux → K8s + Crossplane), read path
                             (CQRS via NATS JetStream → projector → SSE),
                             SPIFFE/SPIRE workload identity, OpenBao
                             independent Raft per region (no stretched
                             cluster), Keycloak per-Org (SME) vs
                             per-Sovereign (corporate).

- docs/PERSONAS-AND-JOURNEYS.md   personas × journeys matrix; only
                             three first-class surfaces (UI, Git, API);
                             explicit removal of Terraform/Pulumi/CLI as
                             user-facing IaC; Application card anatomy.

- docs/SECURITY.md           identity (workload + user), OpenBao + ESO
                             credential flow, dynamic credentials with
                             auto-rotation sidecar, multi-region
                             OpenBao (independent Raft per region with
                             async perf replication — explicitly NOT
                             stretched), rotation policy CRDs, threat
                             model.

- docs/SOVEREIGN-PROVISIONING.md   Phase 0 (catalyst-provisioner +
                             OpenTofu one-shot) → Phase 1 (Crossplane
                             adopts) → Phase 2 (self-sufficient Catalyst
                             control plane); air-gap procedure;
                             Organization migration; decommission.

- docs/BLUEPRINT-AUTHORING.md   Blueprint CRD spec, configSchema,
                             placementSchema, depends, manifests,
                             overlays; Crossplane Composition authoring
                             for non-K8s; signing/publishing pipeline;
                             public vs private (Org-scoped) visibility;
                             contribution path.

Refs #37
2026-04-27 20:05:25 +02:00
e3mrah
69706a80ec feat(axon): make qwen3-coder thinking mode toggleable via request parameter
Client sends `thinking: true` to enable reasoning tokens. Default remains
disabled for instant streaming.
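
A minimal sketch of the toggle, assuming the request body is a dict and that the flag is forwarded through `chat_template_kwargs` as in the earlier fix. The function name and the stripping of the `thinking` key are assumptions, not the Axon implementation.

```python
def upstream_payload(request: dict) -> dict:
    """Translate the client-facing `thinking` flag into the upstream
    `enable_thinking` template kwarg; thinking is off by default."""
    enable = bool(request.get("thinking", False))
    # Forward everything except the client-only flag.
    payload = {k: v for k, v in request.items() if k != "thinking"}
    payload["chat_template_kwargs"] = {"enable_thinking": enable}
    return payload
```

A request without the flag therefore behaves exactly as before this change, and only an explicit `thinking: true` re-enables the reasoning phase.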

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-26 09:20:33 +02:00
e3mrah
63fc7a381f fix(axon): disable qwen3-coder thinking mode for instant streaming
Qwen3-coder generates hundreds of `reasoning` tokens before `content`
tokens, causing a perceived delay of 10+ seconds. The reasoning tokens stream
through Axon but the ChatWidget only renders `delta.content`, so users
see a long pause then a burst. Passing `enable_thinking: false` via
chat_template_kwargs skips the reasoning phase entirely.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-26 09:08:47 +02:00
e3mrah
5201bdc962 fix(axon): tighten WAF payload limits — system 4000, assistant 800, total 8000
3-turn conversations passed at ~9120 chars but 4-turn failed at ~10640.
WAF anomaly threshold is between those values. Lowered all limits to keep
multi-turn conversations well under the threshold.
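
One way to apply limits like these is sketched below. The per-role and total numbers come from this commit; the truncation and drop-oldest strategy is an assumption for illustration and may differ from what Axon actually does.

```python
ROLE_LIMITS = {"system": 4000, "assistant": 800}
TOTAL_LIMIT = 8000


def clamp_messages(messages: list) -> list:
    """Truncate per-role content, then drop the oldest non-system turns
    (never the latest message) until the total length fits."""
    clamped = []
    for msg in messages:
        limit = ROLE_LIMITS.get(msg["role"])
        content = msg["content"] if limit is None else msg["content"][:limit]
        clamped.append({"role": msg["role"], "content": content})

    def total(msgs):
        return sum(len(m["content"]) for m in msgs)

    while total(clamped) > TOTAL_LIMIT:
        droppable = [i for i, m in enumerate(clamped[:-1])
                     if m["role"] != "system"]
        if not droppable:
            break  # nothing left that is safe to drop
        del clamped[droppable[0]]
    return clamped
```

Dropping whole older turns rather than trimming the newest one keeps the latest user message intact, which matters most for the reply.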

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-26 08:52:04 +02:00