openova

Author	SHA1	Message	Date
e3mrah	59cdfe5a77	docs: ADR-0002 + ARCHITECTURE §11.1 + Inviolable #11 — post-handover sovereignty cutover (#794) (#797) Adds the documentation set for the self-sovereignty cutover seam: - NEW docs/adr/0002-post-handover-sovereignty-cutover.md following ADR-0001's shape (Status, Context, Decision, Consequences, Alternatives Considered). Documents the 8-tether map, the 30/70 provisioning split, the operator-driven trigger model, and the egress-block DoD proof. - ARCHITECTURE.md §11 now carries a §11.1 Phase 2 — Self-Sovereignty Cutover subsection with the 8-Job table, mermaid Phase-0 → Phase-1 → Handover → Phase-2 → Day-2 diagram, and links to issues #790/#791/#792/#793/#794. - INVIOLABLE-PRINCIPLES.md adds Principle #11: Sovereigns must be independent of openova-io after handover. Trigger phrase, cold-start exception, and cutover requirement spelled out. Cites #790 (umbrella), #791 (chart), #792 (api), #793 (ui), #794 (this PR). Extends, does not contradict, ADR-0001 §11 (Catalyst-on-Catalyst) and §2 (Inviolable Principles). Closes #794 Co-authored-by: Hatice Yildiz <hatice.yildiz@openova.io>	2026-05-04 21:23:29 +04:00
e3mrah	53bc4357ca	feat(provisioner): cluster-autoscaler-hcloud + wizard footprint estimate (closes #767) (#776) * feat(provisioner): cluster-autoscaler-hcloud + wizard footprint estimate (closes #767) Two-pronged fix for the FailedScheduling pattern that hit otech92 (2x cpx32 workers couldn't fit external-secrets-webhook because the bootstrap-kit ate the full 16 GB): 1. PRE-LAUNCH ESTIMATE — wizard StepReview now surfaces a "Footprint estimate" Section with: bootstrap-kit baseline (sum of mandatory-tier component footprints), selected components delta, control-plane overhead, and a "Recommended N x <SKU>" line that turns amber when the operator's chosen worker count is below the rollup. Backed by per-component RAM/CPU floors in components/wizard/steps/componentFootprints.ts (covered by 12 unit tests including the otech92 reproduction). 2. RUNTIME AUTOSCALING — new bp-cluster-autoscaler-hcloud Blueprint added at bootstrap-kit slot 40. Wraps the upstream kubernetes/autoscaler chart 9.46.6 (appVersion 1.32.0) with the Hetzner cloud-provider. Token wired from the canonical flux-system/cloud-credentials.hcloud-token Secret cloud-init writes (mirrors the velero/harbor object-storage pattern). Pinned to the control-plane node so the autoscaler never schedules onto a worker it could itself terminate. 10-minute scale-down idle as the cost-saving default. Documented in docs/ARCHITECTURE.md sec.14 (Autoscaling) — explains how VPA / HPA / KEDA / cluster-autoscaler compose, why we picked cluster-autoscaler over KEDA for cluster scaling, and the bounds + safety story. Per the issue's MVP scope, this PR ships the blueprint + StepReview estimate WITHOUT the wizard StepProvider min/max pair refactor or the tofu node-pool template restructuring. Those are tracked as a follow-up issue (scope-control rule per docs/INVIOLABLE-PRINCIPLES.md #1).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(provisioner): move cluster-autoscaler to slot 50 + register in expected-bootstrap-deps Slot 40 was already forward-declared for bp-llm-gateway in scripts/expected-bootstrap-deps.yaml — the dependency-graph-audit CI check fired on PR #776 because the file existed without a matching entry in the expected DAG, AND collided with a reserved slot. Move to slot 50 (after the W2.K4 cohort + slot 49 bp-cert-manager-powerdns-webhook) and add the matching entry to the expected-bootstrap-deps.yaml so the audit passes. `scripts/check-bootstrap-deps.sh` runs clean locally now (drift=0, cycles=0). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: hatiyildiz <269457768+hatiyildiz@users.noreply.github.com> Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-04 19:49:44 +04:00
hatiyildiz	04559e5c37	docs(reconcile-pass-1): align docs with ground truth at `dd578d1c` Reconcile Pass 1 — first holistic LLM-driven reconciliation pass per ~/.claude/skills/reconcile-catalyst-docs/SKILL.md. Skill triggered after the post-Group-M architectural batch (#161, #162, #163, #167, #168, #169, #170, #171, #173, #174, #175). Live ground truth verified against kubectl + ls platform/ + git log + GHCR + componentGroups.ts. Drift categories fixed: - A. Numerical: bp-powerdns 1.0.5 → 1.0.6; component-logos 63 → 62 (powerdns SVG missing, tracked under #173); bootstrap kit 11 → 12 with bp-powerdns added per #167. - B. Service: pool-domain-manager + 5 registrar adapters (Cloudflare/Namecheap/GoDaddy/OVH/Dynadot, #170) added to IMPLEMENTATION-STATUS, ARCHITECTURE, PLATFORM-TECH-STACK, GLOSSARY, and PROVISIONING-PLAN; bp-powerdns added to ARCHITECTURE bootstrap kit + Catalyst-on-Catalyst dependency tree. - C. Architectural: SOVEREIGN-PROVISIONING §3 + DEMO-RUNBOOK Step 4 + ORCHESTRATOR-STATE Step 6 rewritten from Dynadot-direct DNS writes to PowerDNS authoritative + PDM /v1/commit + registrar-adapter NS-flip; PROVISIONING-PLAN Phase 4 paths corrected to products/catalyst/bootstrap/api/ (per INVIOLABLE-PRINCIPLES #3 the Go provisioner does NOT call cloud APIs); Phase 6 retitled and rewritten for the new DNS architecture. - D. Process: RUNBOOK-PROVISIONING §2 wizard-step table + DEMO-RUNBOOK Step 2 wizard-step table updated to canonical 7-step ordering (Org → Domain → Topology → Provider → Credentials → Components → Review per WIZARD_STEPS in WizardLayout.tsx, post #169 + #174); the three-mode StepDomain (pool / byo-manual / byo-api per #169) and two-tab StepComponents (mandatory infra + apps per #161/#162/#175) now documented. - E. 
Cross-doc: Group G ✅ across PROVISIONING-PLAN + ORCHESTRATOR-STATE (superseded by #167+#163+#170, not by the original Dynadot-multi-domain plan); Group C ✅ in PROVISIONING-PLAN (Flux is reconciling from openova-public today); README Stack-at-a-glance DNS row expanded. - F. Stale terminology: 11-grep banned-terms scan clean — every k8gb residual is a legitimate "removed at #171, replaced by lua-records" reference. VALIDATION-LOG.md gains the Reconcile Pass 1 entry per skill spec. Reconcile-skill numbering is independent of the Audit-skill numbering (which continues at Pass 108+). Files: 13 docs + VALIDATION-LOG entry. Escalations: none.	2026-04-29 09:40:10 +02:00
hatiyildiz	f5daac52af	refactor(platform): remove k8gb — replaced by PowerDNS lua-records (#171) PowerDNS lua-records (`ifurlup`, `pickclosest`, `ifportup`) cover everything k8gb was doing — geo-aware response selection, health-checked failover, weighted round-robin — at the authoritative DNS layer. Eliminates a separate K8s controller, CRD set, and CoreDNS plugin from every Sovereign. Changes: - platform/k8gb/ deleted (Chart.yaml, values.yaml, blueprint.yaml never authored — only README existed) - products/catalyst/bootstrap/ui/public/component-logos/k8gb.svg deleted - componentGroups.ts: remove k8gb component (PowerDNS already there) - componentLogos.tsx: drop logo_k8gb + k8gb map entry - model.ts DEFAULT_COMPONENT_GROUPS spine: replace k8gb with powerdns - StepInfrastructure.tsx: copy refers to PowerDNS lua-records, not k8gb - provision.html: replace k8gb tile and edges with powerdns - catalog.generated.ts regenerated (now includes bp-powerdns) - docs sweep — every k8gb reference in PLATFORM-TECH-STACK, NAMING-CONVENTION, SOVEREIGN-PROVISIONING, SRE, ARCHITECTURE, GLOSSARY, COMPONENT-LOGOS, IMPLEMENTATION-STATUS, BUSINESS-STRATEGY, TECHNOLOGY-FORECAST, README, infra/hetzner/README, platform READMEs (cilium, external-dns, failover-controller, litmus, flux, opentofu) rewritten to point at PowerDNS lua-records / MULTI-REGION-DNS.md. Historical entries in VALIDATION-LOG.md preserved as audit trail. - New docs/MULTI-REGION-DNS.md — canonical reference for the lua-record patterns (ifurlup all/pickclosest/pickfirst, ifportup, pickwhashed), Application Placement → lua-record selector mapping, when to add a second Sovereign region, operational checks. Closes #171. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-29 08:51:09 +02:00
hatiyildiz	7cafa3c894	docs(seaweedfs+guacamole): replace MinIO with SeaweedFS as unified S3 encapsulation; add Guacamole to bp-relay Component-level architectural correction (two changes): 1. MinIO → SeaweedFS as unified S3 encapsulation layer The old design used MinIO for in-cluster S3 plus separate cold-tier configuration scattered across consumers. The new design positions SeaweedFS as the single S3 encapsulation layer: every Catalyst component talks to one endpoint (seaweedfs.storage.svc:8333). SeaweedFS internally handles hot tier (in-cluster NVMe), warm tier (in-cluster bulk), and cold tier (transparent passthrough to cloud archival storage — Cloudflare R2 / AWS S3 / Hetzner Object Storage / etc., chosen at Sovereign provisioning). One audit/lifecycle/encryption boundary instead of N. No Catalyst component talks to cloud S3 directly anymore — Velero, CNPG WAL archive, OpenSearch snapshots, Loki/Mimir/Tempo, Iceberg, Harbor blob store, Application buckets all share one S3 surface. 2. Apache Guacamole added as Application Blueprint §4.5 Communication Clientless browser-based RDP/VNC/SSH/kubectl-exec gateway. Keycloak SSO, full session recording to SeaweedFS for compliance evidence (PSD2/DORA/SOX). Composed into bp-relay. Replaces VPN+native-client distribution for auditable remote access. 
Component changes: - DELETED: platform/minio/ - CREATED: platform/seaweedfs/README.md (unified S3 + cold-tier encapsulation; bucket layout; multi-region replication via shared cold backend; migration-from-MinIO section) - CREATED: platform/guacamole/README.md (clientless remote-desktop gateway; GuacamoleConnection CRD; compliance integration via session recordings) Doc updates: PLATFORM-TECH-STACK §1+§3.5+§4.5+§5+§7.4; TECHNOLOGY-FORECAST L11+mandatory+a-la-carte counts (52 → 53); ARCHITECTURE §3 topology; SECURITY §4 DB engines; SOVEREIGN-PROVISIONING §1 inputs; SRE §2.5+§7; IMPLEMENTATION-STATUS §3; BLUEPRINT-AUTHORING stateful examples; BUSINESS-STRATEGY 13 component-count anchors + Relay product line; README.md backup row; CLAUDE.md folder count. Component README updates (S3 endpoint + dependency renames): cnpg, clickhouse, flink, gitea, iceberg, harbor, grafana, livekit, kserve, milvus, opensearch, flux, stalwart, velero (substantive rewrite of velero — now writes exclusively to SeaweedFS with cold-tier auto-routing). Products: relay, fabric. UI scaffold: products/catalyst/bootstrap/ui/src/shared/constants/components.ts — minio entry replaced with seaweedfs; velero+harbor deps updated; new guacamole entry added. VALIDATION-LOG entry "Pass 104 — MinIO → SeaweedFS swap + Guacamole add" captures the encapsulation principle and adds Lesson #22: storage tier policy belongs at the encapsulation boundary, not inside every consumer. Verification: zero remaining MinIO references in canonical docs (one intentional retention in TECHNOLOGY-FORECAST L37 explaining the swap); 53 platform/ folders matching all "53 components" anchors; bp-relay composition includes guacamole.	2026-04-28 10:23:46 +02:00
hatiyildiz	0a6179dd21	docs(unified-repo-model): collapse SME and corporate to one shape — Application = Gitea Repo Architectural correction. Replaces the previous "one Gitea repo per Environment with Apps as folders" rule with a single uniform shape that scales by configuration only: - Catalyst Application = one Gitea Repo (always, regardless of scale) - Branches develop/staging/main map to dev/stg/prod environments - 5 conventional Gitea Orgs per Sovereign: catalog (public mirror), catalog-sovereign (Sovereign-curated private Blueprints), one per Catalyst Organization (with shared-blueprints + N App repos), system (sovereign-admin scope) - EnvironmentPolicy CR lives in system/catalyst-config/policies/, same shape for SME and corporate; only field values differ Removes the SME-vs-corporate dual-shape design that violated the "Application is application" invariant. Teams primitive (proposed for corporate scale) is dropped — team boundaries emerge from CODEOWNERS at the App-repo level. RE-score thresholds and EnvironmentPolicy fields are universal defaults; only their values vary per Org's policy choice. Files updated line-by-line: GLOSSARY (Application + Environment definitions, new Gitea-Orgs section, 6 component-row updates), NAMING §11.2 (Realization 7-bullet rewrite), ARCHITECTURE (§1, §3 topology, §4 write-side ASCII, §7.1+§7.2+§7.3, §8 promotion, §9 multi-App linkage), PERSONAS-AND-JOURNEYS (§2 surfaces, §4.1 Ahmed, §4.2 Layla full rewrite), BLUEPRINT-AUTHORING §1 (catalog-sovereign source location), PLATFORM-TECH-STACK §2.2+§2.3, SECURITY §3, SOVEREIGN-PROVISIONING §5+§8+§10, IMPLEMENTATION-STATUS §5, SRE §14. VALIDATION-LOG entry "Pass 103 — UNIFIED REPO MODEL REFACTOR" captures the architectural correction and acknowledges the prior 102-pass audit anchored on the wrong shape (text-shape consistency was correct; the chosen text-shape was inadequate). Lesson #21 added: text-shape audits don't substitute for architectural review. 
Verification: zero remaining old-model assertions in canonical docs (grep clean for 'Environment Gitea repo', '/{org}/{org}-{env_type}', 'per-Environment Gitea repos', 'applications/<app>/values', etc.).	2026-04-28 10:13:02 +02:00
hatiyildiz	9af6717dcc	docs(pass-61): ARCHITECTURE §4 box alignment (Pass 29 carry-over); cnpg clean ARCHITECTURE §4 (Write side) box at L121 had alignment drift from Pass 29's expansion to canonical FQDN form. Line content reached 89 chars while box border was 74 chars — overflow. Same drift category as Pass 53's §8 acme-stg alignment fix. Fixed by replacing the in-box content with a shorter form pointing to NAMING §11.2 for the FQDN (already canonical there + 4 other places): - Old: │ Gitea: gitea.<location-code>.<sovereign-domain>/{org}/{org}-{env_type} │ (89 chars) - New: │ Environment Gitea repo: {org}/{org}-{env_type} │ │ (FQDN form per NAMING §11.2) │ Also normalized whitespace padding across L122-L130 (uniform 76 chars). ARCHITECTURE §1-§14 third-cycle deep re-scan with all current methodology lenses confirmed otherwise clean. §5 <env> shorthand explicitly defined, §9 catalyst.openova.io/v1alpha1 canonical, §10/§11/§12 all consistent with downstream canonical references. platform/cnpg/README.md: clean. Banner correct (§4.1 Data services). namespace: databases ✓, minio.storage.svc ✓, postgres.<env>.<sovereign-domain> ✓ (Pass 35 fix held). Cross-region DR example uses canonical Application DNS — no Pass-60-style fully-qualified-hostname drift. Methodology lesson #19: Pass-N expansion of placeholder-to-canonical-form inside ASCII tables/diagrams must verify box alignment afterward. Pass 29 expansion broke alignment at §4 (this pass) and §8 (Pass 53).	2026-04-28 01:35:35 +02:00
hatiyildiz	bb15e03884	docs(pass-53): ARCHITECTURE §8 column alignment (Pass 39 carry-over); langfuse clean ARCHITECTURE §8 (Promotion across Environments) L287 had column-alignment drift from Pass 39's `replace_all acme-staging → acme-stg`. The 12-char acme-staging filled the column padding; the 8-char acme-stg shifted "1.3.0" left of the adjacent "1.4.0"/"1.2.0" values. PERSONAS-AND-JOURNEYS L230 had the same Pass 39 fix but I'd done that as an explicit Edit with proper padding; ARCHITECTURE used replace_all which produced misaligned 7-space gap. Fixed: acme-stg padded to acme-stg + 11 spaces (was 7) so all four rows in the §8 mockup table align at the version column. Methodology lesson #17: replace_all on shorter strings inside ASCII code-block tables silently breaks column alignment. Greps can't detect whitespace-alignment drift; manual column-check after replace_all is needed. ARCHITECTURE.md §1-§14 deep re-scan with all current lessons: - §3 Topology: 15-component Catalyst control plane matches PTS §2 union (post-Pass 40). Per-host-cluster list omits OpenTofu (bootstrap-only/not-runtime) defensibly. - §5 explicitly defines <env> as {org}-{env_type} — anchors the ws.<env>.> shorthand Pass 30 noted. - §10 11-component bootstrap kit matches SOVEREIGN-PROVISIONING §3. - §11 bp-catalyst-* list matches IMPLEMENTATION-STATUS §2. - §12 Independent-failure-domains cites OpenBao per-region Raft ✓. platform/langfuse/README.md: clean. Banner correct (§4.7 AI Observability). Distinguishes per-host-cluster Grafana stack from Application-level LangFuse correctly. Drift found. Consecutive-clean count remains 0 but drift surface shifting toward cosmetic territory (column alignment, freshness) rather than architectural.	2026-04-28 00:44:24 +02:00
hatiyildiz	9ae1531878	docs(pass-39): non-canonical *-staging env_type drift; clickhouse clean NAMING §2.4 establishes the 3-char env_type form (prod|stg|uat|dev|poc) but multiple Environment-name examples used the long form `staging`. ARCHITECTURE.md §8 (Promotion across Environments): 3 instances of acme-staging (Blueprint detail mockup L287, prose L295, EnvironmentPolicy sourceEnvironment L310) renamed to acme-stg. PERSONAS-AND-JOURNEYS.md: 3 instances renamed — - digital-channels-staging → digital-channels-stg (Layla narrative L126, L135) - acme-staging → acme-stg (Blueprint detail mockup L230) Pass 33 fixed Layla's DNS but left the env_type spelling. Preserved: payment-rail-staging (Application name, free-form per NAMING) and minimum-replicas-production (Kyverno policy identifier). ARCHITECTURE.md deep re-scan with Pass 23 lesson (focus on later sections): §5-§13 substantively clean. §5 explicitly defines <env> as {org}-{env_type} which retroactively grounds the ws.<env>.> shorthand Pass 30 noted as "documented shorthand". platform/clickhouse/README.md: clean. minioadmin literal placeholder flagged for future security-hardening pass but not Catalyst drift.	2026-04-27 23:07:11 +02:00
hatiyildiz	4793cab8b6	docs(pass-29): DNS-placeholder sweep across canonical docs The recurring drift: Catalyst control-plane DNS placeholders that omit the <location-code> segment, producing forms like gitea.<sovereign>, gitea.<sovereign>.<domain>, gitea.<sovereign-domain>, keycloak.<domain>. Per NAMING §5.1 the canonical form is {component}.{location-code}.{sovereign-domain} (e.g. gitea.hfmp.openova.io). The shorter forms aren't just abbreviations — they collapse the multi-region location dimension and re-drift every time a reader reads them as obvious shorthand. Fixes: - CLAUDE.md "Customer Sync" — both gitea.<sovereign>/catalog/... lines. - docs/SOVEREIGN-PROVISIONING.md §3 DNS-records bullet (3 lines) + §5 Day-1 login line. - docs/ARCHITECTURE.md §4 write-path Gitea label. - docs/BLUEPRINT-AUTHORING.md §6.4 private-Blueprint Studio target. - platform/librechat/README.md Keycloak issuer (Pass 22 marked clean and missed this — banner scans miss YAML-block drift). platform/nemo-guardrails/README.md verified clean. Final grep confirms only canonical forms remain. Validation log Pass 29 entry added with the recurring-drift-pattern note for future passes.	2026-04-27 22:30:41 +02:00
hatiyildiz	eff264b077	docs(pass-17): ARCHITECTURE OAM table pipe-fix + Harbor README de-drift Pass 17 — drift-detection sweep on ARCHITECTURE + harbor. Two real findings. ARCHITECTURE §13 (OAM table): - `| Trait | Blueprint overlay (`overlays/small|medium|large`) |` has pipe chars inside backticks inside a Markdown table cell — a known GFM rendering hazard. Replaced with comma-separated examples. platform/harbor/README.md: - The banner added in Pass 9 said "every host cluster runs a Harbor instance" but the body still described an older "Harbor Primary / Harbor Replica" cross-region replication topology. Same shape of architectural drift Pass 7 caught in OpenBao/ESO/Gitea/Flux — banner-add doesn't rewrite the body. - Three sections rewritten: * Overview mermaid: now shows upstream-OCI → multiple independent per-cluster Harbors with local Trivy scan + local Pod pulls. * "Multi-Region Replication" → "Per-host-cluster mirroring (NOT primary-replica)". Single source of truth = upstream OCI (ghcr.io/openova-io/* for Catalyst+Blueprints, customer CI for application images), not a "primary Harbor". * Example replication policy: was a `dest_registry` cross-region push policy → now a pull-mirror policy from ghcr.io with scheduled-cron trigger. - "Why Mandatory" table reframed in per-host-cluster terms. VALIDATION-LOG: Pass 17 entry added with the specific drift-detection lesson — banner-addition passes don't catch body-level drift; need explicit body re-reads. Refs #37	2026-04-27 21:58:53 +02:00
hatiyildiz	fec0c342a8	docs(pass-6): reconcile topology diagram + unify JetStream Account scoping Pass 6 — fresh-eyes line-by-line read of ARCHITECTURE.md. Found two internal contradictions that earlier passes missed. ARCHITECTURE §3 (topology diagram) listed Crossplane, Flux, Harbor, and grafana-stack INSIDE the Catalyst control plane block. But §11 (Catalyst-on-Catalyst) explicitly says these are per-host-cluster infrastructure, NOT Catalyst control-plane components. PLATFORM-TECH-STACK §3 also classifies them as per-host-cluster. Fixed: §3 topology diagram now shows only true Catalyst control-plane components (console, marketplace, admin, catalog-svc, projector, provisioning, environment-controller, blueprint-controller, billing, gitea, nats-jetstream, openbao, keycloak, spire-server, observability) and adds a separate line for "Plus per-host-cluster infrastructure" that defers to PLATFORM-TECH-STACK §3 for the full list (Cilium, Flux, Crossplane, cert-manager, ESO, Kyverno, Harbor, Reloader, Trivy, Falco, Sigstore, Syft+Grype, VPA, KEDA, External-DNS, k8gb, Coraza, MinIO, Velero, failover-controller). Also added the previously-missing `provisioning` row. JetStream Account scoping was contradictory: - ARCHITECTURE §5 said "Per-Org account: ws.{org}-{env_type}.>" — reads ambiguously: is the Account per-Org or per-Env? - NAMING-CONVENTION §11.2 said "One JetStream Account scoped to ws.{org}-{env_type}.>" — implied per-Environment. - GLOSSARY + PLATFORM-TECH-STACK + SECURITY all say per-Organization. Reconciled to the per-Org-Account-with-per-Env-subjects model: - Account isolation: ONE NATS Account per Organization. - Subjects within the Account use prefix `ws.{org}-{env_type}.>` for per-Environment partitioning. This is the cleanest isolation model: Accounts are NATS' strongest isolation boundary (per-Org); subjects partition further within each Account (per-Env). Refs #37	2026-04-27 21:30:03 +02:00
hatiyildiz	ba048d2fd7	docs(pass-5b): scrub remaining "instance" usages where "Application" is meant Two user-facing residuals where the banned product term "instance" slipped through: - docs/ARCHITECTURE.md §9: example console dialog "Use existing instance or create a dedicated one?" → "Use an existing Postgres Application or create a new dedicated one?". This is a UI prompt text — must use the user-facing noun "Application", not "instance". - docs/NAMING-CONVENTION.md §6.2 tag comment: "Application instance name" → "Application name within the Environment". The CRD might internally still use the noun Instance for class-vs-instance semantics, but in tag annotations and user-visible context the Application IS the instance. Other "instance" occurrences confirmed legitimate (Postgres instance as Crossplane resource type, Flux instance as software deployment, EC2/Hetzner instance as cloud-provider terminology) and retained. Final cross-reference check: all Markdown links across all canonical docs resolve. No residual banned terms. Refs #37	2026-04-27 21:26:27 +02:00
hatiyildiz	79c59a27a2	docs(pass-5): reconcile Phase-0 install order, IMPLEMENTATION-STATUS section numbering Pass-5A — fresh-eyes deep read found two structural drifts. ARCHITECTURE §10 Phase-0 install order: - Old order: cert-manager → Cilium → Flux → ... → Catalyst control plane. - SOVEREIGN-PROVISIONING §3 has the correct order: Cilium first (CNI must be in place before pods can network), THEN cert-manager. - ARCHITECTURE updated to match: Cilium → cert-manager → Flux → Crossplane → Sealed Secrets → SPIRE → JetStream → OpenBao → Keycloak → Gitea → Catalyst control plane (11 items, matching the SOVEREIGN-PROVISIONING list which had Keycloak and Gitea spelled out separately). IMPLEMENTATION-STATUS section numbering: - Old: §1 → §2 → §2bis → §3 → §4 → §5 → §6 → §7 → §8. The "§2bis" was a workaround for inserting per-host-cluster infrastructure without renumbering. Reads weird. - New: §1 → §2 → §3 → §4 → §5 → §6 → §7 → §8 → §9. Clean numbering. Refs #37	2026-04-27 21:25:07 +02:00
hatiyildiz	d1a2ed73a3	docs(pass-4): align ARCHITECTURE phase numbering with SOVEREIGN-PROVISIONING ARCHITECTURE §10 listed 3 provisioning phases (Phase 0 / 1 / 2) and labeled Phase 2 as "Self-sufficient". SOVEREIGN-PROVISIONING.md uses 4 phases (Phase 0 Bootstrap / Phase 1 Hand-off / Phase 2 Day-1 setup / Phase 3 Steady-state). The same phase number meant different things in the two docs. Aligned ARCHITECTURE to the 4-phase numbering. SOVEREIGN-PROVISIONING is now explicitly the canonical reference for phase semantics. Refs #37	2026-04-27 21:22:07 +02:00
hatiyildiz	80b91709e1	docs(iter-3-5): purge operator-as-entity, fix Workspace-controller capital, JetStream KV references ARCHITECTURE (iter 3): - Removed catalystctl from the §4 write-side diagram (it's read-only; presenting it as a write input contradicted §7.4). - "Both tabs read the same Valkey snapshot" → "JetStream KV snapshot" in §5 (Valkey is no longer in the control plane). - §7.4: catalystctl reframed as "may exist as small read-only debug CLI" rather than implying it ships today. - §11 dependency list: added bp-catalyst-provisioning; removed bp-catalyst-crossplane (Crossplane is per-host-cluster infra, not a Catalyst control-plane component); added clarifying note. - §12 CRD list: added SecretPolicy + Runbook (were already in IMPLEMENTATION-STATUS but missing from the principles table). - §2 SME-style description: "SaaS Operator team (Omantel staff)" → "SaaS provider's cloud team" (Operator banned as entity). NAMING-CONVENTION (iter 4): - §5.1 heading "operator domain" → "Sovereign domain". - §7 multi-region diagram: replaced piecemeal Catalyst component list with a deferral to PLATFORM-TECH-STACK §2; added SPIRE server; fixed "per-Org workspaces" → "per-Environment Gitea repos"; added per-host-cluster infrastructure callout. SECURITY (iter 6 — partial; fold into this commit): - "operator-approved" → "sovereign-admin-approved" for DR promotion. - Realm name "catalyst-operator" → "catalyst-admin" (entity-noun scrubbed from the realm naming itself). SOVEREIGN-PROVISIONING (iter 7 — partial): - "single operator's laptop" → "single person's laptop" (avoid "operator" as entity). - "the next operator" → "the next Sovereign provisioning request, regardless of who initiates it". - "catalyst-operator realm" → "catalyst-admin realm" (×2). - Capital-W "Workspace-controller" residuals (3) → "Environment-controller" (replace_all is case-sensitive; previous iter caught lowercase only).
PERSONAS (iter 5): - P3 "within a Sovereign Operator team" → "within a Sovereign's operations team". - Two capital-W "Workspace-controller" residuals fixed. SRE (iter 11 — partial): - §13.2 "Workspace-controller stuck" runbook entry → "Environment-controller stuck". Banned-term sweep result post-fix: no `Operator team|role|account|user|admin` anywhere; no capital-W Workspace as Catalyst scope; no Valkey-as-control-plane refs. Refs #37	2026-04-27 21:09:31 +02:00
hatiyildiz	27325edb32	docs(iter-2): glossary alignment — rename workspace-controller, fix definitions GLOSSARY.md line-by-line audit. Eight corrections. 1. workspace-controller → environment-controller everywhere. The controller reconciles the Environment CRD; "workspace" is banned as a Catalyst scope, so it cannot be in a component name either. Fixed in: GLOSSARY, ARCHITECTURE, PLATFORM-TECH-STACK, NAMING-CONVENTION, SOVEREIGN-PROVISIONING, IMPLEMENTATION-STATUS, core/README, BUSINESS-STRATEGY. Banned-term entry in GLOSSARY now explicitly covers component names too. 2. "workspace repos" (per-Environment Gitea repos) → "Environment Gitea repos" in GLOSSARY, PLATFORM-TECH-STACK. 3. JWT claim {workspace, org, role} → {environment, org, role} in ARCHITECTURE projector diagram. 4. OpenOva definition refined: was "Never used to name a product", which contradicted "OpenOva Catalyst", "OpenOva Cortex". Now: brand prefix in product names; bare "OpenOva" = the company; bare "Catalyst" = the platform. 5. Catalyst definition completed: was missing provisioning, billing, gitea, observability — now lists all 14 control-plane components, pointing at the table below. 6. Catalyst components table: added `provisioning` (validates configSchema, commits to Environment Gitea); reordered to match ARCHITECTURE §3 grouping; clarified each component's source-of-truth (catalog-svc reads monorepo + Gitea, blueprint-controller watches monorepo + Gitea, etc.). 7. Environment definition: refers to NAMING §2.4 for env_type values; removed inline list that didn't match canonical ordering. Added concrete examples (acme-prod, acme-dev, bankdhofar-uat). 8. Application example: dropped "RocketChat" which appeared nowhere else; replaced with generic "running deployment" plus the established WordPress / Postgres examples. 9. sovereign-admin description: was "runs Crossplane" — Crossplane is platform plumbing not user-facing. 
Now: "manages the underlying clusters via Crossplane (which is platform plumbing, not a user-facing surface)". Banned-term coverage: - "Workspace" entry now covers BOTH the Catalyst scope AND component naming (workspace-controller → environment-controller). Refs #37	2026-04-27 21:06:09 +02:00
hatiyildiz	2c4902b409	docs(iter-1): add IMPLEMENTATION-STATUS, fix wrong-org refs, reconcile monorepo First validation iteration. Three concrete corrections. 1. Add docs/IMPLEMENTATION-STATUS.md as the bridge between target architecture and current code state. Status legend (✅ / 🚧 / 📐 / ⏸) applied per-component. Catalyst control plane = mostly 📐. Component READMEs = 🚧 (README only, no Blueprint manifests yet). products/axon = ✅ (only product with real code). core/ = 📐 (just .gitkeep). 2. Status banner added to ARCHITECTURE, SECURITY, SOVEREIGN-PROVISIONING, BLUEPRINT-AUTHORING, PERSONAS-AND-JOURNEYS, PLATFORM-TECH-STACK, SRE pointing readers at IMPLEMENTATION-STATUS.md before they treat any described feature as built. GLOSSARY also references it. 3. Architectural decision (Option A — monorepo canonical): - Each platform/<name>/ and products/<name>/ folder is the source of ONE Blueprint, published as ghcr.io/openova-io/<name>:<semver> by CI fan-out from the monorepo root. - BLUEPRINT-AUTHORING.md §1, §2, §13 rewritten to match. - README.md "what's in this repo" rewritten to clarify monorepo + OCI-fan-out shape; no longer claims every directory is a Blueprint in a way that contradicts BLUEPRINT-AUTHORING. Wrong-org fixes (3 places): - docs/PERSONAS-AND-JOURNEYS.md:13 github.com/openova → openova-io - docs/BLUEPRINT-AUTHORING.md:13 github.com/openova → openova-io - docs/BLUEPRINT-AUTHORING.md:404 github.com/openova → openova-io - docs/BLUEPRINT-AUTHORING.md ghcr.io/openova/* (3 refs) → openova-io API group consistency: - All references unified to catalyst.openova.io/v1alpha1 (was mixed v1 / v1alpha1; v1alpha1 is correct since the CRDs are design-stage with no implementation). core/README.md updated to honestly describe the directory tree as "target structure with .gitkeep placeholders" rather than implying the apps/console, apps/projector, etc. binaries already exist. 
The legacy apps/bootstrap and apps/manager directories are acknowledged as transitional placeholders that will be removed when the new apps/ layout is scaffolded. CLAUDE.md and .claude/project-memory.md updated to put IMPLEMENTATION-STATUS.md second in the read-first ordering. Refs #37	2026-04-27 20:43:31 +02:00
hatiyildiz	d51a3fba4d	docs: add canonical Catalyst documentation set Six new docs that establish the unified Catalyst model — Sovereign as deployed instance, Organization as multi-tenancy unit, Environment as {org}-{env_type} scope, Application as user-facing handle, Blueprint as unified module+template successor. - docs/GLOSSARY.md single source of truth for terminology; every other doc defers to it; banned terms (tenant, operator-as-entity, module, template, Backstage, etc.) listed with replacements. - docs/ARCHITECTURE.md overall Catalyst architecture: control plane vs application Blueprints, write path (Git → Flux → K8s + Crossplane), read path (CQRS via NATS JetStream → projector → SSE), SPIFFE/SPIRE workload identity, OpenBao independent Raft per region (no stretched cluster), Keycloak per-Org (SME) vs per-Sovereign (corporate). - docs/PERSONAS-AND-JOURNEYS.md personas × journeys matrix; only three first-class surfaces (UI, Git, API); explicit removal of Terraform/Pulumi/CLI as user-facing IaC; Application card anatomy. - docs/SECURITY.md identity (workload + user), OpenBao + ESO credential flow, dynamic credentials with auto-rotation sidecar, multi-region OpenBao (independent Raft per region with async perf replication — explicitly NOT stretched), rotation policy CRDs, threat model. - docs/SOVEREIGN-PROVISIONING.md Phase 0 (catalyst-provisioner + OpenTofu one-shot) → Phase 1 (Crossplane adopts) → Phase 2 (self-sufficient Catalyst control plane); air-gap procedure; Organization migration; decommission. - docs/BLUEPRINT-AUTHORING.md Blueprint CRD spec, configSchema, placementSchema, depends, manifests, overlays; Crossplane Composition authoring for non-K8s; signing/publishing pipeline; public vs private (Org-scoped) visibility; contribution path. Refs #37	2026-04-27 20:05:25 +02:00
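
The naming rules that recur through the commits above (the short env_type set cited from NAMING §2.4, the Environment as `{org}-{env_type}`, the control-plane DNS form `{component}.{location-code}.{sovereign-domain}`, and the per-Environment subject prefix `ws.{org}-{env_type}.>` inside a per-Organization NATS Account) can be illustrated with a small Python sketch. The function names are hypothetical and do not come from the repository; only the string shapes and the `gitea.hfmp.openova.io` and `acme-stg` examples are taken from the commit messages.

```python
# Illustrative helpers only; names are hypothetical, not repository code.

# Short env_type forms as listed in the pass-39 commit message.
ENV_TYPES = ("prod", "stg", "uat", "dev", "poc")


def environment_name(org: str, env_type: str) -> str:
    """Build an Environment name in the {org}-{env_type} shape."""
    if env_type not in ENV_TYPES:
        # Long forms such as "staging" are the drift pass-39 removed.
        raise ValueError(f"non-canonical env_type: {env_type!r}")
    return f"{org}-{env_type}"


def control_plane_fqdn(component: str, location_code: str, sovereign_domain: str) -> str:
    """Build {component}.{location-code}.{sovereign-domain}; the location code is never omitted."""
    return f"{component}.{location_code}.{sovereign_domain}"


def jetstream_subject_prefix(org: str, env_type: str) -> str:
    """Per-Environment subject prefix used inside the Organization's NATS Account."""
    return f"ws.{environment_name(org, env_type)}.>"


if __name__ == "__main__":
    print(environment_name("acme", "stg"))
    print(control_plane_fqdn("gitea", "hfmp", "openova.io"))
    print(jetstream_subject_prefix("acme", "prod"))
```

Rejecting unknown env_type values up front is one possible way to keep long forms like `staging` out of generated names; the repository may enforce this differently.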

19 Commits
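
The footprint estimate described in commit 53bc4357ca (baseline plus selected-components delta plus control-plane overhead, rolled up into a "Recommended N x <SKU>" line that turns amber when the chosen worker count is too low) can be sketched roughly as below. This is a Python illustration under stated assumptions, not the code in componentFootprints.ts: the function names, the RAM-only rollup, and all figures except the 2-worker, 16 GB otech92 case are hypothetical.

```python
import math


def recommended_workers(baseline_gb: float, selected_delta_gb: float,
                        overhead_gb: float, node_ram_gb: float) -> int:
    """Return the worker count needed to hold the summed RAM footprint.

    Assumption: the rollup is a plain sum divided by per-node RAM, rounded up.
    The real wizard also tracks CPU floors, which this sketch ignores.
    """
    total_gb = baseline_gb + selected_delta_gb + overhead_gb
    return max(1, math.ceil(total_gb / node_ram_gb))


def is_amber(chosen_workers: int, recommended: int) -> bool:
    """The estimate line is flagged when the operator chose fewer workers than recommended."""
    return chosen_workers < recommended


if __name__ == "__main__":
    # Loosely modelled on otech92: 2 workers with 16 GB in total (8 GB each),
    # a baseline that already fills 16 GB, plus a hypothetical 2 GB of extras.
    need = recommended_workers(baseline_gb=16.0, selected_delta_gb=2.0,
                               overhead_gb=0.0, node_ram_gb=8.0)
    print(need, is_amber(chosen_workers=2, recommended=need))
```

With these illustrative numbers the rollup exceeds what two workers provide, which is the situation the pre-launch estimate is meant to surface before provisioning.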