Commit Graph

11 Commits

Author SHA1 Message Date
hatiyildiz
f5daac52af refactor(platform): remove k8gb — replaced by PowerDNS lua-records (#171)
PowerDNS lua-records (`ifurlup`, `pickclosest`, `ifportup`) cover everything
k8gb was doing — geo-aware response selection, health-checked failover,
weighted round-robin — at the authoritative DNS layer. Eliminates a
separate K8s controller, CRD set, and CoreDNS plugin from every Sovereign.

Changes:
- platform/k8gb/ deleted (Chart.yaml, values.yaml, blueprint.yaml never
  authored — only README existed)
- products/catalyst/bootstrap/ui/public/component-logos/k8gb.svg deleted
- componentGroups.ts: remove k8gb component (PowerDNS already there)
- componentLogos.tsx: drop logo_k8gb + k8gb map entry
- model.ts DEFAULT_COMPONENT_GROUPS spine: replace k8gb with powerdns
- StepInfrastructure.tsx: copy refers to PowerDNS lua-records, not k8gb
- provision.html: replace k8gb tile and edges with powerdns
- catalog.generated.ts regenerated (now includes bp-powerdns)
- docs sweep — every k8gb reference in PLATFORM-TECH-STACK, NAMING-
  CONVENTION, SOVEREIGN-PROVISIONING, SRE, ARCHITECTURE, GLOSSARY,
  COMPONENT-LOGOS, IMPLEMENTATION-STATUS, BUSINESS-STRATEGY,
  TECHNOLOGY-FORECAST, README, infra/hetzner/README, platform READMEs
  (cilium, external-dns, failover-controller, litmus, flux, opentofu)
  rewritten to point at PowerDNS lua-records / MULTI-REGION-DNS.md.
  Historical entries in VALIDATION-LOG.md preserved as audit trail.
- New docs/MULTI-REGION-DNS.md — canonical reference for the lua-record
  patterns (ifurlup all/pickclosest/pickfirst, ifportup, pickwhashed),
  Application Placement → lua-record selector mapping, when to add a
  second Sovereign region, operational checks.

Closes #171.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 08:51:09 +02:00
hatiyildiz
0a6179dd21 docs(unified-repo-model): collapse SME and corporate to one shape — Application = Gitea Repo
Architectural correction. Replaces the previous "one Gitea repo per Environment with Apps as folders" rule with a single uniform shape that scales by configuration only:

- Catalyst Application = one Gitea Repo (always, regardless of scale)
- Branches develop/staging/main map to dev/stg/prod environments
- 5 conventional Gitea Orgs per Sovereign: catalog (public mirror), catalog-sovereign (Sovereign-curated private Blueprints), one per Catalyst Organization (with shared-blueprints + N App repos), system (sovereign-admin scope)
- EnvironmentPolicy CR lives in system/catalyst-config/policies/, same shape for SME and corporate; only field values differ
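
The branch-to-environment rule above can be read as a plain lookup table. The sketch below is illustrative only: `BRANCH_TO_ENV_TYPE` and `env_type_for_branch` are hypothetical names, not identifiers from the repository.

```python
# Hypothetical lookup: the three conventional branches and the env_type each maps to.
BRANCH_TO_ENV_TYPE = {
    "develop": "dev",
    "staging": "stg",
    "main": "prod",
}


def env_type_for_branch(branch: str) -> str:
    """Return the env_type a branch maps to; reject branches outside the convention."""
    try:
        return BRANCH_TO_ENV_TYPE[branch]
    except KeyError:
        raise ValueError(f"branch {branch!r} has no conventional environment") from None
```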

Removes the SME-vs-corporate dual-shape design that violated the "Application is application" invariant. Teams primitive (proposed for corporate scale) is dropped — team boundaries emerge from CODEOWNERS at the App-repo level. RE-score thresholds and EnvironmentPolicy fields are universal defaults; only their values vary per Org's policy choice.

Files updated line-by-line:

- GLOSSARY: Application + Environment definitions, new Gitea-Orgs section, 6 component-row updates
- NAMING §11.2: Realization 7-bullet rewrite
- ARCHITECTURE: §1, §3 topology, §4 write-side ASCII, §7.1+§7.2+§7.3, §8 promotion, §9 multi-App linkage
- PERSONAS-AND-JOURNEYS: §2 surfaces, §4.1 Ahmed, §4.2 Layla full rewrite
- BLUEPRINT-AUTHORING §1: catalog-sovereign source location
- PLATFORM-TECH-STACK: §2.2+§2.3
- SECURITY: §3
- SOVEREIGN-PROVISIONING: §5+§8+§10
- IMPLEMENTATION-STATUS: §5
- SRE: §14

VALIDATION-LOG entry "Pass 103 — UNIFIED REPO MODEL REFACTOR" captures the architectural correction and acknowledges that the prior 102-pass audit was anchored on the wrong shape (the text was internally consistent, but the shape it was consistent with was inadequate). Lesson #21 added: text-shape audits don't substitute for architectural review.

Verification: zero remaining old-model assertions in canonical docs (grep clean for 'Environment Gitea repo', '/{org}/{org}-{env_type}', 'per-Environment Gitea repos', 'applications/<app>/values', etc.).
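
A sweep like the one described could be approximated with a plain substring scan. This is a sketch under assumptions: the pattern list contains only the four phrases quoted above (the message's "etc." implies more that are not reproduced here), and `find_old_model_assertions` is a hypothetical helper, not the command actually run.

```python
# Old-model phrases quoted in the commit message; the real sweep covered more.
OLD_MODEL_PATTERNS = [
    "Environment Gitea repo",
    "/{org}/{org}-{env_type}",
    "per-Environment Gitea repos",
    "applications/<app>/values",
]


def find_old_model_assertions(text: str) -> list[tuple[int, str]]:
    """Return (line_number, pattern) for every old-model phrase found in text."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for pattern in OLD_MODEL_PATTERNS:
            if pattern in line:  # literal match, like grep -F
                hits.append((lineno, pattern))
    return hits
```

An empty result for every canonical doc corresponds to the "grep clean" state claimed in the verification line.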
2026-04-28 10:13:02 +02:00
hatiyildiz
c7a2fb05ea docs(pass-42): vague <sovereign-gitea> placeholders in BLUEPRINT-AUTHORING + NAMING; falco clean
Recurring drift category: vague composite placeholders like
<sovereign-domain-gitea> and <sovereign-gitea> standing in for the
canonical Catalyst control-plane DNS form gitea.{location-code}.{sovereign-domain}.
These survived Pass 29's DNS sweep because they don't match Pass 29's
grep patterns (<sovereign>.<domain>, <sovereign-domain>, etc.) —
different shape entirely (single hyphenated placeholder vs multi-segment).

BLUEPRINT-AUTHORING.md §1: <sovereign-domain-gitea>/<org>/shared-blueprints/bp-<name>/
→ gitea.<location-code>.<sovereign-domain>/<org>/shared-blueprints/bp-<name>/
plus inline pointer to NAMING §5.1.

NAMING-CONVENTION.md §11.2 step 1: <sovereign-gitea>/{org}/{org}-{env_type}
abstract pattern → gitea.{location-code}.{sovereign-domain}/{org}/{org}-{env_type}.
The authoritative naming doc was teaching a non-canonical shorthand
while its example showed the canonical form — second drift instance in
§11.2 (Pass 37 fixed example URL, Pass 42 fixes abstract pattern).

BLUEPRINT-AUTHORING.md §1-§14 deep re-scan: clean apart from §1 fix.
§8 Crossplane Compositions verified — compose.openova.io/v1alpha1 is
intentionally separate from catalyst.openova.io/v1alpha1 (Crossplane
XRDs use their own group; Pass 1's unification was for Catalyst's own
CRDs only).

platform/falco/README.md: clean.
2026-04-27 23:28:26 +02:00
hatiyildiz
7e40a65aba docs(pass-37): NAMING §11.2 example URL drift; cilium clean
Applied Pass 23 lesson (deep-read later sections of long canonical docs)
to NAMING-CONVENTION §7-§11. Found one drift instance in §11.2 — the
most authoritative passage on Environment realization.

§11.2 step 1 had example URL `gitea.omantel.openova.io/acme/acme-prod`
— a 3-segment form bypassing the `{location-code}` segment NAMING §5.1
itself establishes. The most concerning drift category: the authoritative
naming doc offering a non-canonical example.

Pass 29's earlier sweep caught placeholder forms (gitea.<sovereign>.<domain>
etc.) but missed this because it uses a literal Sovereign domain
(omantel.openova.io) completing a 3-segment form — evades any
placeholder-shape grep.

Fixed to `gitea.<location-code>.omantel.openova.io/acme/acme-prod` and
added inline pointer to §5.1.

platform/cilium/README.md: clean. Generic upstream K8s/Cilium patterns
in all examples; no Catalyst-specific drift.

Pattern note for future passes: drift sweeps should also grep for
literal canonical domains (omantel.openova.io, bankdhofar.local,
openova.io) to catch the literal-domain variant.

Sweep grep at end of pass: no other instances of literal-domain form
across canonical docs.
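
The literal-domain variant can be caught by looking for `gitea.` followed directly by a known Sovereign domain, which means the `{location-code}` segment is missing. A minimal sketch, assuming the three domains named in the pattern note are the full list to check; `missing_location_code` is a hypothetical helper, not the grep used in the pass.

```python
import re

# Literal domains named in the pattern note; treated here as the complete list.
LITERAL_DOMAINS = ["omantel.openova.io", "bankdhofar.local", "openova.io"]


def missing_location_code(text: str) -> list[str]:
    """Return gitea hosts that jump straight to a Sovereign domain."""
    hits = []
    for domain in LITERAL_DOMAINS:
        # "gitea." immediately followed by the domain leaves no room
        # for a location-code segment in between.
        pattern = re.compile(
            r"(?<![\w.-])gitea\." + re.escape(domain) + r"(?![\w-])"
        )
        hits.extend(match.group(0) for match in pattern.finditer(text))
    return hits
```

Both the placeholder form `gitea.<location-code>.omantel.openova.io` and a host with a concrete location code pass this check, because another segment sits between `gitea.` and the domain.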
2026-04-27 22:51:52 +02:00
hatiyildiz
b467dc3f3b docs(pass-18): NAMING DR-as-env_type misexample + Keycloak deployment topology
Pass 18 — drift-detection on NAMING-CONVENTION + platform/keycloak.
Two real findings.

NAMING-CONVENTION §11.1:
- The example list of Catalyst Environments included `bankdhofar-dr`
  — but `dr` is NOT a valid env_type. Canonical values per §2.4 are
  prod / stg / uat / dev / poc. DR is a Placement mode
  (active-active / active-hotstandby across regions inside the
  *-prod Environment), not a separate Environment.
- Replaced `bankdhofar-dr` with `bankdhofar-uat` and added an
  explicit "DR is a Placement, not an Env Type" note.
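
The env_type rule lends itself to a mechanical check. The sketch below assumes Environment names take the form `{org}-{env_type}` (as the `bankdhofar-uat` example suggests) and that Organization slugs may themselves contain hyphens; `parse_environment` is a hypothetical name.

```python
# Canonical env_type values listed in the finding above.
VALID_ENV_TYPES = ("prod", "stg", "uat", "dev", "poc")


def parse_environment(name: str) -> tuple[str, str]:
    """Split '{org}-{env_type}' and reject unknown env_type values."""
    # Split on the last hyphen so hyphenated Organization slugs stay intact.
    org, sep, env_type = name.rpartition("-")
    if not sep or not org:
        raise ValueError(f"{name!r} is not of the form {{org}}-{{env_type}}")
    if env_type not in VALID_ENV_TYPES:
        raise ValueError(f"{env_type!r} is not a valid env_type")
    return org, env_type
```

With this check, `bankdhofar-uat` parses cleanly while `bankdhofar-dr` is rejected, matching the correction made in §11.1.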

platform/keycloak/README.md:
- Keycloak Deployment YAML example used `namespace: open-banking`
  with 2 replicas — Fingate-specific narrative that contradicted
  the per-Org / per-Sovereign topology stated in the banner.
  Rewrote with two side-by-side examples:
  * shared-sovereign (3 HA replicas, catalyst-keycloak namespace,
    CNPG-backed)
  * per-organization (1 replica in <org> namespace, optional
    embedded DB for smallest SME tier)
- HA section was a single set of claims (2+ replicas, CNPG, Infinispan)
  that only matched corporate. Now branches on topology — corporate
  gets HA + Infinispan, SME gets single replica with restart-on-
  deploy as acceptable for tier SLAs.

Same kind of drift Pass 17 caught in Harbor: banner says one thing,
body still describes the older model. Both fixed.

VALIDATION-LOG: Pass 18 entry added.

Refs #37
2026-04-27 22:00:42 +02:00
hatiyildiz
fec0c342a8 docs(pass-6): reconcile topology diagram + unify JetStream Account scoping
Pass 6 — fresh-eyes line-by-line read of ARCHITECTURE.md. Found two
internal contradictions that earlier passes missed.

ARCHITECTURE §3 (topology diagram) listed Crossplane, Flux, Harbor,
and grafana-stack INSIDE the Catalyst control plane block. But §11
(Catalyst-on-Catalyst) explicitly says these are per-host-cluster
infrastructure, NOT Catalyst control-plane components. PLATFORM-TECH-
STACK §3 also classifies them as per-host-cluster.

Fixed: §3 topology diagram now shows only true Catalyst control-plane
components (console, marketplace, admin, catalog-svc, projector,
provisioning, environment-controller, blueprint-controller, billing,
gitea, nats-jetstream, openbao, keycloak, spire-server, observability)
and adds a separate line for "Plus per-host-cluster infrastructure"
that defers to PLATFORM-TECH-STACK §3 for the full list (Cilium, Flux,
Crossplane, cert-manager, ESO, Kyverno, Harbor, Reloader, Trivy, Falco,
Sigstore, Syft+Grype, VPA, KEDA, External-DNS, k8gb, Coraza, MinIO,
Velero, failover-controller). Also added the previously-missing
`provisioning` row.

JetStream Account scoping was contradictory:
- ARCHITECTURE §5 said "Per-Org account: ws.{org}-{env_type}.>" —
  reads ambiguously: is the Account per-Org or per-Env?
- NAMING-CONVENTION §11.2 said "One JetStream Account scoped to
  ws.{org}-{env_type}.>" — implied per-Environment.
- GLOSSARY + PLATFORM-TECH-STACK + SECURITY all say per-Organization.

Reconciled to the per-Org-Account-with-per-Env-subjects model:
- Account isolation: ONE NATS Account per Organization.
- Subjects within the Account use prefix `ws.{org}-{env_type}.>` for
  per-Environment partitioning.

This is the cleanest isolation model: Accounts are NATS' strongest
isolation boundary (per-Org); subjects partition further within each
Account (per-Env).
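
To make the reconciled model concrete, the per-Environment subject prefix and a membership check might look like the following. This is a sketch only: the function names are hypothetical, and it does not touch a NATS client or show how Accounts are provisioned.

```python
def environment_subject_prefix(org: str, env_type: str) -> str:
    """Wildcard subject covering one Environment inside the Organization's Account."""
    return f"ws.{org}-{env_type}.>"


def subject_in_environment(subject: str, org: str, env_type: str) -> bool:
    """True if a concrete subject falls under the Environment's prefix."""
    prefix = f"ws.{org}-{env_type}."
    # The ">" wildcard needs at least one token after the prefix.
    return subject.startswith(prefix) and len(subject) > len(prefix)
```

Two Environments of the same Organization share one Account but never share a prefix, so `ws.acme-prod.>` and `ws.acme-dev.>` partition the Account's subject space.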

Refs #37
2026-04-27 21:30:03 +02:00
hatiyildiz
ba048d2fd7 docs(pass-5b): scrub remaining "instance" usages where "Application" is meant
Two user-facing residuals where the banned product term "instance"
slipped through:

- docs/ARCHITECTURE.md §9: example console dialog "Use existing
  instance or create a dedicated one?" → "Use an existing Postgres
  Application or create a new dedicated one?". This is a UI prompt
  text — must use the user-facing noun "Application", not "instance".

- docs/NAMING-CONVENTION.md §6.2 tag comment: "Application instance
  name" → "Application name within the Environment". The CRD might
  internally still use the noun Instance for class-vs-instance
  semantics, but in tag annotations and user-visible context the
  Application IS the instance.

Other "instance" occurrences confirmed legitimate (Postgres instance
as Crossplane resource type, Flux instance as software deployment,
EC2/Hetzner instance as cloud-provider terminology) and retained.

Final cross-reference check: all Markdown links across all canonical
docs resolve. No residual banned terms.

Refs #37
2026-04-27 21:26:27 +02:00
hatiyildiz
80b91709e1 docs(iter-3-5): purge operator-as-entity, fix Workspace-controller capital, JetStream KV references
ARCHITECTURE (iter 3):
- Removed catalystctl from the §4 write-side diagram (it's read-only;
  presenting it as a write input contradicted §7.4).
- "Both tabs read the same Valkey snapshot" → "JetStream KV snapshot"
  in §5 (Valkey is no longer in the control plane).
- §7.4: catalystctl reframed as "may exist as small read-only debug
  CLI" rather than implying it ships today.
- §11 dependency list: added bp-catalyst-provisioning; removed
  bp-catalyst-crossplane (Crossplane is per-host-cluster infra, not a
  Catalyst control-plane component); added clarifying note.
- §12 CRD list: added SecretPolicy + Runbook (were already in
  IMPLEMENTATION-STATUS but missing from the principles table).
- §2 SME-style description: "SaaS Operator team (Omantel staff)" →
  "SaaS provider's cloud team" (Operator banned as entity).

NAMING-CONVENTION (iter 4):
- §5.1 heading "operator domain" → "Sovereign domain".
- §7 multi-region diagram: replaced piecemeal Catalyst component list
  with a deferral to PLATFORM-TECH-STACK §2; added SPIRE server;
  fixed "per-Org workspaces" → "per-Environment Gitea repos"; added
  per-host-cluster infrastructure callout.

SECURITY (iter 6 — partial; fold into this commit):
- "operator-approved" → "sovereign-admin-approved" for DR promotion.
- Realm name "catalyst-operator" → "catalyst-admin" (entity-noun
  scrubbed from the realm naming itself).

SOVEREIGN-PROVISIONING (iter 7 — partial):
- "single operator's laptop" → "single person's laptop" (avoid
  "operator" as entity).
- "the next operator" → "the next Sovereign provisioning request,
  regardless of who initiates it".
- "catalyst-operator realm" → "catalyst-admin realm" (×2).
- Capital-W "Workspace-controller" residuals (3) → "Environment-
  controller" (replace_all is case-sensitive; previous iter caught
  lowercase only).

PERSONAS (iter 5):
- P3 "within a Sovereign Operator team" → "within a Sovereign's
  operations team".
- Two capital-W "Workspace-controller" residuals fixed.

SRE (iter 11 — partial):
- §13.2 "Workspace-controller stuck" runbook entry →
  "Environment-controller stuck".

Banned-term sweep result post-fix: no `Operator (team|role|account|user|admin)`
matches anywhere; no capital-W Workspace as Catalyst scope; no
Valkey-as-control-plane refs.
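
The first of these checks can be expressed as a case-sensitive regular expression. A minimal sketch: grouping the alternation after `Operator` is an assumption about the intended pattern, and `banned_terms` is a hypothetical helper. The Workspace and Valkey checks depend on context and are not attempted here.

```python
import re

# Case-sensitive on purpose: lowercase "operator" (e.g. a Kubernetes operator)
# is not the banned entity noun.
BANNED = re.compile(r"\bOperator (?:team|role|account|user|admin)\b")


def banned_terms(text: str) -> list[str]:
    """Return every 'Operator <noun>' phrase found in text."""
    return [match.group(0) for match in BANNED.finditer(text)]
```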

Refs #37
2026-04-27 21:09:31 +02:00
hatiyildiz
27325edb32 docs(iter-2): glossary alignment — rename workspace-controller, fix definitions
GLOSSARY.md line-by-line audit. Nine corrections.

1. workspace-controller → environment-controller everywhere. The
   controller reconciles the Environment CRD; "workspace" is banned as
   a Catalyst scope, so it cannot be in a component name either. Fixed
   in: GLOSSARY, ARCHITECTURE, PLATFORM-TECH-STACK, NAMING-CONVENTION,
   SOVEREIGN-PROVISIONING, IMPLEMENTATION-STATUS, core/README,
   BUSINESS-STRATEGY. Banned-term entry in GLOSSARY now explicitly
   covers component names too.

2. "workspace repos" (per-Environment Gitea repos) → "Environment
   Gitea repos" in GLOSSARY, PLATFORM-TECH-STACK.

3. JWT claim {workspace, org, role} → {environment, org, role} in
   ARCHITECTURE projector diagram.

4. OpenOva definition refined: was "Never used to name a product",
   which contradicted "OpenOva Catalyst", "OpenOva Cortex". Now: brand
   prefix in product names; bare "OpenOva" = the company; bare
   "Catalyst" = the platform.

5. Catalyst definition completed: was missing provisioning, billing,
   gitea, observability — now lists all 14 control-plane components,
   pointing at the table below.

6. Catalyst components table: added `provisioning` (validates
   configSchema, commits to Environment Gitea); reordered to match
   ARCHITECTURE §3 grouping; clarified each component's source-of-truth
   (catalog-svc reads monorepo + Gitea, blueprint-controller watches
   monorepo + Gitea, etc.).

7. Environment definition: refers to NAMING §2.4 for env_type values;
   removed inline list that didn't match canonical ordering. Added
   concrete examples (acme-prod, acme-dev, bankdhofar-uat).

8. Application example: dropped "RocketChat" which appeared nowhere
   else; replaced with generic "running deployment" plus the
   established WordPress / Postgres examples.

9. sovereign-admin description: was "runs Crossplane" — Crossplane is
   platform plumbing not user-facing. Now: "manages the underlying
   clusters via Crossplane (which is platform plumbing, not a
   user-facing surface)".

Banned-term coverage:
- "Workspace" entry now covers BOTH the Catalyst scope AND component
  naming (workspace-controller → environment-controller).

Refs #37
2026-04-27 21:06:09 +02:00
hatiyildiz
217c882916 docs(naming): rename {env}→{env_type}, add Organization + vcluster + Catalyst Environment layers
The naming convention pre-dates vcluster and Catalyst's user-facing
Environment object. Three additions, one rename:

- §2.4: {env} dimension renamed to {env_type} to disambiguate from the
  Catalyst Environment object (which is the user-facing scope, not a
  dimension).

- §2.5: new Organization dimension (slug, lowercase, hyphenated). Used
  for vcluster identity and any Organization-scoped resource.

- §4.7: new vcluster naming layer. Pattern is just {org} within the
  parent host cluster (Don't Repeat the Parent — Principle 1.2). Globally-
  qualified form is {prov}-{reg}-{bb}-{env_type}-{org} for cross-cluster
  references and kubeconfig contexts.

- §11: Catalyst Environment defined as the user-facing {org}-{env_type}
  scope. One Environment is realized by N vclusters across regions × bb
  filtered by Application Placement. Each Environment has its own Gitea
  repo and JetStream Account.
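
The two vcluster name forms from §4.7 can be sketched as follows. The function names are hypothetical, and the sample values (`hz`, `fsn`, `rtz`, `prod`, `acme`) are illustrative.

```python
def vcluster_local_name(org: str) -> str:
    """Name inside the parent host cluster: just the Organization slug."""
    return org


def vcluster_global_name(prov: str, reg: str, bb: str, env_type: str, org: str) -> str:
    """Globally-qualified form: {prov}-{reg}-{bb}-{env_type}-{org}."""
    return "-".join([prov, reg, bb, env_type, org])


def environment_name(org: str, env_type: str) -> str:
    """User-facing Catalyst Environment scope: {org}-{env_type}."""
    return f"{org}-{env_type}"
```

For example, `vcluster_global_name("hz", "fsn", "rtz", "prod", "acme")` yields `hz-fsn-rtz-prod-acme`, while the same vcluster is simply `acme` when referenced from inside its host cluster.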

Tags updated: openova.io/environment → openova.io/env-type for
disambiguation; new openova.io/organization, openova.io/vcluster,
openova.io/environment (for Catalyst scope), openova.io/sovereign tags.

DNS pattern §5 split into two: control-plane (component.{location-code}.
{sovereign-domain}) and Application (app.{environment}.{sovereign-or-org-
domain}) — supporting white-label Sovereigns where the Application DNS
uses the customer's own domain.

Refs #37
2026-04-27 20:05:42 +02:00
Emrah Baysal
54b1b4bd3d docs: add unified naming convention and align existing docs
- Add docs/NAMING-CONVENTION.md — canonical naming standard for all
  cloud resources, K8s objects, DNS, and tags across all providers.
  Covers dimension taxonomy (provider/region/building-block/environment),
  the Don't-Repeat-the-Parent principle, 4-char DNS location codes with
  full lookup table, multi-tenant scoping via namespace, and migration rules.

- Fix SRE.md: remove primary/DR region labels; clusters are named by
  building block (rtz/dmz/mgt), not failover role. Both regions run
  symmetric rtz clusters; k8gb owns traffic distribution.

- Fix PLATFORM-TECH-STACK.md: update both Mermaid diagrams and region
  table to use Region A / Region B (rtz cluster) language.

- Fix core/README.md: Platform CRD example now references cluster context
  names (hz-fsn-rtz-prod / hz-hel-rtz-prod) instead of primary/standby roles.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-19 12:22:52 +01:00