openova/docs/PROVISIONING-PLAN.md
hatiyildiz 224d81e7fe docs(component-count): update 53 → 56 anchors after Pass 105 (spire + nats-jetstream + sealed-secrets)
Closes #137 (and partially #138, #139): platform/ now contains 56 folders
(verified: ls -d platform/*/ | wc -l). Pass 104 set the anchor at 53;
Pass 105 added platform/spire/, platform/nats-jetstream/, and
platform/sealed-secrets/ as G2 wrapper charts for the bootstrap kit
(commit 8c0f766). This brings the count anchor up to date.

Files updated:
- CLAUDE.md L46: '53 folders total' → '56 folders total'
- docs/TECHNOLOGY-FORECAST-2027-2030.md L11: 'all 53 platform components'
  → 'all 56 platform components'
- docs/TECHNOLOGY-FORECAST-2027-2030.md §Mandatory: header (26) → (29);
  added rows for spire, nats-jetstream, sealed-secrets with 2026/2027/2030
  scores + Catalyst-specific notes
- docs/BUSINESS-STRATEGY.md: 26 'bare-53' references → 56 (executive
  summary, principles, comparison tables, expert network, GTM)
- docs/AUDIT-PROCEDURE.md grep #9: anchor expectation 53 → 56; banned-list
  pattern shifted from '52 components' → '53 components' (the now-stale
  count). Deep-read rotation note updated 53 → 56.
- docs/PROVISIONING-PLAN.md: Group K execution-status row reflects the
  refresh; §5 'what doesn't change' clarified that anchor moved 53 → 56.

Verified post-update: grep -rE '\b53 components\b|\b53 platform components\b|\b53 curated\b|\b53-component\b' docs/ README.md CLAUDE.md → empty (excluding VALIDATION-LOG history).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-28 13:48:24 +02:00

18 KiB
Raw Blame History

Catalyst-Zero Provisioning Plan

Status: Authoritative working plan — execution underway. Updated: 2026-04-28. Owner: OpenOva engineering. Parent issue: #43. Sub-tickets: AM groups, #45#155.


Execution status (live)

Group Tickets Status Commits
A — Code consolidation 9 Done 3c2f7e4
B — SME backend services 10 Source migrated; CI workflow live 7646840
C — Cutover Catalyst-Zero 8 🚧 CI builds live; Flux source repoint pending 9d93912, dc56854, bd967a7, 61de3da, 9fdfe07
D — Wizard 10 🚧 Domain capture + Hetzner project ID added; AppsStep replacement pending 854a063
E — Provisioner backend 13 🚧 Real Hetzner client + bootstrap installer + Dynadot DNS landed; SSH kubeconfig fetch is stub 915c467, db4f21a, 07b4bcf
F — Bootstrap-kit Helm charts 14 All 11 G2 wrapper charts + blueprint-release CI live 8c0f766
G — DNS multi-domain 6 🚧 Dynadot client + manifest env vars done; subdomain-reservation check pending db4f21a
H — Franchise model 7 🚧 docs/FRANCHISE-MODEL.md authored from existing admin impl; cross-Sovereign voucher deferred this commit
I — Wizard UX 6 📐 SSE event log pane + step indicator pending
J — Hetzner infra 6 🚧 cloud-init in repo; firewall + k3s flags wired into provisioner 07b4bcf
K — Documentation 8 🚧 IMPLEMENTATION-STATUS + core/README + products/catalyst/README updated; component-count anchor refreshed 53 → 56 (spire + nats-jetstream + sealed-secrets factored in) 3c2f7e4, 8c0f766, group-k-docs
L — Testing 8 📐 Playwright + integration tests pending
M — End-to-end DoD 9 📐 Awaiting Hetzner credentials from operator + first OCI-artifact CI runs to complete

This document captures the agreed plan for consolidating the existing nova/console/admin/marketplace code into the public OpenOva Catalyst monorepo, deploying it as Catalyst-Zero (the first Catalyst Sovereign — running on Contabo, the chicken in the chicken-and-egg problem), and then provisioning the first franchised Sovereign on Hetzner via the wizard at console.openova.io/sovereign.

This plan is the canonical reference. It supersedes any session-local conversation. Future Claude sessions or new contributors should read this first.


1. The chicken-and-egg problem and its resolution

Catalyst is a Kubernetes-native control plane that provisions other Sovereigns. Provisioning a Sovereign requires a provisioner service (catalyst-provisioner.openova.io per SOVEREIGN-PROVISIONING.md §2). That provisioner has to run somewhere. It cannot run inside the Sovereign it is provisioning (chicken-and-egg).

Resolution: the legacy nova/console/admin/marketplace stack currently running on Contabo k3s (in namespaces catalyst, sme, marketplace, website) is Catalyst-Zero — the first Sovereign. It exists today, has running pods today, and is the chicken from which the egg (the first Hetzner-hosted franchised Sovereign) gets provisioned.

The work in this plan consolidates that existing code into the public repo, redeploys it as a public-repo build (CI from github.com/openova-io/openova), and then uses it to provision the first franchised Sovereign. There is no greenfield "build Catalyst from scratch" — the Sovereign already exists; we are aligning it to the canonical Catalyst contract.


2. Current state inventory (verified against live cluster + repos, 2026-04-28)

2.1 Code locations (today)

What Where today Where it must end up
Catalyst console UI (Astro+Svelte) openova-private/apps/console/ openova/core/console/
Catalyst admin UI (Astro+Svelte) openova-private/apps/admin/ openova/core/admin/
Catalyst marketplace UI (Astro+Svelte) openova-private/apps/marketplace/ openova/core/marketplace/
marketplace-api (Go backend) openova-private/website/marketplace-api/ openova/core/marketplace-api/
Catalyst-zero deployment chart openova-private/clusters/contabo-mkt/apps/catalyst/ openova/products/catalyst/chart/templates/
Vite scaffold for sovereign-wizard openova/products/catalyst/bootstrap/ui/ merges into openova/core/console/src/pages/sovereign/
CI workflows (6 of them) openova-private/.github/workflows/{catalyst-build,marketplace-api-build,sme-{admin,console,marketplace,services}-build}.yaml openova/.github/workflows/
Voucher / billing / tenants admin surface openova-private/apps/admin/src/{components/BillingPage.svelte, lib/api.ts, pages/{billing,catalog,orders,tenants}.astro} openova/core/admin/... (carry forward unchanged)

2.2 Live deployment on Contabo (verified via kubectl get all -A)

Namespace Pods running Notes
catalyst catalyst-api + catalyst-ui 39 days uptime
sme console + admin + marketplace 56 days uptime
marketplace marketplace-api 13 days uptime
website openova-website live

These pods are Catalyst-Zero. They stay running through Phases 12; Phase 2 is a rolling-update cutover to public-repo image builds.

2.3 Existing 5-step wizard (the "Components (5)" page reference)

The "Components (5)" the user referenced is the 5-step marketplace flow at openova-private/apps/marketplace/src/components/:

PlanStep → AppsStep → AddonsStep → CheckoutStep → ReviewStep

AppsStep is what gets replaced with the unified marketplace card grid (driven by the same bp-<x> Blueprint surface every Catalyst Sovereign uses).

2.4 Voucher mechanism (already implemented)

Lives in openova-private/apps/admin/:

  • src/components/BillingPage.svelte — voucher / billing UI
  • src/lib/api.ts — voucher API client
  • src/pages/{billing,catalog,orders,tenants}.astro — admin pages

This is the canonical voucher implementation. Do not redesign. Read what's there, propagate to franchised Sovereigns, document in docs/FRANCHISE-MODEL.md.


3. Architectural agreements (from the design conversation, durable)

These agreements survive any context compaction and apply to every phase of the work below.

  1. Catalyst-Zero is the existing Contabo deployment. Not greenfield. The work is consolidate + cutover + extend, not rebuild.
  2. omani.works is the first Sovereign-provided subdomain pool (registered to the OpenOva Dynadot account). User dynamically picks omantel.omani.works during provisioning. The wizard offers BYO domain (customer's own) or a Sovereign-pool subdomain (default). Multi-region setups are out of scope for the first run.
  3. Existing admin voucher implementation is the source of truth. Do not propose new CRDs. Read the existing implementation, propagate it to franchised Sovereigns, document it.
  4. G2 quality only. Catalyst-curated wrapper Helm charts at platform/<x>/chart/ for every component in the bootstrap kit. No upstream-as-is shortcuts. No corner-cutting. The unified Blueprint contract from BLUEPRINT-AUTHORING.md §1 is the standard.
  5. No mocks. No iterations. No partial deliveries. Waterfall — every phase produces real, deployed, working artifacts.
  6. All product code is public. Per the build-minutes constraint, code moves to openova/ (the public monorepo) before any further development. CI runs in the public repo from this point onward.
  7. The Vite scaffold at products/catalyst/bootstrap/ui/ merges into core/console/src/pages/sovereign/. It does not become its own deployable.
  8. Sovereign-provisioning wizard target URL: console.openova.io/sovereign. Captured fields include domain (BYO or pool), Hetzner Cloud API token, Hetzner project ID, Hetzner region (runtime parameter, never hardcoded), plus the marketplace-style App selection.
  9. The Hetzner region is a runtime parameter chosen by the wizard user. Never hardcoded anywhere in code.
  10. Dynadot covers all OpenOva domains. The dynadot-api-credentials K8s secret in openova-system is account-scoped and covers openova.io plus omani.works (and any other domain in the same Dynadot account). The secret's domain field can be extended to a list as needed; the API key/secret remain the same.

4. The 8-phase waterfall

Each phase produces one or more commits to openova/. Each commit is real working code, not scaffold. No phase is skipped, abbreviated, or deferred.

Phase 1 — Code consolidation (openova-private → openova)

What: git mv the 4 apps (console, admin, marketplace, marketplace-api) from openova-private to openova/core/. Move 6 CI workflows to openova/.github/workflows/. Move Catalyst-Zero deployment manifests from openova-private/clusters/contabo-mkt/apps/catalyst/ to openova/products/catalyst/chart/templates/.

Outputs:

  • openova/core/{console,admin,marketplace,marketplace-api}/ populated
  • openova/.github/workflows/{catalyst-build,marketplace-api-build,sme-*-build}.yaml
  • openova/products/catalyst/chart/templates/{api-deployment,api-service,ui-deployment,ui-service,ingress}.yaml
  • openova/products/catalyst/chart/Chart.yaml (new)
  • All import paths, image refs (ghcr.io/openova-io/openova/{console,admin,marketplace,marketplace-api,catalyst-api,catalyst-ui}:<sha>) updated
  • VALIDATION-LOG entry: Pass 105

Commit message: feat(consolidation): move Catalyst-Zero apps + CI from openova-private to public monorepo

Phase 2 — Cutover Catalyst-Zero to public-repo build

What: Trigger first public-repo CI run, get :<sha> images into GHCR, roll the existing Contabo deployment to the new images. Catalyst-Zero is now built from the public repo. Delete legacy paths from openova-private (preserved in git history).

Outputs:

  • GHCR images at ghcr.io/openova-io/openova/{console,admin,marketplace,marketplace-api,catalyst-api,catalyst-ui}:<sha>
  • Contabo k3s pods rolled to new image SHAs
  • openova-private cleaned of legacy paths
  • VALIDATION-LOG entry: Pass 106

Commit message: infra(cutover): Catalyst-Zero now built from public repo

Acceptance: kubectl describe pod on each rolled pod shows image: ghcr.io/openova-io/openova/.... Console at console.openova.io still loads. Brief rolling-update window (<60s).

Phase 3 — Sovereign-provisioning wizard

What: Build the wizard at core/console/src/pages/sovereign/ using the Vite scaffold. Replace the legacy 5-step marketplace flow's AppsStep with a unified marketplace card grid (driven by bp-<x> Blueprint surface). Add Sovereign-provisioning-specific fields:

  • Domain: BYO (customer's own domain) or pool selection (default omani.works → user picks subdomain like omantel, acme-bank, etc.)
  • Hetzner Cloud API token (capture, store via ESO into OpenBao, never log)
  • Hetzner project ID
  • Hetzner region (dropdown of valid Hetzner regions; runtime parameter)
  • Sovereign owner email (becomes initial sovereign-admin)
  • Initial App selection (the unified marketplace grid)

Outputs:

  • openova/core/console/src/pages/sovereign/index.astro + sub-pages for each wizard step
  • openova/core/console/src/components/sovereign/{DomainStep,HetznerStep,AppsStep-unified,ReviewStep}.svelte
  • The legacy bootstrap Vite scaffold at openova/products/catalyst/bootstrap/ui/ is merged in and the directory deleted (its content is now part of core/console/)
  • VALIDATION-LOG entry: Pass 107

Commit message: feat(console): sovereign-provisioning wizard at /sovereign with domain + Hetzner inputs + unified marketplace App selection

Phase 4 — Provisioner backend

What: Extend core/marketplace-api/provisioner/ (existing Go module) with Hetzner Cloud API client + OpenTofu invocation + bootstrap-kit installer. Real backend that takes wizard input → calls Hetzner → returns Sovereign provisioning state via SSE.

Outputs:

  • core/marketplace-api/provisioner/hetzner.go — Hetzner Cloud API client (servers, networks, load balancers, firewalls, DNS pointed at Dynadot)
  • core/marketplace-api/provisioner/opentofu.go — invokes OpenTofu modules from infra/hetzner/
  • core/marketplace-api/provisioner/bootstrap.go — orchestrates the 11-component bootstrap kit (Cilium → cert-manager → Flux → Crossplane → Sealed Secrets → SPIRE → JetStream → OpenBao → Keycloak → Gitea → bp-catalyst-platform), in dependency order
  • core/marketplace-api/handlers/provisioner.go — REST endpoints POST /v1/sovereigns, GET /v1/sovereigns/{id}/events (SSE)
  • infra/hetzner/main.tf — OpenTofu module
  • VALIDATION-LOG entry: Pass 108

Commit message: feat(provisioner): real Hetzner Sovereign provisioning end-to-end

Phase 5 — Bootstrap kit Helm charts (G2 quality)

What: Real Catalyst-curated wrapper Helm charts at platform/<x>/chart/ for every bootstrap-kit component. Each chart wraps upstream OSS with Catalyst-specific values, includes a blueprint.yaml per the unified Blueprint contract from BLUEPRINT-AUTHORING.md §1, publishes a bp-<name>:<semver> OCI artifact via CI fan-out.

Components (in dependency order):

  1. platform/cilium/chart/ (CNI must come first)
  2. platform/cert-manager/chart/
  3. platform/flux/chart/ (host-level)
  4. platform/crossplane/chart/
  5. platform/sealed-secrets/chart/ (transient bootstrap-only)
  6. platform/spire/chart/ (the platform/spire/ folder may need to be added — workload identity)
  7. platform/nats-jetstream/chart/ (the platform/nats-jetstream/ folder may need to be added)
  8. platform/openbao/chart/
  9. platform/keycloak/chart/
  10. platform/gitea/chart/
  11. products/catalyst/chart/ — the umbrella bp-catalyst-platform

Outputs:

  • 11 directories with Chart.yaml, values.yaml, templates/, blueprint.yaml, optional compositions/, policies/, overlays/
  • 11 entries in openova/.github/workflows/blueprint-release.yaml (path-matrix CI fan-out)
  • 11 OCI artifacts published at ghcr.io/openova-io/bp-<name>:<semver> after first CI run
  • One commit per chart (11 commits) — incremental review possible
  • VALIDATION-LOG entries: Pass 109 through Pass 119

Commit messages: feat(bp-<name>): G2 Catalyst-curated chart for <name> per BLUEPRINT-AUTHORING contract

Phase 6 — Dynadot extension for omani.works

What: Extend the K8s secret dynadot-api-credentials (in namespace openova-system) so its domain field is a list including both openova.io and omani.works. Add a Crossplane Composition or external-dns config so the provisioner can programmatically write A records *.omantel.omani.works → new Hetzner LB IP.

Outputs:

  • Updated K8s secret (apply via sealed-secrets, not raw kubectl) — the secret update goes through Flux GitOps just like everything else
  • platform/external-dns/policies/dynadot-multi-domain.yaml — config that handles both domains
  • core/marketplace-api/provisioner/dns.go — Dynadot API client extended with multi-domain support
  • VALIDATION-LOG entry: Pass 120

Commit message: feat(dns): Dynadot multi-domain support for omani.works as Sovereign-pool subdomain

Phase 7 — Franchise model docs + voucher propagation

What: Read existing voucher implementation in admin app. Write docs/FRANCHISE-MODEL.md documenting it as canonical. Ensure the new Sovereign at omantel.omani.works has its own admin surface (the same admin app, deployed inside the Sovereign) where omantel-admin can issue vouchers to omantel's tenants. Update GLOSSARY.md with Voucher and Franchisee definitions if not already present.

Outputs:

  • openova/docs/FRANCHISE-MODEL.md — canonical doc
  • Updates to GLOSSARY.md if needed
  • Updates to BUSINESS-STRATEGY.md revenue model if needed
  • VALIDATION-LOG entry: Pass 121

Commit message: docs(franchise): canonical franchise model + voucher propagation, sourced from existing admin impl

Phase 8 — End-to-end provisioning (live demo / DoD)

What: From browser at console.openova.io/sovereign:

  1. User logs in (Keycloak SSO)
  2. Picks "New Sovereign"
  3. Pastes Hetzner Cloud API token + project ID, picks region (any — runtime parameter)
  4. Picks domain: pool → omani.works → user types omantel (creates omantel.omani.works)
  5. Picks initial Apps (unified marketplace selection)
  6. Click Provision
  7. Watches SSE-driven progress for ~10 minutes
  8. Provisioning completes; new Sovereign at omantel.omani.works is reachable
  9. omantel-admin (initial sovereign-admin) logs into console.omantel.omani.works
  10. omantel-admin issues 1 voucher
  11. A fictional customer redeems the voucher at omantel.omani.works/redeem?code=...
  12. Customer's Organization + Environment + first App is created on omantel.omani.works
  13. Customer reaches their App's URL

Acceptance: every step above works without intervention. No mocks, no manual steps beyond the browser clicks.

Outputs:

  • VALIDATION-LOG entry: Pass 122 — DoD documented with screenshots / kubectl evidence
  • Optional: docs/DEMO-RUNBOOK.md for repeatability

5. What this plan does NOT change

  • The unified Application = Gitea Repo model (Pass 103) is preserved everywhere. The franchised Sovereign at omantel.omani.works will use the same model — one Gitea Org per Catalyst Organization, one Gitea Repo per Application.
  • The 5 conventional Gitea Orgs convention (catalog, catalog-sovereign, <org> per Catalyst Organization, system) applies to the new Sovereign exactly as it does to Catalyst-Zero.
  • The component-count anchor (Pass 104 set 53; Pass 105 raised it to 56 with spire + nats-jetstream + sealed-secrets) holds. SeaweedFS unified S3 encapsulation, Guacamole in bp-relay, OpenBao independent-Raft per region — all preserved.
  • The audit procedure stays on-demand (no scheduled loops). The audit-catalyst-docs skill is the only validation entry point.

6. References

  • docs/ARCHITECTURE.md — target architecture (the design Catalyst-Zero is being aligned to)
  • docs/SOVEREIGN-PROVISIONING.md §3 Phase 0 — bootstrap kit dependency order (canonical reference for Phase 5 of this plan)
  • docs/BLUEPRINT-AUTHORING.md §1 — unified Blueprint shape (the contract Phase 5 charts must satisfy)
  • docs/IMPLEMENTATION-STATUS.md — gets updated incrementally as each phase lands (📐🚧)
  • docs/AUDIT-PROCEDURE.md — how to validate after each phase
  • docs/VALIDATION-LOG.md Pass 1104 — historical record; Pass 105+ tracks this plan's execution

Part of OpenOva