Commit Graph

392 Commits

Author SHA1 Message Date
hatiyildiz
b96a03a585 feat(wizard): worker SKU + count selector in topology step
Closes the wizard polish gap "Selecting the shapes of the worker
nodes should be there." StepInfrastructure had a worker SKU + count
selector but was never wired into WizardPage.STEPS — the user walks
through StepTopology, where no sizing controls existed.

Adds a NodeSizingPanel inside StepTopology that:
  • Renders control-plane and worker SKU cards from
    HETZNER_NODE_SIZES (single source of truth — no SKU duplication).
  • Exposes a worker-count stepper and editable spinbutton, clamped
    to the topology-aware floor (0 for solo, 3 for multi-region) and
    a ceiling of 6 to stay inside Hetzner's default project quota.
  • Shows the worker SKU grid only when count > 0.
  • Surfaces a hard validation error when count > 0 but workerSize
    is unset; gates the Topology step's Continue button on the same.

Updates the store's setTopology to seed the worker-count default at
topology-pick time (solo → 0, multi-region → max(current, 3)) so
users land on a sensible default and the existing partialize() rules
keep persisting controlPlaneSize / workerSize / workerCount across
sessions unchanged.
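
The clamp and seeding rules above can be sketched as follows. This is a minimal illustration, assuming a `Topology` union of `"solo" | "multi-region"`; the function names are hypothetical and not taken from the store, only the numeric rules come from this message.

```typescript
// Hypothetical names; only the floor/ceiling/seed values come from the commit message.
type Topology = "solo" | "multi-region";

const WORKER_CEILING = 6; // stays inside Hetzner's default project quota

// Topology-aware floor: 0 for solo, 3 for multi-region.
function workerFloor(topology: Topology): number {
  return topology === "solo" ? 0 : 3;
}

// Clamp a user-entered count into [floor, ceiling].
// Non-finite input (e.g. an empty spinbutton parsed to NaN) falls back to the floor.
function clampWorkerCount(topology: Topology, requested: number): number {
  const floor = workerFloor(topology);
  const n = Number.isFinite(requested) ? Math.trunc(requested) : floor;
  return Math.min(WORKER_CEILING, Math.max(floor, n));
}

// Default applied at topology-pick time: solo resets to 0,
// multi-region keeps the current value if it is already at least 3.
function seedWorkerCount(topology: Topology, current: number): number {
  return topology === "solo" ? 0 : Math.max(current, 3);
}
```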

StepReview now renders three chips inside the Infrastructure section
(control plane, workers, compute-total cost rollup) so the SKU + count
choice is visible at launch time, alongside the per-region cards
that were already there.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 10:25:48 +02:00
hatiyildiz
27527e4ca5 fix(catalyst-api): pin TOFU_WORKDIR to writable /tmp + raise cpu/mem caps
Launch failed instantly with "create workdir: mkdir /var/lib/catalyst:
permission denied". The catalyst-api Pod runs as UID 65534 with emptyDir
mounts only at /tmp and /home/nonroot — /var/lib was never writable, so
the provisioner.New() default for CATALYST_TOFU_WORKDIR
(/var/lib/catalyst/tofu) failed on the very first MkdirAll call.

Three coupled fixes:

- Set CATALYST_TOFU_WORKDIR=/tmp/catalyst/tofu so the per-deployment
  workdir tree lands in the existing /tmp emptyDir.
- Bump cpu limit 100m → 1000m, memory limit 64Mi → 1Gi. tofu init pulls
  ~80MB hcloud + ~30MB dynadot provider plugins; tofu plan/apply hold
  the state file in memory; 64Mi was always going to OOM on first init.
- Grow /tmp emptyDir sizeLimit 256Mi → 2Gi to fit the per-Sovereign
  subdirectory tree (provider binaries + state + plan output).

Manifest-only change — Flux reconciles, kubectl rollout swaps the Pod,
no image rebuild required.
2026-04-29 10:12:44 +02:00
github-actions[bot]
f74e2816f1 deploy: update catalyst images to beefe02 2026-04-29 07:45:25 +00:00
hatiyildiz
beefe0262a merge: #175 — product-family dependency model + transitive-mandatory promotion
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 09:44:11 +02:00
hatiyildiz
0887073735 feat(wizard): #175 — product-family dependency model + transitive-mandatory promotion
Two interlocking fixes for StepComponents per operator feedback (#175):

1. **Transitive-mandatory promotion** (Fix A) — at module-load time walk
   the dependency graph from every mandatory-tier component and promote
   every reached component to mandatory. cnpg + valkey are lifted from
   recommended → mandatory because Harbor / Gitea / PowerDNS / Keycloak
   (mandatory or transitively mandatory) cannot run without them. They
   no longer surface in Tab 1 ("Choose Your Stack"); they appear in Tab 2
   ("Always Included") under the FABRIC product section.

2. **Product-family model** (Fix B) — new `Product` type in
   `componentGroups.ts` with `tier`, `components`, `familyDependencies`,
   and `cascadeOnMemberSelection`. CORTEX is flagged as
   cascade-on-member-selection (operator: "BGE alone doesn't have much
   meaning unless we have Cortex... when chosen the entire family needs
   to be selected"). Selecting any CORTEX member or Specter (whose deps
   reach into CORTEX) cascades the rest of CORTEX plus FABRIC (CORTEX's
   familyDependency). À-la-carte products (FABRIC, RELAY) keep
   independent member selection.
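
The transitive-mandatory promotion in Fix A amounts to a graph walk from every mandatory component. A minimal sketch, assuming a simplified `Component` shape with a `dependencies` list; the real types in `componentGroups.ts` may differ, and `promoteTransitiveMandatory` is a hypothetical name:

```typescript
// Simplified, assumed shape; not the actual componentGroups.ts types.
type Tier = "mandatory" | "recommended" | "optional";

interface Component {
  id: string;
  tier: Tier;
  dependencies: string[];
}

// Breadth-first walk from every mandatory component; everything reached becomes mandatory.
function promoteTransitiveMandatory(components: Component[]): Component[] {
  const byId = new Map<string, Component>(
    components.map((c): [string, Component] => [c.id, c]),
  );
  const reached = new Set<string>();
  const queue: string[] = components
    .filter((c) => c.tier === "mandatory")
    .map((c) => c.id);

  while (queue.length > 0) {
    const id = queue.shift() as string;
    if (reached.has(id)) continue; // also guards against dependency cycles
    reached.add(id);
    const deps = byId.get(id)?.dependencies ?? [];
    for (const dep of deps) {
      if (!reached.has(dep)) queue.push(dep);
    }
  }

  return components.map((c): Component =>
    reached.has(c.id) ? { ...c, tier: "mandatory" } : c,
  );
}
```

Running the walk once at module load keeps the promoted tiers derived from the dependency data, so a component cannot be mandatory in one place and recommended in another.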

UX additions:
- Product header per family in Tab 1 with "Select entire X family" CTA
  (selectable via product-cta-<id> testid)
- Cascade-add toast surfaces both component-deps and family additions
- Cascade-remove confirmation modal lists every dependent that will go
- All operator-visible strings sourced from new
  `stepComponentsCopy.ts` i18n module — no inline literals in JSX

Store actions: `addProduct(id)` / `removeProduct(id)` plus a
member-selection cascade in `addComponent` that respects the product
flag. Mandatory components are protected from any cascade-remove path.
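
A rough sketch of the member-selection cascade, using the four `Product` fields named above. Everything else is an assumption for illustration: `cascadeSelection` is a hypothetical helper, `tier` is typed loosely as a string, and the Specter case (component-level dependencies reaching into CORTEX) is not modelled.

```typescript
// Field names follow the commit message; types and the helper are assumptions.
interface Product {
  id: string;
  tier: string;
  components: string[];
  familyDependencies: string[];
  cascadeOnMemberSelection: boolean;
}

// Returns the new selection after the user picks one component.
// If the owning product cascades, pull in the whole family plus its family dependencies.
function cascadeSelection(
  products: Product[],
  selected: ReadonlySet<string>,
  componentId: string,
): Set<string> {
  const next = new Set<string>(selected);
  next.add(componentId);

  const owner = products.find((p) => p.components.includes(componentId));
  if (!owner || !owner.cascadeOnMemberSelection) return next; // à-la-carte product

  const byId = new Map<string, Product>(
    products.map((p): [string, Product] => [p.id, p]),
  );
  const pending: string[] = [owner.id];
  const seen = new Set<string>();
  while (pending.length > 0) {
    const id = pending.pop() as string;
    if (seen.has(id)) continue;
    seen.add(id);
    const product = byId.get(id);
    if (!product) continue;
    for (const c of product.components) next.add(c);
    pending.push(...product.familyDependencies);
  }
  return next;
}
```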

Documentation: `docs/PRODUCT-FAMILIES.md` describes the dependency
model, every product entry, and worked examples (Specter, BGE, Harbor,
ClickHouse).

Vitest: 43 new test cases including transitive-promotion verification,
cross-product cascade, product CTA flow, and i18n wiring. All 146
tests pass; typecheck + build green.

Closes #175.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 09:43:00 +02:00
hatiyildiz
04559e5c37 docs(reconcile-pass-1): align docs with ground truth at dd578d1c
Reconcile Pass 1 — first holistic LLM-driven reconciliation pass per
~/.claude/skills/reconcile-catalyst-docs/SKILL.md. Skill triggered after
the post-Group-M architectural batch (#161, #162, #163, #167, #168,
#169, #170, #171, #173, #174, #175). Live ground truth verified against
kubectl + ls platform/ + git log + GHCR + componentGroups.ts.

Drift categories fixed:

- A. Numerical: bp-powerdns 1.0.5 → 1.0.6; component-logos 63 → 62
  (powerdns SVG missing, tracked under #173); bootstrap kit 11 → 12
  with bp-powerdns added per #167.
- B. Service: pool-domain-manager + 5 registrar adapters
  (Cloudflare/Namecheap/GoDaddy/OVH/Dynadot, #170) added to
  IMPLEMENTATION-STATUS, ARCHITECTURE, PLATFORM-TECH-STACK, GLOSSARY,
  and PROVISIONING-PLAN; bp-powerdns added to ARCHITECTURE bootstrap
  kit + Catalyst-on-Catalyst dependency tree.
- C. Architectural: SOVEREIGN-PROVISIONING §3 + DEMO-RUNBOOK Step 4
  + ORCHESTRATOR-STATE Step 6 rewritten from Dynadot-direct DNS writes
  to PowerDNS authoritative + PDM /v1/commit + registrar-adapter
  NS-flip; PROVISIONING-PLAN Phase 4 paths corrected to
  products/catalyst/bootstrap/api/ (per INVIOLABLE-PRINCIPLES #3 the
  Go provisioner does NOT call cloud APIs); Phase 6 retitled and
  rewritten for the new DNS architecture.
- D. Process: RUNBOOK-PROVISIONING §2 wizard-step table + DEMO-RUNBOOK
  Step 2 wizard-step table updated to canonical 7-step ordering
  (Org → Domain → Topology → Provider → Credentials → Components →
  Review per WIZARD_STEPS in WizardLayout.tsx, post #169 + #174); the
  three-mode StepDomain (pool / byo-manual / byo-api per #169) and
  two-tab StepComponents (mandatory infra + apps per #161/#162/#175)
  now documented.
- E. Cross-doc: Group G  across PROVISIONING-PLAN +
  ORCHESTRATOR-STATE (superseded by #167+#163+#170, not by the
  original Dynadot-multi-domain plan); Group C  in
  PROVISIONING-PLAN (Flux is reconciling from openova-public today);
  README Stack-at-a-glance DNS row expanded.
- F. Stale terminology: 11-grep banned-terms scan clean — every k8gb
  residual is a legitimate "removed at #171, replaced by lua-records"
  reference.

VALIDATION-LOG.md gains the Reconcile Pass 1 entry per skill spec.
Reconcile-skill numbering is independent of the Audit-skill numbering
(which continues at Pass 108+).

Files: 13 docs + VALIDATION-LOG entry.
Escalations: none.
2026-04-29 09:40:10 +02:00
github-actions[bot]
c83171805c deploy: update catalyst images to 2e6cfd7 2026-04-29 07:20:21 +00:00
hatiyildiz
2e6cfd79c3 merge: #173 — fix wizard component-card logos under /sovereign/ base
Squashed-via-no-ff: #173 root-cause fix for absolute logo URLs that
ignored the Vite base. componentGroups.ts now derives every logo path
from `import.meta.env.BASE_URL` via `path()` so the URL stays in sync
with vite.config.ts. Adds CI smoke step that curls the logos to fail
the build on any missing/mis-cased SVG, plus Vitest coverage for the
letter-mark fallback path.
2026-04-29 09:19:18 +02:00
hatiyildiz
d382d99e45 fix(catalyst-ui): #173 — wizard component logos render under /sovereign/ base
Root cause: componentGroups.ts hardcoded `/component-logos/<id>.svg`. The
catalyst-ui SPA is served at the Vite base `/sovereign/`, so the browser
fetches `/component-logos/...` (no prefix), which Traefik routes to the
website ingress, not catalyst-ui — every logo 404'd and the IconFallback
letter avatar took over for all 63 cards.

Fix: derive logo URLs from `path()` in shared/config/urls.ts, which reads
`import.meta.env.BASE_URL`. Vite injects the base at build time
(`/sovereign/` in prod, `/` in dev/test) so the URL stays in sync with
`vite.config.ts` and the ingress without any hardcoded prefix
(INVIOLABLE PRINCIPLE #4).
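
As an illustration of the fix, the sketch below joins a base with a relative asset path. The helper names `joinBase` and `logoUrl` are hypothetical (the actual `path()` signature in shared/config/urls.ts is not shown here), and the base is passed as a parameter so the snippet runs outside Vite; in the SPA it would come from `import.meta.env.BASE_URL`.

```typescript
// Hypothetical helpers; in the app the base would be import.meta.env.BASE_URL.
function joinBase(base: string, relative: string): string {
  const normalizedBase = base.endsWith("/") ? base : base + "/";
  // Drop leading slashes so an absolute-looking path cannot escape the base.
  return normalizedBase + relative.replace(/^\/+/, "");
}

function logoUrl(base: string, componentId: string): string {
  return joinBase(base, `component-logos/${componentId}.svg`);
}
```

With a base of `/sovereign/` the logo request stays under the prefix that routes to catalyst-ui; with `/` (dev/test) it resolves to the root path as before.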

Also:
- powerdns.svg was never vendored — set logoUrl: null so the wizard
  renders the letter-mark fallback for that one card by design.
- Add Vitest coverage for the null-logoUrl fallback path (PowerDNS).
- Add CI smoke step that asserts /component-logos/<id>.svg returns 200
  for 11 representative components so a missing or mis-cased vendored
  SVG fails the build, not the user.
- Document the logo path convention in a docblock at the top of
  componentGroups.ts so future devs can't reintroduce a hardcoded path.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 09:18:50 +02:00
github-actions[bot]
dd578d1c13 deploy: update catalyst images to 7a5e5db 2026-04-29 07:16:20 +00:00
Emrah Baysal
7a5e5db9ba merge: hoist wizard step indicator into page header (#174)
Brings fix/wizard-step-header to main. Wizard's 7-step progress
indicator now lives in a single 56px page-header band (alongside the
OpenOva brand mark + theme/exit actions), matching the nova/core console
chrome convention. Step body reclaims the vertical real estate. New
vitest suite asserts the header layout, the 7 step indicators, the
active-step class, and the mobile fallback.

Closes #174.
2026-04-29 09:15:06 +02:00
hatiyildiz
dbf37e1ba5 fix(catalyst-wizard): hoist 7-step indicator into page header (#174)
The wizard's progress stepper used to live inside `.corp-main` (the step
body region). Nova / core console renders all chrome in a top header
band, so the wizard now does the same:

  - Single 56px-tall sticky `<header data-testid="wizard-header">` band
    hosts brand mark + 7-step indicator + theme/exit actions
  - Step indicator carries `data-testid="wizard-stepper"` and exposes
    `wizard-step-{1..7}` testids; the active step gets `.active` and
    `aria-current="step"`, completed steps get `.done`
  - At ≤1024px the per-step labels collapse, ≤720px the dotted indicator
    hides and a "Step X of Y · <Label>" string takes over
  - All dimensions/colors come from the wizard's `--wiz-*` token set
    (per Inviolable Principle #4 — never hardcode); the 56px height
    matches nova's Sidebar.svelte logo row (`h-14`) and the border-bottom
    uses the shared `--wiz-border` token

Vitest covers: header presence, brand mark, exactly 7 step indicators,
active/done class application, mobile-collapsed indicator, and the
absence of a duplicate stepper inside the step body.

Closes #174.
2026-04-29 09:14:37 +02:00
github-actions[bot]
40805334a8 deploy: update catalyst images to 194b0ee 2026-04-29 07:04:00 +00:00
e3mrah
194b0ee413
Merge pull request #172 from openova-io/feat/wizard-byo-domain
feat(wizard): #169 — StepDomain three-mode (pool / byo-manual / byo-api)
2026-04-29 11:02:21 +04:00
hatiyildiz
20f5dca902 feat(wizard): #169 — StepDomain three-mode (pool / byo-manual / byo-api)
Closes openova#169.

Wizard UI:
- New StepDomain.tsx with three radio modes (pool / BYO manual NS / BYO
  registrar API). Pool flow unchanged from #163. BYO-manual surfaces the
  three OpenOva nameservers (ns1-3.openova.io) verbatim with copy buttons.
  BYO-api adds a registrar dropdown (Cloudflare, Namecheap, GoDaddy, OVH,
  Dynadot) + token field + Validate button — read-only validation hits
  /api/v1/registrar/{r}/validate before Next is enabled.
- StepOrg trimmed to org-only fields (domain capture moved to StepDomain).
- WizardPage + WizardLayout add the new "Domain" step (now 7 steps total).

Wizard store:
- DomainMode expanded to 'pool' | 'byo-manual' | 'byo-api' with legacy
  'byo' coerced to 'byo-manual' on rehydrate.
- New fields: registrarType (RegistrarType | null), registrarToken,
  registrarTokenValidated.
- partialize() strips registrarToken + registrarTokenValidated from
  localStorage (credential hygiene per docs/INVIOLABLE-PRINCIPLES.md #10).
- setSovereignDomainMode cascades a clean reset of irrelevant fields.
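
The persistence rules in this list can be sketched as below. This is an assumed, cut-down state shape: the field `sovereignDomainMode`, the helper `coerceDomainMode`, the choice to persist `registrarType`, and the fallback to `"pool"` for unknown values are illustrative guesses; only the stripped fields and the `'byo'` to `'byo-manual'` coercion come from this message.

```typescript
type DomainMode = "pool" | "byo-manual" | "byo-api";

// Cut-down, assumed slice of the wizard store.
interface WizardDomainState {
  sovereignDomainMode: DomainMode;
  registrarType: string | null;
  registrarToken: string;
  registrarTokenValidated: boolean;
}

type PersistedDomainState = Pick<WizardDomainState, "sovereignDomainMode" | "registrarType">;

// Build the persisted slice explicitly so credentials can never leak via a spread.
function partialize(state: WizardDomainState): PersistedDomainState {
  return {
    sovereignDomainMode: state.sovereignDomainMode,
    registrarType: state.registrarType,
  };
}

// Rehydrate-time coercion of the legacy 'byo' value.
function coerceDomainMode(raw: string): DomainMode {
  if (raw === "byo") return "byo-manual";
  if (raw === "pool" || raw === "byo-manual" || raw === "byo-api") return raw;
  return "pool"; // assumption: unknown values fall back to the pool flow
}
```

Listing the persisted keys explicitly, instead of deleting the sensitive ones, means a credential field added later stays out of localStorage unless someone opts it in.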

PDM (core/pool-domain-manager):
- New endpoint POST /api/v1/registrar/{registrar}/validate — read-only
  twin of /set-ns. Calls adapter.ValidateToken; never flips NS records.
  Maps registrar errors to canonical HTTP statuses (401/403/429/502).
  Token never enters a logged struct.

catalyst-api (products/catalyst/bootstrap/api):
- New handler/registrar.go — thin proxy that forwards
  /api/v1/registrar/{r}/{validate|set-ns} to PDM's matching endpoint,
  reading the body once and streaming PDM's response status + body
  verbatim so the wizard's error-mapping vocabulary stays consistent.

Tests:
- StepDomain.test.tsx — 18 vitest cases covering all three modes,
  mode-switch field cleanup, validate fetch happy/error paths, token
  invalidation on edit.
- store.test.ts — wizard-store mutations + persist hygiene.
- StepSuccess.test.tsx — fixture updated 'byo' -> 'byo-manual'.
- registrar_test.go (PDM) — 7 new test cases for /validate covering
  happy, invalid-token, domain-not-in-account, unsupported-registrar,
  missing-fields, bad-JSON, response-doesnt-leak-token.

103 vitest cases pass. Go tests pass for both PDM and catalyst-api.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 09:01:07 +02:00
github-actions[bot]
f06b6d9ce7 deploy: update catalyst images to 67fdecb 2026-04-29 06:52:27 +00:00
hatiyildiz
67fdecb770 merge: remove k8gb (#171) 2026-04-29 08:51:21 +02:00
hatiyildiz
f5daac52af refactor(platform): remove k8gb — replaced by PowerDNS lua-records (#171)
PowerDNS lua-records (`ifurlup`, `pickclosest`, `ifportup`) cover everything
k8gb was doing — geo-aware response selection, health-checked failover,
weighted round-robin — at the authoritative DNS layer. Eliminates a
separate K8s controller, CRD set, and CoreDNS plugin from every Sovereign.

Changes:
- platform/k8gb/ deleted (Chart.yaml, values.yaml, blueprint.yaml never
  authored — only README existed)
- products/catalyst/bootstrap/ui/public/component-logos/k8gb.svg deleted
- componentGroups.ts: remove k8gb component (PowerDNS already there)
- componentLogos.tsx: drop logo_k8gb + k8gb map entry
- model.ts DEFAULT_COMPONENT_GROUPS spine: replace k8gb with powerdns
- StepInfrastructure.tsx: copy refers to PowerDNS lua-records, not k8gb
- provision.html: replace k8gb tile and edges with powerdns
- catalog.generated.ts regenerated (now includes bp-powerdns)
- docs sweep — every k8gb reference in PLATFORM-TECH-STACK, NAMING-
  CONVENTION, SOVEREIGN-PROVISIONING, SRE, ARCHITECTURE, GLOSSARY,
  COMPONENT-LOGOS, IMPLEMENTATION-STATUS, BUSINESS-STRATEGY,
  TECHNOLOGY-FORECAST, README, infra/hetzner/README, platform READMEs
  (cilium, external-dns, failover-controller, litmus, flux, opentofu)
  rewritten to point at PowerDNS lua-records / MULTI-REGION-DNS.md.
  Historical entries in VALIDATION-LOG.md preserved as audit trail.
- New docs/MULTI-REGION-DNS.md — canonical reference for the lua-record
  patterns (ifurlup all/pickclosest/pickfirst, ifportup, pickwhashed),
  Application Placement → lua-record selector mapping, when to add a
  second Sovereign region, operational checks.

Closes #171.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 08:51:09 +02:00
hatiyildiz
6e9b9fe8a3 merge: bp-powerdns 1.0.6 — gpgsql-dnssec=yes (openova#168 followup)
Fixes the 422 "no DNSSEC-capable backends loaded" surfaced when PDM
tried to enable DNSSEC on parent pool zones at startup.
2026-04-29 08:42:27 +02:00
hatiyildiz
f4679e2748 fix(powerdns): enable gpgsql-dnssec for DNSSEC API (1.0.6)
Without `gpgsql-dnssec=yes` the gpgsql backend driver does not expose
the DNSSEC API surface — `PUT /zones/<zone>` with `dnssec:true` returns
422 "no DNSSEC-capable backends are loaded". This blocks pool-domain-
manager from enabling DNSSEC on every Sovereign child zone (mandatory
per docs/PLATFORM-POWERDNS.md).

Fix lands in additionalConfig so the directive is rendered alongside
`default-soa-edit-signed=INCEPTION-EPOCH` and `direct-dnskey=yes`. No
schema migration needed — the gpgsql 5.0.3 schema already includes the
cryptokeys table; the missing piece was just the backend feature flag.

Bumps Chart.yaml to 1.0.6. Verified: after this lands the PUT call
returns 204 and POST /cryptokeys mints a usable KSK.

Discovered while bringing up openova#168 (PDM per-Sovereign zones).
2026-04-29 08:42:18 +02:00
hatiyildiz
f777394367 merge: PDM per-Sovereign PowerDNS zones (openova#168)
PDM /reserve now creates a per-Sovereign child zone in PowerDNS with
apex NS RRset + adds NS delegation into the parent pool zone +
enables DNSSEC. /commit writes the canonical 6-record set into the
child zone (atomic PATCH). /release drops the child zone and removes
the parent NS delegation.

Includes pdns client (22 tests), allocator with DNSWriter interface
(fake-DNS state-machine tests), startup parent-zone bootstrap, and
trimmed dynadot package (now config helpers only — registrar adapter
under internal/registrar/dynadot/ untouched for #170 BYO Flow B).
2026-04-29 08:37:07 +02:00
hatiyildiz
a6fb7410f4 feat(pdm): per-Sovereign PowerDNS zones for #168
Refactor pool-domain-manager to own per-Sovereign zones in PowerDNS,
replacing the previous Dynadot-set_dns2 record-write flow.

Phase 1 — internal/pdns: REST client for PowerDNS Authoritative API
  - CreateZone / DeleteZone / EnsureZone / ZoneExists
  - PatchRRSets (atomic batch RRset writes)
  - AddARecord / AddNSDelegation / RemoveNSDelegation
  - EnableDNSSEC: PUT dnssec flag, generate KSK+ZSK (algorithm 13
    ECDSAP256SHA256 per docs/PLATFORM-POWERDNS.md), POST rectify
  - retry-once-on-5xx with exponential backoff (250ms, 1s)
  - X-API-Key header from K8s Secret, never logged
  - 22 unit tests covering every method against httptest mock

Phase 2 — allocator: DNSWriter interface + per-Sovereign lifecycle
  - /reserve: insert pdm-pg row + create child zone with apex NS
    RRset + add NS delegation into parent + enable DNSSEC on child
  - /commit: write the canonical 6-record set (apex, *, console,
    api, gitea, harbor) into child zone, TTL 300, atomic PATCH
  - /release: drop child zone (DNSSEC keys retire) + remove parent
    NS delegation, idempotent on 404
  - sweeper tears down DNS for expired reservations before deleting
    pdm-pg rows
  - rollback path on Reserve failure preserves operator UX
  - allocator_test.go: fake DNSWriter for state-machine assertions
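
For illustration, the /commit payload can be sketched as a single RRset batch in the shape the PowerDNS Authoritative API accepts on PATCH /zones/{zone}. The sketch is TypeScript for readability and is not the Go client. It assumes all six records are A records pointing at one IPv4 address, which this message does not state; `buildCommitRRSets` and `childZone` are hypothetical names.

```typescript
// One entry of a PowerDNS PATCH body, restricted to what this sketch needs.
interface RRSet {
  name: string;
  type: "A";
  ttl: number;
  changetype: "REPLACE";
  records: { content: string; disabled: boolean }[];
}

// The canonical 6-record set; "" stands for the zone apex.
const COMMIT_LABELS = ["", "*", "console", "api", "gitea", "harbor"];

// Child zone name is derived, not stored: <subdomain>.<poolDomain> with a trailing dot,
// because PowerDNS expects fully qualified names.
function childZone(subdomain: string, poolDomain: string): string {
  return `${subdomain}.${poolDomain}.`;
}

// Assumption: every record is an A record for the same address.
function buildCommitRRSets(
  subdomain: string,
  poolDomain: string,
  ipv4: string,
): { rrsets: RRSet[] } {
  const zone = childZone(subdomain, poolDomain);
  const rrsets = COMMIT_LABELS.map((label): RRSet => ({
    name: label === "" ? zone : `${label}.${zone}`,
    type: "A",
    ttl: 300,
    changetype: "REPLACE",
    records: [{ content: ipv4, disabled: false }],
  }));
  return { rrsets };
}
```

Sending all six RRsets in one PATCH is what makes the write atomic from the caller's side: the zone either gets the whole set or none of it.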

Phase 3 — startup parent-zone bootstrap
  - BootstrapParentZones runs at PDM startup before HTTP serves
  - EnsureZone for every entry in DYNADOT_MANAGED_DOMAINS
  - DNSSEC enabled on each parent zone (idempotent)
  - PDM exits non-zero if bootstrap fails

Phase 4 — schema unchanged
  - child zone name derived as <subdomain>.<poolDomain>, no new column
  - existing pool_allocations table works as-is

Phase 5 — dynadot package trimmed
  - removed AddSovereignRecords / DeleteSubdomainRecords / AddRecord /
    getZone / writeZone (Dynadot DNS write code)
  - kept IsManagedDomain / ManagedDomains / ResetManagedDomains /
    ErrUnmanagedDomain (config-resolution helpers)
  - registrar adapter at internal/registrar/dynadot/ untouched (handles
    BYO Flow B NS-delegation via #170)

Phase 6 — env-var contract
  PDM_PDNS_BASE_URL, PDM_PDNS_API_KEY, PDM_PDNS_SERVER_ID, PDM_NAMESERVERS
  all runtime-configurable per docs/INVIOLABLE-PRINCIPLES.md #4.

Quality bar (all met):
  - DNSSEC enabled on every child zone (mandatory per spec)
  - parent NS delegation TTL 3600, child A-record TTL 300
  - retry-once-on-5xx with exponential backoff in pdns client
  - all credentials flow from env vars sourced from K8s Secrets
  - no hardcoded URLs, regions, or NS endpoints

Closes openova#168 (DNS-side; private-repo manifest update lands separately).
2026-04-29 08:36:45 +02:00
hatiyildiz
22d430eaa8 merge: bp-powerdns 1.0.5 (postInitSQL syntax fix, openova#167)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 08:17:34 +02:00
hatiyildiz
fa84cac438 fix(powerdns): plain ALTER TABLE in postInitSQL (avoid $$ escape battle, 1.0.5)
The DO block in 1.0.4 rendered with $$ collapsed to $ by the time it
reached CNPG's postInitApplicationSQL — "syntax error at or near $".
Both Helm template processing and the YAML scalar block were chewing on
the dollar signs.

Replaced with explicit ALTER TABLE statements (one per gpgsql table) +
GRANT — same end state, no PL/pgSQL quoting required. Verified at
runtime on contabo-mkt: powerdns Pod went CrashLoopBackOff →
Running 1/1 immediately after the ALTER was run by hand.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 08:17:28 +02:00
hatiyildiz
30f3015dc8 merge: bp-powerdns 1.0.4 (CNPG ownership fix, openova#167)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 08:14:18 +02:00
hatiyildiz
214a3e1ada fix(powerdns): grant table ownership to pdns user in CNPG bootstrap (1.0.4)
Verified at runtime on contabo-mkt: postInitApplicationSQL runs as the
postgres superuser, not the application owner, so the schema tables
created by the bootstrap block were owned by postgres. PowerDNS connects
as 'pdns' and got 'permission denied for table domains' on the first
SELECT against the zone cache.

Added a DO block at the end of the schema bootstrap that walks every
table in the public schema and ALTERs OWNER TO {{ .Values.postgres.cluster.owner }}
plus GRANT ALL PRIVILEGES ON SCHEMA public — same shape PDM uses (and
the contabo-mkt cluster verified the fix at runtime: powerdns Pod went
from CrashLoopBackOff to 1/1 Ready immediately after the same DDL was
run by hand).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 08:14:12 +02:00
hatiyildiz
036dc39800 merge: bp-powerdns 1.0.3 (dnsdist backend env-injection + table ownership, openova#167)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 08:13:54 +02:00
hatiyildiz
db20e9d42b fix(powerdns): dnsdist backend resolution + drop DnstapLogAction (1.0.3)
dnsdist 1.9.14 runtime errors:
  1. newServer{address='powerdns:5353'} → "Unable to convert presentation
     address" — dnsdist's address parser expects IP[:port], not a DNS
     name. Kubernetes auto-injects POWERDNS_SERVICE_HOST as an env var
     into every pod in the same namespace as the powerdns Service; using
     that gives us the ClusterIP at config-load time without needing an
     init container or runtime DNS resolution.
  2. DnstapLogAction(name, bool, fn) signature changed in 1.9 — the
     2nd parameter now expects a shared_ptr to a RemoteLoggerInterface,
     not a boolean. Rather than wire up a remote dnstap server (which
     adds a moving part for marginal observability gain), drop the line.
     Catalyst observability is the dnsdist /metrics endpoint surfaced
     to Prometheus + the k8s container log.
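
The env var used in item 1 follows the Kubernetes naming rule for Service environment variables: the Service name is upper-cased, dashes become underscores, and `_SERVICE_HOST` is appended. A small sketch of that rule (the function name is hypothetical):

```typescript
// Derives the Kubernetes-injected host variable name for a Service.
function serviceHostEnvVar(serviceName: string): string {
  return serviceName.toUpperCase().replace(/-/g, "_") + "_SERVICE_HOST";
}
```

Kubernetes only injects these variables for Services that already exist when the pod starts, so this approach relies on the powerdns Service being created before the dnsdist pod.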

Bumped chart to 1.0.3.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 08:12:27 +02:00
hatiyildiz
790fc7efb0 merge: bp-powerdns 1.0.2 (dnsdist tag + RO rootfs fix, openova#167)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 08:06:47 +02:00
hatiyildiz
20c0543806 fix(powerdns): correct dnsdist image tag + drop readOnlyRootFilesystem (1.0.2)
Two runtime issues caught during first contabo-mkt rollout:

1. dnsdist image tag was "1.9" (default) — that tag doesn't exist in
   docker.io/powerdns/dnsdist-19. The 1.9.x line publishes 1.9.0 .. 1.9.14
   (no rolling "1.9" alias). Pinned to 1.9.14 (current latest).

2. PowerDNS pod crash-looped on Errno 30 (Read-only file system:
   /etc/powerdns/pdns.d/0-api.conf.conf). The upstream pdns_server-startup
   script writes rendered config files to /etc/powerdns/pdns.d/ at
   container start, and the upstream template doesn't expose an emptyDir
   we could redirect that path to. Set readOnlyRootFilesystem=false with
   a verbose comment explaining why; the rest of the security context
   (runAsNonRoot, runAsUser=953, drop ALL caps) stays in place.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 08:06:39 +02:00
hatiyildiz
134b3fbedf merge: bp-powerdns 1.0.1 (dnsdist checksum fix, openova#167)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 08:03:00 +02:00
hatiyildiz
19d926bfeb fix(powerdns): avoid recursive include in dnsdist checksum, bump to 1.0.1
Helm flagged dnsdist.yaml's checksum/config annotation as a recursive
template self-reference (the file included itself). Replaced with a
hash of the rendered .Values.dnsdist.config (post-tpl), which is the
substantive content the annotation is supposed to track anyway.

Bumped Chart.yaml to 1.0.1 so the OCIRepository semver "1.x" picks
up the fix automatically on next reconcile. Blueprint API version stays
at 1.0.0 (Blueprint contract is unchanged).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 08:02:53 +02:00
hatiyildiz
e3a006bc6f merge: bp-powerdns wrapper + per-Sovereign zone model (closes #167 phases 1-3)
Closes #167 (public-repo phases). Cluster manifest deploy in
openova-private feat/powerdns-deploy.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 07:50:16 +02:00
hatiyildiz
0190c60520 feat(powerdns): bp-powerdns wrapper chart + per-Sovereign zone model (#167)
Introduces the bp-powerdns Catalyst Blueprint wrapper as the authoritative
DNS service for every Sovereign zone. Replaces k8gb in componentGroups.ts —
PowerDNS Lua records cover geo + health-checked failover natively, removing
the dedicated GSLB controller.

Wrapper chart (platform/powerdns/chart/):
  - Chart.yaml — bp-powerdns 1.0.0, depends on pschichtel/powerdns 0.10.0
    upstream (verified Artifact Hub publisher, tracks docker.io/powerdns/
    pdns-auth-50 at appVersion 5.0.3 — surveyed Artifact Hub, no official
    PowerDNS chart exists)
  - values.yaml — 3 replicas, gpgsql backend, DNSSEC ECDSAP256SHA256,
    lua-records ON, dnsdist 100 qps default per source IP, REST API at
    pdns.openova.io/api behind Traefik basicAuth
  - blueprint.yaml — Catalyst metadata, visibility=unlisted (mandatory
    infra), section pts-3-2-gitops-and-iac
  - templates/cnpg-cluster.yaml — separate `pdns-pg` Postgres (1 instance,
    5Gi, postgres-16) with PowerDNS auth-5.0.3 schema applied via
    postInitApplicationSQL
  - templates/dnsdist.yaml — companion Deployment + ConfigMap with
    rate-limiting policy (MaxQPSIPRule per source IP)
  - templates/api-ingress.yaml — Traefik Ingress + basicAuth Middleware
  - templates/anycast-endpoint.yaml — placeholder Service of type
    LoadBalancer (Phase-0 stand-in for the anycast Floating IP target state)
  - templates/crossplane-floatingip.yaml — DISCLOSED GAP: target-state
    XHetznerFloatingIP composite, disabled by default until the
    Crossplane composition is authored (the existing compositions cover
    Server/Network/Firewall/LoadBalancer/PoolAllocation only). The
    placeholder anycast Service is the operational stand-in.

Per docs/INVIOLABLE-PRINCIPLES.md:
  - #4 (never hardcode): every value flows from values.yaml or a
    referenced K8s Secret. Image tags come from upstream chart appVersion,
    never duplicated.
  - #8 (disclose every divergence): the XHetznerFloatingIP gap is
    documented in the template + in docs/PLATFORM-POWERDNS.md ("Anycast
    deferral" section).

componentGroups.ts: powerdns added to SPINE group as mandatory (depends on
cnpg). external-dns now lists powerdns as a dependency. k8gb removed.

docs/PLATFORM-POWERDNS.md: per-Sovereign zone model, DNSSEC posture, REST
API contract, lua-records GSLB pattern, dnsdist policy, anycast deferral
runbook, first-deploy procedure for Contabo-mkt.

Closes #167 (Phase 1 of public-repo work; Phase 4 cluster manifest lands
in openova-private feat/powerdns-deploy).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 07:49:51 +02:00
github-actions[bot]
eee3e1ec9e deploy: update catalyst images to 8a636f9 2026-04-29 05:48:55 +00:00
hatiyildiz
8a636f9f26 merge: registrar adapters for BYO Flow B (closes #170)
Cloudflare, Namecheap, GoDaddy, OVH, Dynadot adapters under a shared
Registrar interface; new POST /api/v1/registrar/{registrar}/set-ns
endpoint on PDM. 74 new unit tests; token never logged or persisted.
2026-04-29 07:46:52 +02:00
hatiyildiz
567d7e1f60 feat(pdm): registrar adapters for Cloudflare, Namecheap, GoDaddy, OVH, Dynadot (#170)
Adds the BYO Flow B (#166) registrar-flip seam: PDM now exposes a
provider-agnostic Registrar interface and 5 adapter implementations
plus a new HTTP endpoint that dispatches to them.

Wire surface
- POST /api/v1/registrar/{registrar}/set-ns
  Body: {"domain":"...","token":"...","nameservers":["..."]}
  Reply: {"success":true,"registrar":"...","domain":"...",
          "nameservers":["..."],"propagation":"..."}
- GET /healthz now lists the wired-in registrar names.

Interface (internal/registrar/registrar.go)
- Name(), ValidateToken, SetNameservers, GetNameservers
- Typed errors: ErrInvalidToken, ErrRateLimited, ErrDomainNotInAccount,
  ErrAPIUnavailable, ErrUnsupportedRegistrar
- Registry map[string]Registrar with Lookup + Names helpers

Adapters
- internal/registrar/cloudflare/  — API v4 with Bearer token; verifies
  via /user/tokens/verify, looks up zone by name, PATCHes name_servers
- internal/registrar/namecheap/   — XML API; ApiUser+ApiKey+UserName+
  ClientIp auth; getBalances probe + getList domain check; setCustom
  for write. IP-whitelisting requirement documented in source comments
- internal/registrar/godaddy/     — v1 API with sso-key auth;
  GET /v1/domains list + PATCH /v1/domains/{d} with nameServers body
- internal/registrar/ovh/         — request signing (HMAC-SHA1 over
  appSecret+consumerKey+method+url+body+timestamp); GET /domain probe;
  POST /domain/{d}/nameServers/update for write; GET .../nameServer[/{id}]
  for read
- internal/registrar/dynadot/     — api3.json with key+secret as colon-
  separated token; uses set_ns + domain_info commands. Distinct from
  the existing internal/dynadot package which is the DNS-record writer
  for OpenOva-managed pool domains (different concern: pool DNS vs.
  customer-domain registrar NS-flip)

Token hygiene (per docs/INVIOLABLE-PRINCIPLES.md #10)
- Tokens never persisted: in-memory only for the duration of the call
- Never logged: handler uses classifyOutcome to render redacted
  outcome labels, never the raw error message or token
- Never echoed in responses
- TestSetNSResponseDoesNotEchoToken + TestSetNSHappy assert no token
  bytes appear in JSON body or zerolog/slog output

Tests
- 74 new unit tests (httptest server per adapter):
  cloudflare 11, dynadot 11, godaddy 11, namecheap 13, ovh 12,
  handler 14, registrar interface 2
- Each adapter covers: happy path, bad-token, rate-limited (429),
  bad-domain (404 / not-in-account), empty-NS guard, name+default
- OVH signature math verified deterministically via injected nowFn

Acceptance (issue #170)
- All 5 adapters pass their unit tests
- PDM /api/v1/registrar/{r}/set-ns endpoint live
- Wired into cmd/pdm/main.go: every adapter registered at startup

Per docs/INVIOLABLE-PRINCIPLES.md #4 (never hardcode), each adapter's
BaseURL is constructor-default + struct-overridable, so tests inject
httptest endpoints without environment shenanigans.
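
A rough sketch of the constructor-default plus struct-override pattern, using the Cloudflare token probe as the example. The Adapter type, its fields, and verifyAgainstFake are hypothetical; only the /user/tokens/verify path and Bearer auth are taken from the adapter description above.

```go
package main

import (
	"context"
	"fmt"
	"net/http"
	"net/http/httptest"
)

const defaultBaseURL = "https://api.cloudflare.com/client/v4"

// Adapter is a hypothetical, trimmed-down registrar adapter.
type Adapter struct {
	BaseURL string // overridable; New sets the production default
	HTTP    *http.Client
}

// New returns an adapter pointed at the production API.
func New() *Adapter {
	return &Adapter{BaseURL: defaultBaseURL, HTTP: http.DefaultClient}
}

// Verify probes the token-verification endpoint and returns the HTTP status.
// The token exists only in this call's arguments and the request header.
func (a *Adapter) Verify(ctx context.Context, token string) (int, error) {
	req, err := http.NewRequestWithContext(ctx, http.MethodGet,
		a.BaseURL+"/user/tokens/verify", nil)
	if err != nil {
		return 0, err
	}
	req.Header.Set("Authorization", "Bearer "+token)
	resp, err := a.HTTP.Do(req)
	if err != nil {
		return 0, err
	}
	defer resp.Body.Close()
	return resp.StatusCode, nil
}

// verifyAgainstFake shows the test-side override: BaseURL points at httptest.
func verifyAgainstFake(token string) (int, error) {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.URL.Path != "/user/tokens/verify" || r.Header.Get("Authorization") != "Bearer good" {
			w.WriteHeader(http.StatusUnauthorized)
			return
		}
		w.WriteHeader(http.StatusOK)
	}))
	defer srv.Close()

	a := New()
	a.BaseURL = srv.URL // struct override, no environment variables involved
	return a.Verify(context.Background(), token)
}

func main() {
	code, err := verifyAgainstFake("good")
	fmt.Println(code, err) // prints "200 <nil>"
}
```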

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 07:46:30 +02:00
github-actions[bot]
5a13be8559 deploy: update catalyst images to 2854d65 2026-04-29 04:54:25 +00:00
Emrah Baysal
2854d652eb merge: pool-domain-manager (closes #163 phases 1-4)
Brings the pool-domain-manager service, catalyst-api integration, CI
workflow, and Crossplane Composition onto main. Phase 5 (deploy) lands
as a separate openova-private commit; Phase 6 (verification curl)
follows once the image is published and the Flux reconciliation cycle
finishes.
2026-04-29 06:46:31 +02:00
hatiyildiz
31b03ce02a ci(pdm)+platform(crossplane): build workflow + XDynadotPoolAllocation composition (Phase 3+4 of #163)
CI workflow (.github/workflows/pool-domain-manager-build.yaml) mirrors
the marketplace-api / catalyst-api shape:

  - Triggers on push to core/pool-domain-manager/** + workflow_dispatch
  - Runs unit tests (reserved + dynadot — the integration suite needs a
    real Postgres which the workflow does not provide; full integration
    runs in test-bootstrap-api.yaml against an ephemeral CNPG)
  - Builds and pushes ghcr.io/openova-io/openova/pool-domain-manager:<sha>
  - Cosign-signs the image via Sigstore keyless OIDC (id-token: write)
  - Emits an SBOM attestation tied to the image digest
  - Manifest deployment is intentionally NOT in this workflow — PDM
    manifests live in the openova-private repo per the issue body, so
    the Flux Kustomization there picks up the new SHA via a follow-up
    private-repo commit (Phase 6 of #163)

Crossplane composition (platform/crossplane/compositions/xrd-pool-
allocation.yaml + composition-pool-allocation.yaml) wraps PDM as a
declarative Crossplane composite resource (XR):

  apiVersion: compose.openova.io/v1alpha1
  kind: XDynadotPoolAllocation
  spec:
    parameters:
      poolDomain:    omani.works
      subdomain:     omantel
      sovereignFQDN: omantel.omani.works
      loadBalancerIP: 1.2.3.4
      createdBy:     crossplane

The Composition uses provider-http (crossplane-contrib/provider-http) to
render the XR into a Reserve → Commit sequence of HTTP calls against
PDM's in-cluster service URL. Per docs/INVIOLABLE-PRINCIPLES.md #3 we use
provider-http rather than bespoke Go to keep the day-2 lifecycle
declarative. Operators who want to pre-allocate a name (e.g. reserve
'omantel.omani.works' for a Sovereign that hasn't been provisioned yet)
commit YAML to Git and Flux+Crossplane converge.

Refs: #163
2026-04-29 06:46:11 +02:00
hatiyildiz
01183cb44c feat(catalyst-api): wire pool-domain-manager into the wizard lifecycle (Phase 2 of #163)
The wizard's StepDomain debounced check, the deployment-creation reserve,
the post-tofu-apply commit, and the on-failure release now all flow
through the pool-domain-manager service that landed in the previous
commit. The DNS-wildcard regression at omani.works (where every
subdomain resolved to 185.53.179.128 because of the apex parking record
and broke the LookupHost-based check) is now FIXED STRUCTURALLY:

  - Managed pools: route through PDM, which has zero DNS dependency.
  - BYO domains:   keep the legacy LookupHost path because the customer
                   owns the zone — that nameserver IS the source of truth.

Files changed:

  internal/pdm/client.go (new)
    Tiny HTTP client for PDM (Check, Reserve, Commit, Release) plus a
    package-level IsManagedDomain runtime resolver that mirrors the legacy
    catalyst-api dynadot.IsManagedDomain semantics WITHOUT importing the
    dynadot package. The DYNADOT_MANAGED_DOMAINS env var is the contract;
    PDM is the writer of any actual Dynadot side-effect.

  internal/handler/handler.go
    New(...) reads POOL_DOMAIN_MANAGER_URL from env (default = in-cluster
    service FQDN). NewWithPDM(client) is exposed for tests so a fake can
    be injected without spinning up a real HTTP server. Per docs/INVIOLABLE-
    PRINCIPLES.md #4 the URL is configuration, not code.

  internal/handler/subdomains.go (rewritten)
    Removed: net.LookupHost on '<sub>.<pool>' for managed pools. Removed:
    duplicate reservedSubdomains map (lives ONLY in PDM now). Added:
    h.checkManagedPool() that delegates to PDM and surfaces PDM's
    Available/Reason/Detail verbatim. Added: h.checkBYO() that keeps the
    legacy DNS path for non-managed domains. Defence in depth: when PDM
    URL is misconfigured the handler returns reason='pdm-unavailable'
    rather than silently falling back to DNS (which would resurrect the
    wildcard bug).

  internal/handler/deployments.go
    CreateDeployment now reserves the pool subdomain via PDM BEFORE
    launching the runProvisioning goroutine, captures the
    reservation_token onto the Deployment struct, and returns 409 on
    PDM ErrConflict so the wizard's StepReview can surface the race
    cleanly. runProvisioning issues PDM /commit on success (with the
    LB IP) or /release on failure. PDM owns the eventual Dynadot write —
    catalyst-api never calls api.dynadot.com directly for the wizard's
    lifecycle after this lands.

  internal/handler/{subdomains,deployments}_test.go (new)
    Subdomains: prove (a) managed pool delegates to PDM and surfaces
    PDM's response verbatim, (b) DNS-wildcard parking records cannot
    cause Available=false for any random subdomain (regression guard
    for #163), (c) PDM returns active-state → handler returns
    Available=false with the right reason, (d) BYO falls back to DNS
    correctly, (e) invalid label short-circuits before PDM is called,
    (f) PDM unavailable surfaces 'pdm-unavailable' rather than
    silently succeeding.
    Deployments: prove (a) managed pool reserves via PDM exactly once,
    (b) PDM 409 conflict on reserve blocks the deployment with HTTP
    409, (c) BYO mode does NOT consult PDM.
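
The managed-versus-BYO split in subdomains.go can be pictured roughly as follows. This is a sketch under assumptions: the checker type, the function-valued fields standing in for the PDM client and net.LookupHost, and the "dns-exists" reason are all hypothetical; only 'pdm-unavailable' and the no-fallback rule come from the description above.

```go
package main

import (
	"errors"
	"fmt"
)

// Result mirrors the Available/Reason fields mentioned above; the exact
// struct shape is an assumption.
type Result struct {
	Available bool
	Reason    string
}

// checker takes both lookups as functions so the sketch runs offline.
// pdmCheck and dnsResolves are hypothetical stand-ins for the PDM client
// and a net.LookupHost wrapper.
type checker struct {
	managed     map[string]bool
	pdmCheck    func(pool, sub string) (Result, error)
	dnsResolves func(fqdn string) bool
}

func (c checker) check(pool, sub string) Result {
	if c.managed[pool] {
		res, err := c.pdmCheck(pool, sub)
		if err != nil {
			// Deliberately no DNS fallback: a wildcard record would make
			// every name look taken.
			return Result{Available: false, Reason: "pdm-unavailable"}
		}
		return res // PDM's answer is surfaced verbatim
	}
	// BYO: the customer's zone is the source of truth.
	if c.dnsResolves(sub + "." + pool) {
		return Result{Available: false, Reason: "dns-exists"}
	}
	return Result{Available: true}
}

// newDemoChecker builds a checker whose DNS resolves every name, which
// simulates the wildcard parking record on a pool apex.
func newDemoChecker(pdmDown bool) checker {
	return checker{
		managed: map[string]bool{"omani.works": true},
		pdmCheck: func(pool, sub string) (Result, error) {
			if pdmDown {
				return Result{}, errors.New("connection refused")
			}
			return Result{Available: true}, nil
		},
		dnsResolves: func(string) bool { return true },
	}
}

func main() {
	fmt.Printf("%+v\n", newDemoChecker(false).check("omani.works", "omantel"))
	fmt.Printf("%+v\n", newDemoChecker(true).check("omani.works", "omantel"))
	fmt.Printf("%+v\n", newDemoChecker(false).check("customer.example", "www"))
}
```

With the wildcard simulated, the managed pool still reports the name as available because DNS is never consulted on that path.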

Architectural compliance:

  - Principle #4 (never hardcode): every URL/domain/region is runtime
    configuration. POOL_DOMAIN_MANAGER_URL has a sane default so the
    common case 'just works' but is overridable for air-gap installs.
  - Principle #2 (no quality compromise): the PDM lifecycle is the
    target-state shape. Reserve before tofu apply guarantees a name
    can't be double-allocated by a parallel wizard tab. Commit AFTER
    tofu apply guarantees we don't write DNS for a Sovereign that
    doesn't exist yet.
  - Lesson #24 (don't bypass off-the-shelf primitives): the catalyst-api
    no longer carries its own copy of the reserved-name list, no longer
    calls Dynadot directly for the lifecycle, and no longer does DNS-
    based availability checks for managed pools. PDM IS the off-the-
    shelf primitive for this concern; we use it.

Refs: #163
2026-04-29 06:44:22 +02:00
hatiyildiz
585b046f5d feat(pdm): pool-domain-manager service skeleton (Phase 1 of #163)
Build a new Go service core/pool-domain-manager that becomes the SOLE
authority for OpenOva-pool subdomain allocation across the fleet.

Why this exists: today products/catalyst/bootstrap/api/internal/handler/
subdomains.go does naive net.LookupHost() to decide whether a candidate
subdomain is taken. Dynadot's wildcard parking record at the apex of
omani.works (and any future pool domain) makes EVERY subdomain resolve
to 185.53.179.128, so the check rejects everything. DNS is the wrong
source of truth for an OpenOva-managed pool — the central control plane
must own the allocation table.

What this commit adds (no integration with catalyst-api yet — that lands
in a follow-up commit):

  core/pool-domain-manager/
    cmd/pdm/main.go                     chi router, healthz, sweeper boot
    api/openapi.yaml                     wire contract for every endpoint
    Containerfile                        alpine final stage, UID 65534
    internal/store/                      pgx + CNPG; pool_allocations table
      migrations.sql                       idempotent CREATE TABLE schema
      store.go                             Reserve/Get/Commit/Release/List
      store_test.go                        integration tests (PDM_TEST_DSN)
    internal/dynadot/                    moved + extended; SOLE Dynadot caller
      dynadot.go                           AddRecord, AddSovereignRecords,
                                           DeleteSubdomainRecords (read-modify-
                                           write to honour feedback_dynadot_dns)
      dynadot_test.go                      managed-domain resolution tests
    internal/reserved/                   centralised reserved-name list
      reserved.go                          IsReserved/All; pulled out of
                                           catalyst-api's subdomains.go
    internal/handler/                    HTTP surface
      handler.go                           /api/v1/pool/{domain}/{check,reserve,
                                           commit,release,list}, /healthz,
                                           /api/v1/reserved
    internal/allocator/                  state machine + sweeper goroutine

Architecture choices and how they map to docs/INVIOLABLE-PRINCIPLES.md:

  - Principle #4 (never hardcode): every value (PORT, PDM_DATABASE_URL,
    DYNADOT_MANAGED_DOMAINS, PDM_RESERVATION_TTL, PDM_SWEEPER_INTERVAL)
    flows from env vars; the K8s ExternalSecret will populate them at
    deploy time. The reserved-subdomain list lives in ONE place
    (internal/reserved); catalyst-api will not duplicate it.

  - Principle #2 (no quality compromise): the state machine commits the
    DB row before the Dynadot side-effect, so a crash between the two
    leaves the system in a recoverable state (operator runs Release).
    The reservation_token in the row protects against stale-tab commit
    races. UPSERT semantics + a UNIQUE constraint mean two operators
    racing /reserve get a clean 23505 (unique_violation) → HTTP 409.

  - Principle #3 (follow architecture): PDM is a ClusterIP service in
    openova-system — it is not a Crossplane provider, not a Flux
    HelmRelease, not bespoke OpenTofu state. catalyst-api speaks to it
    via plain HTTP. The Crossplane Composition that wraps PDM as a
    declarative XR (XDynadotPoolAllocation) lands in a follow-up phase.
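
An in-memory sketch of the reservation state machine, to illustrate the token and TTL rules. It is not the Postgres-backed store: the table type, error names, and sequential tokens are assumptions, and the real service relies on a database constraint rather than a map lookup to detect races.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

var (
	ErrConflict      = errors.New("name already allocated")
	ErrTokenMismatch = errors.New("reservation token mismatch")
	ErrNotReserved   = errors.New("no live reservation")
)

type row struct {
	active    bool      // false = reserved, true = committed
	token     string
	expiresAt time.Time // only checked while reserved
}

// table is an in-memory stand-in for pool_allocations, keyed by FQDN.
type table struct {
	rows  map[string]row
	ttl   time.Duration
	nowFn func() time.Time
	seq   int
}

// live reports whether a row still blocks the name.
func (t *table) live(r row) bool { return r.active || t.nowFn().Before(r.expiresAt) }

func (t *table) Reserve(fqdn string) (string, error) {
	if r, ok := t.rows[fqdn]; ok && t.live(r) {
		return "", ErrConflict
	}
	t.seq++
	tok := fmt.Sprintf("tok-%d", t.seq) // a real service would use a random token
	t.rows[fqdn] = row{token: tok, expiresAt: t.nowFn().Add(t.ttl)}
	return tok, nil
}

func (t *table) Commit(fqdn, token string) error {
	r, ok := t.rows[fqdn]
	if !ok || r.active || !t.live(r) {
		return ErrNotReserved
	}
	if r.token != token {
		return ErrTokenMismatch // e.g. a stale browser tab
	}
	r.active = true
	t.rows[fqdn] = r
	return nil
}

// Sweep deletes expired reservations, as a sweeper would on a timer.
func (t *table) Sweep() int {
	n := 0
	for k, r := range t.rows {
		if !t.live(r) {
			delete(t.rows, k)
			n++
		}
	}
	return n
}

// newDemoTable returns a table with a 60s TTL and a manually advanced clock.
func newDemoTable() (*table, func(seconds int)) {
	now := time.Unix(1700000000, 0)
	t := &table{rows: map[string]row{}, ttl: time.Minute, nowFn: func() time.Time { return now }}
	return t, func(s int) { now = now.Add(time.Duration(s) * time.Second) }
}

func main() {
	t, advance := newDemoTable()
	old, _ := t.Reserve("omantel.omani.works")
	advance(61) // the first reservation expires
	fresh, _ := t.Reserve("omantel.omani.works")
	fmt.Println(t.Commit("omantel.omani.works", old) == ErrTokenMismatch) // prints "true"
	fmt.Println(t.Commit("omantel.omani.works", fresh) == nil)            // prints "true"
}
```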

The DNS-wildcard problem the issue describes is fixed STRUCTURALLY here:
PDM never calls net.LookupHost. The /check path is a single SELECT
against pool_allocations. omani.works's wildcard A record at the apex
becomes architecturally irrelevant.

Tests exercised in this commit:
  - internal/reserved: full unit coverage (case-insensitive, sorted, set
    membership)
  - internal/dynadot: managed-domain runtime resolution (env-var,
    legacy single-domain fallback, built-in defaults, list parsing)
  - internal/store: integration suite gated on PDM_TEST_DSN env var,
    covers reserve happy-path, reserve race (ErrConflict), TTL expiry
    frees, commit happy-path, commit token mismatch, release removes
    row, sweeper deletes expired rows

Closes phase 1 of #163. Phase 2 (catalyst-api wiring), Phase 3 (CI +
manifests), Phase 4 (Crossplane composition), Phase 6 (deploy +
verification curl) follow in separate commits.

Refs: #163
2026-04-29 06:37:38 +02:00
github-actions[bot]
296fd68819 deploy: update catalyst images to 16d837b 2026-04-29 04:36:47 +00:00
hatiyildiz
16d837bb81 merge: #162 — wizard StepComponents UX polish (cards + logos + tabs + brand mark)
Closes operator-facing UX gaps from issue #162:

- Phase 1: card surface pixel-matches SME marketplace AppsStep.svelte
  (height 108px, hover-reveal toggle, mask-gradient chips, etc.)
- Phase 2: 63 vendored component-logo SVGs under /component-logos/
  with logoUrl field on ComponentDef defaulting to data-driven path.
  docs/COMPONENT-LOGOS.md tracks each upstream source + licence.
- Phase 3: two-tab segregation. 'Choose Your Stack' (recommended +
  optional, search + filter + cascade-aware toggle) and
  'Always Included' (mandatory only, grouped by product, read-only,
  INFRASTRUCTURE pill). Counts derived from componentGroups, never
  hardcoded.
- Phase 4: OpenOva brand mark in wizard top bar at 32px, with
  /openova-logo.svg vendored under public/ for non-React surfaces.

Build: typecheck clean, 81 vitest tests passing (up from 65), prod
build successful with 63 logos + brand mark in dist/.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 06:34:23 +02:00
hatiyildiz
8ee52e6500 feat(wizard): #162 OpenOva brand mark on wizard top bar
Phase 4 of issue #162.

The wizard's top bar already rendered the OpenOva infinity-loop mark
via <OOLogo h={22} />. This bumps it to the spec'd 32px height and
adds a `data-testid="wizard-logo"` hook for end-to-end tests.

Also vendors the canonical brand SVG to
`products/catalyst/bootstrap/ui/public/openova-logo.svg` (sourced from
the marketing repo's logo-icon.svg). Static pages bundled with the
wizard (e.g. provision.html) and any future non-React surface can now
reference `/openova-logo.svg` directly without duplicating the path
data — single source of truth for the brand mark.

The link target is unchanged: `/app/dashboard` for SaaS, `/` for
self-hosted (which redirects to the wizard root). Both effectively
land back at the wizard home, matching the issue's "link to wizard
home" requirement.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 06:32:56 +02:00
hatiyildiz
c3b36cb170 feat(wizard): #162 SME-pixel-perfect cards + two-tab segregation
Phases 1 + 3 of issue #162.

Phase 1 — pixel-match the SME marketplace card surface:
  - Card height fixed at 108px, overflow hidden, 12px radius
  - Logo column align-self: stretch + aspect-ratio 1/1 → fills card height
  - Body padding-right: 4.5rem reserves room for top-right toggle
    button + bottom-right SELECTED pill
  - Toggle button hidden by default, opacity 0 → 1 + scale(0.8) → 1
    on hover (matches AppsStep.svelte .app-add-btn transition)
  - Hover: translateY(-2px) + accent border + 0 4px 16px shadow
  - Selected: green border + green-tinted bg
  - Chips use mask-image gradient on right edge
  - Toolbar split into rows: search row + chip row
  - Toast slide-in keyframe matches SME exactly
  - All visual rules consolidated into one <style> block via .corp-comp-*
    classes so the card surface has a single source of truth, replacing
    the previous inline-style sprinkle.

Phase 3 — two-tab segregation (operator preference: Option A):
  - Top of step renders two tabs:
      "Choose Your Stack (N)"  — N = recommended + optional currently
                                 selected (computed from store)
      "Always Included (M)"    — M = catalog mandatory count (computed
                                 from componentGroups, never hardcoded)
  - Tab 1 body: only non-mandatory components, search + category chip
    filter, sort-selected-first, all current cascade-add/remove logic
    intact. Mandatory cards do NOT appear in this tab — they were
    confusing as "locked toggles" alongside selectable components.
  - Tab 2 body: only mandatory components, grouped by product
    (PILOT, GUARDIAN, …). No search, no category chips, no toggle UI.
    "INFRASTRUCTURE" pill replaces the MANDATORY pill so users read it
    as platform infra rather than a wizard option. Subdued text colors
    (var(--wiz-text-md)) for body so the section reads informational.
  - Tab 2 header carries the foundational-platform blurb verbatim:
    "These platform components run on every Sovereign. They're
    foundational — you don't pay extra for them."
  - Tab switch is local state — selection store untouched, so
    Continue's dependency-consistency validation continues to read the
    full selectedComponents set unchanged.

Cards in Tab 1 now load `<img src={entry.logoUrl}>` against the
vendored SVGs from Phase 2 (commit 979ff59). The IconFallback letter
mark stays as a defensive fallback when logoUrl is null.

Vitest coverage rewritten to match the two-tab contract:
  - Tabs render + switch state + counters
  - Tab 1 only shows non-mandatory cards
  - Tab 2 only shows mandatory cards, grouped by product
  - Tab 2 has no search, no chips, no toggle
  - Logo URLs default to /component-logos/<id>.svg
Total: 81 tests passing (up from 65).

Per docs/INVIOLABLE-PRINCIPLES.md #4 — every count, tier, label is
derived from componentGroups.ts. The "Always Included (28)" example in
the issue body is replaced by a runtime-computed count so the badge
stays correct as the catalog evolves.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 06:32:45 +02:00
hatiyildiz
979ff59332 feat(wizard): vendor 63 component logos + add ComponentDef.logoUrl
Phase 2 of issue #162. Each component now resolves to a real SVG mark
under products/catalyst/bootstrap/ui/public/component-logos/<id>.svg
instead of the letter-pill fallback. ComponentDef gains an optional
logoUrl field that defaults to /component-logos/<id>.svg per id, so
swapping a file under public/ rebrands the card without touching
application source (per INVIOLABLE-PRINCIPLES.md #4 "never hardcode").

The 63 SVGs are stylised brand-color marks, not copies of the upstream
projects' trademarked logotypes — this avoids licence ambiguity in a
public repo while still giving the wizard a visually distinctive
component grid. docs/COMPONENT-LOGOS.md tracks the canonical upstream
source for each component (CNCF artwork repo, project brand pages,
etc.) so the asset library can be audited and individual files swapped
for official art when permission/license is verified.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 06:29:23 +02:00
github-actions[bot]
4518ad67d8 deploy: update catalyst images to 3a0ffb0 2026-04-28 21:15:11 +00:00
hatiyildiz
3a0ffb0006 merge: #161 — corporate platform component grid in wizard StepComponents
Brings the SME marketplace UX pattern (search + category chips +
flat sort-selected-first card grid) to the corporate Sovereign-bootstrap
wizard's StepComponents page, backed by the 60+ platform component
catalog in componentGroups.ts with dependency-aware cascading
selection.

  • componentGroups.ts — adds dependencies graph (Harbor → cnpg+seaweedfs
    +valkey, Keycloak/Gitea → cnpg, Velero/Loki/Mimir/Tempo → seaweedfs,
    External Secrets → openbao, cert-manager/k8gb → external-dns, …)
    plus catalog helpers (findComponent, resolveTransitiveDependencies,
    resolveTransitiveDependents, computeDefaultSelection).

  • StepComponents.tsx — flat grid mirroring AppsStep.svelte. Tier
    badges (MANDATORY/RECOMMENDED/OPTIONAL), "Includes:" hints, toast
    notifications, cascade-aware confirm modal on destructive removes.

  • Wizard store — selectedComponents repurposed as sorted, deduped
    string[], with addComponent/removeComponent walking the dependency
    graph; mandatory ids cannot be removed; persist.merge() seeds
    defaults on first run and force-keeps mandatory ids.

  • 38 new vitest tests covering catalog sanity, search, category
    filter, sort, mandatory rejection, cascading add/remove, store
    invariants, reverse-graph helpers (54 total in StepComponents.test).

  • catalog.generated.ts untouched — StepProvisioning timeline still
    reads the same 10 platform-infra blueprint entries for SSE phase
    labelling.
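
The cascading add can be illustrated with a small graph walk. The wizard itself is TypeScript; this sketch is written in Go purely for illustration. The edge table is a small excerpt of the dependencies listed above with assumed lowercase ids, and resolveTransitive / addComponent are hypothetical stand-ins for the catalog helper and store action of similar names.

```go
package main

import (
	"fmt"
	"sort"
)

// deps is an excerpt of the edges listed above; the real catalog lives in
// componentGroups.ts. Ids are assumed to be lowercase component names.
var deps = map[string][]string{
	"harbor":   {"cnpg", "seaweedfs", "valkey"},
	"keycloak": {"cnpg"},
	"velero":   {"seaweedfs"},
	"milvus":   {"seaweedfs"},
}

// resolveTransitive returns id plus everything reachable from it, sorted.
func resolveTransitive(id string) []string {
	seen := map[string]bool{}
	var walk func(string)
	walk = func(n string) {
		if seen[n] {
			return // already visited; also guards against cycles
		}
		seen[n] = true
		for _, d := range deps[n] {
			walk(d)
		}
	}
	walk(id)
	out := make([]string, 0, len(seen))
	for n := range seen {
		out = append(out, n)
	}
	sort.Strings(out)
	return out
}

// addComponent merges the closure of id into the selection and returns a
// sorted, de-duplicated slice, so repeated adds are idempotent.
func addComponent(selected []string, id string) []string {
	set := map[string]bool{}
	for _, s := range selected {
		set[s] = true
	}
	for _, s := range resolveTransitive(id) {
		set[s] = true
	}
	out := make([]string, 0, len(set))
	for s := range set {
		out = append(out, s)
	}
	sort.Strings(out)
	return out
}

func main() {
	sel := addComponent([]string{"cnpg"}, "harbor")
	fmt.Println(sel) // prints "[cnpg harbor seaweedfs valkey]"
}
```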

Refs: GitHub issue #161
2026-04-28 23:13:00 +02:00
hatiyildiz
e8b095db34 test(wizard-step5): #161 phase 4 — vitest coverage for component grid
38 new tests across 9 describe blocks covering every behaviour the
corporate component grid promises:

  catalog sanity (6)
    - 60+ components, every tier valid, every dep edge points at a real
      catalog id, Harbor → cnpg+seaweedfs+valkey, OpenSearch has no deps,
      and Reloader / KEDA / VPA / Cilium / Crossplane / Flux are dep-free.

  card grid (4)
    - one card rendered per catalog entry, "Selected (N)" counter live,
      "Includes:" hint visible for components with deps, MANDATORY tier
      badge shown for mandatory cards.

  search filter (3)
    - narrows by name, by group name, shows empty state on no match.

  category filter (3)
    - 9 group chips + "All", chip click narrows the grid, second click
      clears the filter.

  sort: selected first (1)
    - newly-selected components float to the top of their group.

  mandatory cards (2)
    - clicking them never deselects, emits a "mandatory" toast.

  cascading add (4)
    - addComponent('milvus') pulls in seaweedfs, addComponent('harbor')
      pulls in cnpg+seaweedfs+valkey, the UI emits a single toast naming
      every cascaded dep, action is idempotent.

  cascading remove (4)
    - confirm dialog opens with the impact set listed, cancel keeps the
      component selected, confirm cascades through the entire impact
      set, mandatory ids stay even when their dep is removed.

  store invariants (5)
    - selectedComponents always sorted, de-duplicated by
      setSelectedComponents, legacy SelectedComponent[] is normalised to
      ids by setComponents, resetSelectedComponentsToDefault restores
      mandatory + recommended + their deps, every mandatory id is in the
      default selection along with its transitive deps.

  reset to defaults (1) + reverse-graph helpers (3)

Test results: 54 passed, 0 failed (37 new + 17 from StepSuccess.test.tsx).
typecheck clean, build succeeds (10 platform-infra blueprints in
catalog.generated.ts as expected).

Refs: GitHub issue #161

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-28 23:11:54 +02:00