Commit Graph

366 Commits

Author SHA1 Message Date
hatiyildiz
036dc39800 merge: bp-powerdns 1.0.3 (dnsdist backend env-injection + table ownership, openova#167)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 08:13:54 +02:00
hatiyildiz
db20e9d42b fix(powerdns): dnsdist backend resolution + drop DnstapLogAction (1.0.3)
dnsdist 1.9.14 runtime errors:
  1. newServer{address='powerdns:5353'} → "Unable to convert presentation
     address" — dnsdist's address parser expects IP[:port], not a DNS
     name. Kubernetes auto-injects POWERDNS_SERVICE_HOST as an env var
     into every pod in the same namespace as the powerdns Service; using
     that gives us the ClusterIP at config-load time without needing an
     init container or runtime DNS resolution.
  2. DnstapLogAction(name, bool, fn) signature changed in 1.9 — the
     2nd parameter now expects a shared_ptr to a RemoteLoggerInterface,
     not a boolean. Rather than wire up a remote dnstap server (which
     adds a moving part for marginal observability gain), drop the line.
     Catalyst observability is the dnsdist /metrics endpoint surfaced
     to Prometheus + the k8s container log.

Bumped chart to 1.0.3.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 08:12:27 +02:00
hatiyildiz
790fc7efb0 merge: bp-powerdns 1.0.2 (dnsdist tag + RO rootfs fix, openova#167)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 08:06:47 +02:00
hatiyildiz
20c0543806 fix(powerdns): correct dnsdist image tag + drop readOnlyRootFilesystem (1.0.2)
Two runtime issues caught during first contabo-mkt rollout:

1. dnsdist image tag was "1.9" (default) — that tag doesn't exist in
   docker.io/powerdns/dnsdist-19. The 1.9.x line publishes 1.9.0 .. 1.9.14
   (no rolling "1.9" alias). Pinned to 1.9.14 (current latest).

2. PowerDNS pod crash-looped on Errno 30 (Read-only file system:
   /etc/powerdns/pdns.d/0-api.conf.conf). The upstream pdns_server-startup
   script writes rendered config files to /etc/powerdns/pdns.d/ at
   container start, and the upstream template doesn't expose an emptyDir
   we could redirect that path to. Set readOnlyRootFilesystem=false with
   a verbose comment explaining why; the rest of the security context
   (runAsNonRoot, runAsUser=953, drop ALL caps) stays in place.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 08:06:39 +02:00
hatiyildiz
134b3fbedf merge: bp-powerdns 1.0.1 (dnsdist checksum fix, openova#167)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 08:03:00 +02:00
hatiyildiz
19d926bfeb fix(powerdns): avoid recursive include in dnsdist checksum, bump to 1.0.1
Helm flagged dnsdist.yaml's checksum/config annotation as a recursive
template self-reference (the file included itself). Replaced with a
hash of the rendered .Values.dnsdist.config (post-tpl), which is the
substantive content the annotation is supposed to track anyway.

Bumped Chart.yaml to 1.0.1 so the OCIRepository semver "1.x" picks
up the fix automatically on next reconcile. Blueprint API version stays
at 1.0.0 (Blueprint contract is unchanged).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 08:02:53 +02:00
hatiyildiz
e3a006bc6f merge: bp-powerdns wrapper + per-Sovereign zone model (closes #167 phases 1-3)
Closes #167 (public-repo phases). Cluster manifest deploy in
openova-private feat/powerdns-deploy.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 07:50:16 +02:00
hatiyildiz
0190c60520 feat(powerdns): bp-powerdns wrapper chart + per-Sovereign zone model (#167)
Introduces the bp-powerdns Catalyst Blueprint wrapper as the authoritative
DNS service for every Sovereign zone. Replaces k8gb in componentGroups.ts —
PowerDNS Lua records cover geo + health-checked failover natively, removing
the dedicated GSLB controller.

Wrapper chart (platform/powerdns/chart/):
  - Chart.yaml — bp-powerdns 1.0.0, depends on pschichtel/powerdns 0.10.0
    upstream (verified Artifact Hub publisher, tracks docker.io/powerdns/
    pdns-auth-50 at appVersion 5.0.3 — surveyed Artifact Hub, no official
    PowerDNS chart exists)
  - values.yaml — 3 replicas, gpgsql backend, DNSSEC ECDSAP256SHA256,
    lua-records ON, dnsdist 100 qps default per source IP, REST API at
    pdns.openova.io/api behind Traefik basicAuth
  - blueprint.yaml — Catalyst metadata, visibility=unlisted (mandatory
    infra), section pts-3-2-gitops-and-iac
  - templates/cnpg-cluster.yaml — separate `pdns-pg` Postgres (1 instance,
    5Gi, postgres-16) with PowerDNS auth-5.0.3 schema applied via
    postInitApplicationSQL
  - templates/dnsdist.yaml — companion Deployment + ConfigMap with
    rate-limiting policy (MaxQPSIPRule per source IP)
  - templates/api-ingress.yaml — Traefik Ingress + basicAuth Middleware
  - templates/anycast-endpoint.yaml — placeholder Service of type
    LoadBalancer (Phase-0 stand-in for the anycast Floating IP target state)
  - templates/crossplane-floatingip.yaml — DISCLOSED GAP: target-state
    XHetznerFloatingIP composite, disabled by default until the
    Crossplane composition is authored (the existing compositions cover
    Server/Network/Firewall/LoadBalancer/PoolAllocation only). The
    placeholder anycast Service is the operational stand-in.

Per docs/INVIOLABLE-PRINCIPLES.md:
  - #4 (never hardcode): every value flows from values.yaml or a
    referenced K8s Secret. Image tags come from upstream chart appVersion,
    never duplicated.
  - #8 (disclose every divergence): the XHetznerFloatingIP gap is
    documented in the template + in docs/PLATFORM-POWERDNS.md ("Anycast
    deferral" section).

componentGroups.ts: powerdns added to SPINE group as mandatory (depends on
cnpg). external-dns now lists powerdns as a dependency. k8gb removed.

docs/PLATFORM-POWERDNS.md: per-Sovereign zone model, DNSSEC posture, REST
API contract, lua-records GSLB pattern, dnsdist policy, anycast deferral
runbook, first-deploy procedure for Contabo-mkt.

Closes #167 (Phase 1 of public-repo work; Phase 4 cluster manifest lands
in openova-private feat/powerdns-deploy).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 07:49:51 +02:00
github-actions[bot]
eee3e1ec9e deploy: update catalyst images to 8a636f9
2026-04-29 05:48:55 +00:00
hatiyildiz
8a636f9f26 merge: registrar adapters for BYO Flow B (closes #170)
Cloudflare, Namecheap, GoDaddy, OVH, Dynadot adapters under a shared
Registrar interface; new POST /api/v1/registrar/{registrar}/set-ns
endpoint on PDM. 74 new unit tests; token never logged or persisted.
2026-04-29 07:46:52 +02:00
hatiyildiz
567d7e1f60 feat(pdm): registrar adapters for Cloudflare, Namecheap, GoDaddy, OVH, Dynadot (#170)
Adds the BYO Flow B (#166) registrar-flip seam: PDM now exposes a
provider-agnostic Registrar interface and 5 adapter implementations
plus a new HTTP endpoint that dispatches to them.

Wire surface
- POST /api/v1/registrar/{registrar}/set-ns
  Body: {"domain":"...","token":"...","nameservers":["..."]}
  Reply: {"success":true,"registrar":"...","domain":"...",
          "nameservers":["..."],"propagation":"..."}
- GET /healthz now lists the wired-in registrar names.
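
A minimal caller-side sketch of the wire surface above, in TypeScript. The field names come from the Body and Reply shapes quoted in this commit. The helper names (buildSetNsCall, replyOmitsToken) and the empty-nameserver check are assumptions made for illustration and are not part of PDM.

```typescript
// Request and reply shapes copied from the commit's "Wire surface" section.
interface SetNsRequest {
  domain: string;
  token: string;
  nameservers: string[];
}

interface SetNsReply {
  success: boolean;
  registrar: string;
  domain: string;
  nameservers: string[];
  propagation: string;
}

// Hypothetical helper: builds the method, path and JSON body for a set-ns
// call. It performs no I/O, so it can be checked without a server.
function buildSetNsCall(
  registrar: string,
  req: SetNsRequest
): { method: string; path: string; body: string } {
  // Assumption: reject an empty nameserver list before anything is sent.
  if (req.nameservers.length === 0) {
    throw new Error("nameservers must not be empty");
  }
  return {
    method: "POST",
    path: "/api/v1/registrar/" + encodeURIComponent(registrar) + "/set-ns",
    body: JSON.stringify(req),
  };
}

// Hypothetical check in the spirit of the token-echo tests: the serialised
// reply must not contain the token anywhere.
function replyOmitsToken(reply: SetNsReply, token: string): boolean {
  return JSON.stringify(reply).indexOf(token) === -1;
}
```

Keeping the builder free of I/O means the path and body can be asserted directly, which is roughly what an httptest-style adapter test would check from the other side.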

Interface (internal/registrar/registrar.go)
- Name(), ValidateToken, SetNameservers, GetNameservers
- Typed errors: ErrInvalidToken, ErrRateLimited, ErrDomainNotInAccount,
  ErrAPIUnavailable, ErrUnsupportedRegistrar
- Registry map[string]Registrar with Lookup + Names helpers

Adapters
- internal/registrar/cloudflare/  — API v4 with Bearer token; verifies
  via /user/tokens/verify, looks up zone by name, PATCHes name_servers
- internal/registrar/namecheap/   — XML API; ApiUser+ApiKey+UserName+
  ClientIp auth; getBalances probe + getList domain check; setCustom
  for write. IP-whitelisting requirement documented in source comments
- internal/registrar/godaddy/     — v1 API with sso-key auth;
  GET /v1/domains list + PATCH /v1/domains/{d} with nameServers body
- internal/registrar/ovh/         — request signing (SHA-1 digest over
  appSecret+consumerKey+method+url+body+timestamp); GET /domain probe;
  POST /domain/{d}/nameServers/update for write; GET .../nameServer[/{id}]
  for read
- internal/registrar/dynadot/     — api3.json with key+secret as colon-
  separated token; uses set_ns + domain_info commands. Distinct from
  the existing internal/dynadot package which is the DNS-record writer
  for OpenOva-managed pool domains (different concern: pool DNS vs.
  customer-domain registrar NS-flip)

Token hygiene (per docs/INVIOLABLE-PRINCIPLES.md #10)
- Tokens never persisted: in-memory only for the duration of the call
- Never logged: handler uses classifyOutcome to render redacted
  outcome labels, never the raw error message or token
- Never echoed in responses
- TestSetNSResponseDoesNotEchoToken + TestSetNSHappy assert no token
  bytes appear in JSON body or zerolog/slog output
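
The redaction idea above can be sketched as follows. This is TypeScript for illustration only: the handler described in this commit is Go, and the signature of its classifyOutcome is not shown here, so the function below, its label strings and the logLine helper are all assumptions. The error names are the typed errors listed under "Interface".

```typescript
// Hypothetical fixed set of outcome labels that are safe to log.
type OutcomeLabel =
  | "ok"
  | "invalid-token"
  | "rate-limited"
  | "domain-not-in-account"
  | "api-unavailable"
  | "unsupported-registrar"
  | "internal-error";

// Maps an error name to a fixed label. The raw error message is never
// consulted, because upstream error text could contain the token.
function classifyOutcome(errName: string | null): OutcomeLabel {
  switch (errName) {
    case null:
      return "ok";
    case "ErrInvalidToken":
      return "invalid-token";
    case "ErrRateLimited":
      return "rate-limited";
    case "ErrDomainNotInAccount":
      return "domain-not-in-account";
    case "ErrAPIUnavailable":
      return "api-unavailable";
    case "ErrUnsupportedRegistrar":
      return "unsupported-registrar";
    default:
      return "internal-error";
  }
}

// Hypothetical log-line builder: it only ever sees the registrar, the domain
// and the error name, so the token cannot reach the output.
function logLine(registrar: string, domain: string, errName: string | null): string {
  return (
    "set-ns registrar=" + registrar +
    " domain=" + domain +
    " outcome=" + classifyOutcome(errName)
  );
}
```

Because the logging path takes no token argument at all, "never logged" holds by construction rather than by careful filtering.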

Tests
- 74 new unit tests (httptest server per adapter):
  cloudflare 11, dynadot 11, godaddy 11, namecheap 13, ovh 12,
  handler 14, registrar interface 2
- Each adapter covers: happy path, bad-token, rate-limited (429),
  bad-domain (404 / not-in-account), empty-NS guard, name+default
- OVH signature math verified deterministically via injected nowFn

Acceptance (issue #170)
- All 5 adapters pass their unit tests
- PDM /api/v1/registrar/{r}/set-ns endpoint live
- Wired into cmd/pdm/main.go: every adapter registered at startup

Per docs/INVIOLABLE-PRINCIPLES.md #4 (never hardcode), each adapter's
BaseURL is constructor-default + struct-overridable, so tests inject
httptest endpoints without environment shenanigans.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 07:46:30 +02:00
github-actions[bot]
5a13be8559 deploy: update catalyst images to 2854d65
2026-04-29 04:54:25 +00:00
Emrah Baysal
2854d652eb merge: pool-domain-manager (closes #163 phases 1-4)
Brings the pool-domain-manager service, catalyst-api integration, CI
workflow, and Crossplane Composition onto main. Phase 5 (deploy) lands
as a separate openova-private commit; Phase 6 (verification curl)
follows once the image is published and the Flux reconciliation cycle
finishes.
2026-04-29 06:46:31 +02:00
hatiyildiz
31b03ce02a ci(pdm)+platform(crossplane): build workflow + XDynadotPoolAllocation composition (Phase 3+4 of #163)
CI workflow (.github/workflows/pool-domain-manager-build.yaml) mirrors
the marketplace-api / catalyst-api shape:

  - Triggers on push to core/pool-domain-manager/** + workflow_dispatch
  - Runs unit tests (reserved + dynadot — the integration suite needs a
    real Postgres which the workflow does not provide; full integration
    runs in test-bootstrap-api.yaml against an ephemeral CNPG)
  - Builds and pushes ghcr.io/openova-io/openova/pool-domain-manager:<sha>
  - Cosign-signs the image via Sigstore keyless OIDC (id-token: write)
  - Emits an SBOM attestation tied to the image digest
  - Manifest deployment is intentionally NOT in this workflow — PDM
    manifests live in the openova-private repo per the issue body, so
    the Flux Kustomization there picks up the new SHA via a follow-up
    private-repo commit (Phase 6 of #163)

Crossplane composition (platform/crossplane/compositions/xrd-pool-
allocation.yaml + composition-pool-allocation.yaml) wraps PDM as a
declarative Crossplane Resource:

  apiVersion: compose.openova.io/v1alpha1
  kind: XDynadotPoolAllocation
  spec:
    parameters:
      poolDomain:    omani.works
      subdomain:     omantel
      sovereignFQDN: omantel.omani.works
      loadBalancerIP: 1.2.3.4
      createdBy:     crossplane

The Composition uses provider-http (crossplane-contrib/provider-http) to
render the XR into a Reserve → Commit sequence of HTTP calls against
PDM's in-cluster service URL. Per docs/INVIOLABLE-PRINCIPLES.md #3 we use
provider-http rather than bespoke Go to keep the day-2 lifecycle
declarative. Operators who want to pre-allocate a name (e.g. reserve
'omantel.omani.works' for a Sovereign that hasn't been provisioned yet)
commit YAML to Git and Flux+Crossplane converge.

Refs: #163
2026-04-29 06:46:11 +02:00
hatiyildiz
01183cb44c feat(catalyst-api): wire pool-domain-manager into the wizard lifecycle (Phase 2 of #163)
The wizard's StepDomain debounced check, the deployment-creation reserve,
the post-tofu-apply commit, and the on-failure release now all flow
through the pool-domain-manager service that landed in the previous
commit. The DNS-wildcard regression at omani.works (where every
subdomain resolved to 185.53.179.128 because of the apex parking record
and broke the LookupHost-based check) is now FIXED STRUCTURALLY:

  - Managed pools: route through PDM, which has zero DNS dependency.
  - BYO domains:   keep the legacy LookupHost path because the customer
                   owns the zone — that nameserver IS the source of truth.

Files changed:

  internal/pdm/client.go (new)
    Tiny HTTP client for PDM (Check, Reserve, Commit, Release) plus a
    package-level IsManagedDomain runtime resolver that mirrors the legacy
    catalyst-api dynadot.IsManagedDomain semantics WITHOUT importing the
    dynadot package. The DYNADOT_MANAGED_DOMAINS env var is the contract;
    PDM is the writer of any actual Dynadot side-effect.

  internal/handler/handler.go
    New(...) reads POOL_DOMAIN_MANAGER_URL from env (default = in-cluster
    service FQDN). NewWithPDM(client) is exposed for tests so a fake can
    be injected without spinning up a real HTTP server. Per docs/INVIOLABLE-
    PRINCIPLES.md #4 the URL is configuration, not code.

  internal/handler/subdomains.go (rewritten)
    Removed: net.LookupHost on '<sub>.<pool>' for managed pools. Removed:
    duplicate reservedSubdomains map (lives ONLY in PDM now). Added:
    h.checkManagedPool() that delegates to PDM and surfaces PDM's
    Available/Reason/Detail verbatim. Added: h.checkBYO() that keeps the
    legacy DNS path for non-managed domains. Defence in depth: when PDM
    URL is misconfigured the handler returns reason='pdm-unavailable'
    rather than silently falling back to DNS (which would resurrect the
    wildcard bug).

  internal/handler/deployments.go
    CreateDeployment now reserves the pool subdomain via PDM BEFORE
    launching the runProvisioning goroutine, captures the
    reservation_token onto the Deployment struct, and returns 409 on
    PDM ErrConflict so the wizard's StepReview can surface the race
    cleanly. runProvisioning issues PDM /commit on success (with the
    LB IP) or /release on failure. PDM owns the eventual Dynadot write —
    catalyst-api never calls api.dynadot.com directly for the wizard's
    lifecycle after this lands.

  internal/handler/{subdomains,deployments}_test.go (new)
    Subdomains: prove (a) managed pool delegates to PDM and surfaces
    PDM's response verbatim, (b) DNS-wildcard parking records cannot
    cause Available=false for any random subdomain (regression guard
    for #163), (c) PDM returns active-state → handler returns
    Available=false with the right reason, (d) BYO falls back to DNS
    correctly, (e) invalid label short-circuits before PDM is called,
    (f) PDM unavailable surfaces 'pdm-unavailable' rather than
    silently succeeding.
    Deployments: prove (a) managed pool reserves via PDM exactly once,
    (b) PDM 409 conflict on reserve blocks the deployment with HTTP
    409, (c) BYO mode does NOT consult PDM.
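
The reserve, then commit-or-release ordering described for deployments.go can be sketched like this. It is a synchronous TypeScript illustration, not the Go handler: the PdmClient interface, the ReserveResult shape and runDeployment are hypothetical names, and the real flow runs provisioning in a goroutine after the HTTP response has been sent.

```typescript
// Hypothetical result of a reserve call: a token on success, or a conflict.
type ReserveResult =
  | { kind: "reserved"; token: string }
  | { kind: "conflict" };

// Hypothetical stand-in for the PDM client (Reserve, Commit, Release).
interface PdmClient {
  reserve(pool: string, subdomain: string): ReserveResult;
  commit(pool: string, subdomain: string, token: string, loadBalancerIP: string): void;
  release(pool: string, subdomain: string, token: string): void;
}

type DeploymentOutcome = "conflict" | "committed" | "released";

// Reserve first; provision only if the reservation succeeded; commit with the
// load-balancer IP on success, release on failure.
function runDeployment(
  pdm: PdmClient,
  pool: string,
  subdomain: string,
  provision: () => string // returns the load-balancer IP, throws on failure
): DeploymentOutcome {
  const reservation = pdm.reserve(pool, subdomain);
  if (reservation.kind === "conflict") {
    return "conflict"; // the handler maps this to HTTP 409
  }

  let loadBalancerIP = "";
  let failed = false;
  try {
    loadBalancerIP = provision();
  } catch (e) {
    failed = true;
  }

  if (failed) {
    pdm.release(pool, subdomain, reservation.token);
    return "released";
  }
  pdm.commit(pool, subdomain, reservation.token, loadBalancerIP);
  return "committed";
}
```

Because the client is an interface, a test can pass a recording fake and assert the call order, which is the same reason the commit exposes NewWithPDM.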

Architectural compliance:

  - Principle #4 (never hardcode): every URL/domain/region is runtime
    configuration. POOL_DOMAIN_MANAGER_URL has a sane default so the
    common case 'just works' but is overridable for air-gap installs.
  - Principle #2 (no quality compromise): the PDM lifecycle is the
    target-state shape. Reserve before tofu apply guarantees a name
    can't be double-allocated by a parallel wizard tab. Commit AFTER
    tofu apply guarantees we don't write DNS for a Sovereign that
    doesn't exist yet.
  - Lesson #24 (don't bypass off-the-shelf primitives): the catalyst-api
    no longer carries its own copy of the reserved-name list, no longer
    calls Dynadot directly for the lifecycle, and no longer does DNS-
    based availability checks for managed pools. PDM IS the off-the-
    shelf primitive for this concern; we use it.

Refs: #163
2026-04-29 06:44:22 +02:00
hatiyildiz
585b046f5d feat(pdm): pool-domain-manager service skeleton (Phase 1 of #163)
Build a new Go service core/pool-domain-manager that becomes the SOLE
authority for OpenOva-pool subdomain allocation across the fleet.

Why this exists: today products/catalyst/bootstrap/api/internal/handler/
subdomains.go does naive net.LookupHost() to decide whether a candidate
subdomain is taken. Dynadot's wildcard parking record at the apex of
omani.works (and any future pool domain) makes EVERY subdomain resolve
to 185.53.179.128, so the check rejects everything. DNS is the wrong
source of truth for an OpenOva-managed pool — the central control plane
must own the allocation table.

What this commit adds (no integration with catalyst-api yet — that lands
in a follow-up commit):

  core/pool-domain-manager/
    cmd/pdm/main.go                     chi router, healthz, sweeper boot
    api/openapi.yaml                     wire contract for every endpoint
    Containerfile                        alpine final stage, UID 65534
    internal/store/                      pgx + CNPG; pool_allocations table
      migrations.sql                       idempotent CREATE TABLE schema
      store.go                             Reserve/Get/Commit/Release/List
      store_test.go                        integration tests (PDM_TEST_DSN)
    internal/dynadot/                    moved + extended; SOLE Dynadot caller
      dynadot.go                           AddRecord, AddSovereignRecords,
                                           DeleteSubdomainRecords (read-modify-
                                           write to honour feedback_dynadot_dns)
      dynadot_test.go                      managed-domain resolution tests
    internal/reserved/                   centralised reserved-name list
      reserved.go                          IsReserved/All; pulled out of
                                           catalyst-api's subdomains.go
    internal/handler/                    HTTP surface
      handler.go                           /api/v1/pool/{domain}/{check,reserve,
                                           commit,release,list}, /healthz,
                                           /api/v1/reserved
    internal/allocator/                  state machine + sweeper goroutine

Architecture choices and how they map to docs/INVIOLABLE-PRINCIPLES.md:

  - Principle #4 (never hardcode): every value (PORT, PDM_DATABASE_URL,
    DYNADOT_MANAGED_DOMAINS, PDM_RESERVATION_TTL, PDM_SWEEPER_INTERVAL)
    flows from env vars; the K8s ExternalSecret will populate them at
    deploy time. The reserved-subdomain list lives in ONE place
    (internal/reserved); catalyst-api will not duplicate it.

  - Principle #2 (no quality compromise): the state machine commits the
    DB row before the Dynadot side-effect, so a crash between the two
    leaves the system in a recoverable state (operator runs Release).
    The reservation_token in the row protects against stale-tab commit
    races. UPSERT semantics + a CHECK constraint mean two operators
    racing /reserve get a clean 23505 (unique_violation) → HTTP 409.

  - Principle #3 (follow architecture): PDM is a ClusterIP service in
    openova-system — it is not a Crossplane provider, not a Flux
    HelmRelease, not bespoke OpenTofu state. catalyst-api speaks to it
    via plain HTTP. The Crossplane Composition that wraps PDM as a
    declarative MR (XDynadotPoolAllocation) lands in a follow-up phase.

The DNS-wildcard problem the issue describes is fixed STRUCTURALLY here:
PDM never calls net.LookupHost. The /check path is a single SELECT
against pool_allocations. omani.works's wildcard A record at the apex
becomes architecturally irrelevant.
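
A minimal in-memory sketch of the allocation table behaviour described above: /check as a plain lookup, reserve conflicts, token-checked commit, TTL expiry and the sweeper. It is TypeScript for illustration; the service itself is Go on Postgres. The class name, method names, return labels and the rule that an expired reservation counts as absent are assumptions, not taken from the PDM source.

```typescript
type AllocState = "reserved" | "active";

interface Allocation {
  state: AllocState;
  token: string;
  expiresAt: number; // only meaningful while state is "reserved"
}

// Hypothetical in-memory stand-in for the pool_allocations table.
class PoolAllocations {
  private rows: { [fqdn: string]: Allocation } = {};

  constructor(private ttlMs: number) {}

  // Active rows always count; reserved rows count until their TTL expires.
  private live(fqdn: string, now: number): Allocation | undefined {
    const row = this.rows[fqdn];
    if (row === undefined) return undefined;
    if (row.state === "reserved" && row.expiresAt <= now) return undefined;
    return row;
  }

  // Availability is a table lookup; DNS is never consulted.
  check(fqdn: string, now: number): boolean {
    return this.live(fqdn, now) === undefined;
  }

  reserve(fqdn: string, token: string, now: number): "reserved" | "conflict" {
    if (this.live(fqdn, now) !== undefined) return "conflict";
    this.rows[fqdn] = { state: "reserved", token: token, expiresAt: now + this.ttlMs };
    return "reserved";
  }

  // The token guards against a stale tab committing someone else's reservation.
  commit(fqdn: string, token: string, now: number): "active" | "not-found" | "token-mismatch" {
    const row = this.live(fqdn, now);
    if (row === undefined) return "not-found";
    if (row.token !== token) return "token-mismatch";
    row.state = "active";
    return "active";
  }

  release(fqdn: string): void {
    delete this.rows[fqdn];
  }

  // Sweeper pass: drop expired reservations and report how many were removed.
  sweep(now: number): number {
    let removed = 0;
    const rows = this.rows;
    Object.keys(rows).forEach(function (fqdn) {
      const row = rows[fqdn];
      if (row !== undefined && row.state === "reserved" && row.expiresAt <= now) {
        delete rows[fqdn];
        removed++;
      }
    });
    return removed;
  }
}
```

Passing the clock in as a parameter keeps the sketch deterministic, in the same spirit as the injected nowFn mentioned for the OVH adapter tests.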

Tests exercised in this commit:
  - internal/reserved: full unit coverage (case-insensitive, sorted, set
    membership)
  - internal/dynadot: managed-domain runtime resolution (env-var,
    legacy single-domain fallback, built-in defaults, list parsing)
  - internal/store: integration suite gated on PDM_TEST_DSN env var,
    covers reserve happy-path, reserve race (ErrConflict), TTL expiry
    frees, commit happy-path, commit token mismatch, release removes
    row, sweeper deletes expired rows

Closes phase 1 of #163. Phase 2 (catalyst-api wiring), Phase 3 (CI +
manifests), Phase 4 (Crossplane composition), Phase 6 (deploy +
verification curl) follow in separate commits.

Refs: #163
2026-04-29 06:37:38 +02:00
github-actions[bot]
296fd68819 deploy: update catalyst images to 16d837b
2026-04-29 04:36:47 +00:00
hatiyildiz
16d837bb81 merge: #162 — wizard StepComponents UX polish (cards + logos + tabs + brand mark)
Closes operator-facing UX gaps from issue #162:

- Phase 1: card surface pixel-matches SME marketplace AppsStep.svelte
  (height 108px, hover-reveal toggle, mask-gradient chips, etc.)
- Phase 2: 63 vendored component-logo SVGs under /component-logos/
  with logoUrl field on ComponentDef defaulting to data-driven path.
  docs/COMPONENT-LOGOS.md tracks each upstream source + licence.
- Phase 3: two-tab segregation. 'Choose Your Stack' (recommended +
  optional, search + filter + cascade-aware toggle) and
  'Always Included' (mandatory only, grouped by product, read-only,
  INFRASTRUCTURE pill). Counts derived from componentGroups, never
  hardcoded.
- Phase 4: OpenOva brand mark in wizard top bar at 32px, with
  /openova-logo.svg vendored under public/ for non-React surfaces.

Build: typecheck clean, 81 vitest tests passing (up from 65), prod
build successful with 63 logos + brand mark in dist/.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 06:34:23 +02:00
hatiyildiz
8ee52e6500 feat(wizard): #162 OpenOva brand mark on wizard top bar
Phase 4 of issue #162.

The wizard's top bar already rendered the OpenOva infinity-loop mark
via <OOLogo h={22} />. This bumps it to the spec'd 32px height and
adds a `data-testid="wizard-logo"` hook for end-to-end tests.

Also vendors the canonical brand SVG to
`products/catalyst/bootstrap/ui/public/openova-logo.svg` (sourced from
the marketing repo's logo-icon.svg). Static pages bundled with the
wizard (e.g. provision.html) and any future non-React surface can now
reference `/openova-logo.svg` directly without duplicating the path
data — single source of truth for the brand mark.

The link target is unchanged: `/app/dashboard` for SaaS, `/` for
self-hosted (which redirects to the wizard root). Both effectively
land back at the wizard home, matching the issue's "link to wizard
home" requirement.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 06:32:56 +02:00
hatiyildiz
c3b36cb170 feat(wizard): #162 SME-pixel-perfect cards + two-tab segregation
Phases 1 + 3 of issue #162.

Phase 1 — pixel-match the SME marketplace card surface:
  - Card height fixed at 108px, overflow hidden, 12px radius
  - Logo column align-self: stretch + aspect-ratio 1/1 → fills card height
  - Body padding-right: 4.5rem reserves room for top-right toggle
    button + bottom-right SELECTED pill
  - Toggle button hidden by default, opacity 0 → 1 + scale(0.8) → 1
    on hover (matches AppsStep.svelte .app-add-btn transition)
  - Hover: translateY(-2px) + accent border + 0 4px 16px shadow
  - Selected: green border + green-tinted bg
  - Chips use mask-image gradient on right edge
  - Toolbar split into rows: search row + chip row
  - Toast slide-in keyframe matches SME exactly
  - All visual rules consolidated into one <style> block via .corp-comp-*
    classes so the card surface has a single source of truth, replacing
    the previous inline-style sprinkle.

Phase 3 — two-tab segregation (operator preference: Option A):
  - Top of step renders two tabs:
      "Choose Your Stack (N)"  — N = recommended + optional currently
                                 selected (computed from store)
      "Always Included (M)"    — M = catalog mandatory count (computed
                                 from componentGroups, never hardcoded)
  - Tab 1 body: only non-mandatory components, search + category chip
    filter, sort-selected-first, all current cascade-add/remove logic
    intact. Mandatory cards do NOT appear in this tab — they were
    confusing as "locked toggles" alongside selectable components.
  - Tab 2 body: only mandatory components, grouped by product
    (PILOT, GUARDIAN, …). No search, no category chips, no toggle UI.
    "INFRASTRUCTURE" pill replaces the MANDATORY pill so users read it
    as platform infra rather than a wizard option. Subdued text colors
    (var(--wiz-text-md)) for body so the section reads informational.
  - Tab 2 header carries the foundational-platform blurb verbatim:
    "These platform components run on every Sovereign. They're
    foundational — you don't pay extra for them."
  - Tab switch is local state — selection store untouched, so
    Continue's dependency-consistency validation continues to read the
    full selectedComponents set unchanged.
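
The two tab counters and the Tab 2 grouping can be derived from the catalog along these lines. This is a sketch under stated assumptions: the ComponentDef fields, the tabCounts and mandatoryByProduct helpers, and every catalog entry except "powerdns is mandatory in SPINE" are invented for the example and do not reflect the actual componentGroups.ts.

```typescript
type Tier = "mandatory" | "recommended" | "optional";

interface ComponentDef {
  id: string;
  tier: Tier;
  product: string;
}

// Hypothetical catalog slice; the "example-*" entries are placeholders.
const CATALOG: ComponentDef[] = [
  { id: "powerdns", tier: "mandatory", product: "SPINE" },
  { id: "example-policy", tier: "mandatory", product: "GUARDIAN" },
  { id: "example-registry", tier: "recommended", product: "PILOT" },
  { id: "example-search", tier: "optional", product: "INSIGHTS" },
];

// N = selected non-mandatory components, M = mandatory components in the
// catalog. Both are computed, so neither badge can drift from the catalog.
function tabCounts(
  catalog: ComponentDef[],
  selected: string[]
): { chooseYourStack: number; alwaysIncluded: number } {
  let chosen = 0;
  let mandatory = 0;
  catalog.forEach(function (c) {
    if (c.tier === "mandatory") {
      mandatory++;
    } else if (selected.indexOf(c.id) !== -1) {
      chosen++;
    }
  });
  return { chooseYourStack: chosen, alwaysIncluded: mandatory };
}

// Tab 2 body: mandatory components only, grouped by product.
function mandatoryByProduct(catalog: ComponentDef[]): { [product: string]: string[] } {
  const groups: { [product: string]: string[] } = {};
  catalog.forEach(function (c) {
    if (c.tier !== "mandatory") return;
    const list = groups[c.product] || [];
    list.push(c.id);
    groups[c.product] = list;
  });
  return groups;
}
```

Since the tab switch is local state, both helpers read the same full selection that the Continue validation uses.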

Cards in Tab 1 now load `<img src={entry.logoUrl}>` against the
vendored SVGs from Phase 2 (commit 979ff59). The IconFallback letter
mark stays as a defensive fallback when logoUrl is null.

Vitest coverage rewritten to match the two-tab contract:
  - Tabs render + switch state + counters
  - Tab 1 only shows non-mandatory cards
  - Tab 2 only shows mandatory cards, grouped by product
  - Tab 2 has no search, no chips, no toggle
  - Logo URLs default to /component-logos/<id>.svg
Total: 81 tests passing (up from 65).

Per docs/INVIOLABLE-PRINCIPLES.md #4 — every count, tier, label is
derived from componentGroups.ts. The "Always Included (28)" example in
the issue body is replaced by a runtime-computed count so the badge
stays correct as the catalog evolves.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 06:32:45 +02:00
hatiyildiz
979ff59332 feat(wizard): vendor 63 component logos + add ComponentDef.logoUrl
Phase 2 of issue #162. Each component now resolves to a real SVG mark
under products/catalyst/bootstrap/ui/public/component-logos/<id>.svg
instead of the letter-pill fallback. ComponentDef gains an optional
logoUrl field that defaults to /component-logos/<id>.svg per id, so
swapping a file under public/ rebrands the card without touching
application source (per INVIOLABLE-PRINCIPLES.md #4 "never hardcode").
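
The default described above might look roughly like this. It is a sketch only: the ComponentDef fields shown, the resolveLogo helper and the choice to treat an explicit null as "use the letter fallback" are assumptions for illustration and may not match the actual type.

```typescript
interface ComponentDef {
  id: string;
  name: string;
  logoUrl?: string | null;
}

type Logo =
  | { kind: "image"; src: string }
  | { kind: "letter"; letter: string };

// Hypothetical resolver: an omitted logoUrl falls back to the per-id path
// under /component-logos/, an explicit URL wins, and null (assumed meaning:
// no logo available) yields a letter mark.
function resolveLogo(c: ComponentDef): Logo {
  if (c.logoUrl === null) {
    return { kind: "letter", letter: c.name.charAt(0).toUpperCase() };
  }
  if (c.logoUrl === undefined) {
    return { kind: "image", src: "/component-logos/" + c.id + ".svg" };
  }
  return { kind: "image", src: c.logoUrl };
}
```

With the path derived from the id, replacing a file under public/component-logos/ changes the card without any change to application source.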

The 63 SVGs are stylised brand-color marks, not copies of the upstream
projects' trademarked logotypes — this avoids licence ambiguity in a
public repo while still giving the wizard a visually distinctive
component grid. docs/COMPONENT-LOGOS.md tracks the canonical upstream
source for each component (CNCF artwork repo, project brand pages,
etc.) so the asset library can be audited and individual files swapped
for official art when permission/license is verified.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 06:29:23 +02:00
github-actions[bot]
4518ad67d8 deploy: update catalyst images to 3a0ffb0
2026-04-28 21:15:11 +00:00
hatiyildiz
3a0ffb0006 merge: #161 — corporate platform component grid in wizard StepComponents
Brings the SME marketplace UX pattern (search + category chips +
flat sort-selected-first card grid) to the corporate Sovereign-bootstrap
wizard's StepComponents page, backed by the 60+ platform component
catalog in componentGroups.ts with dependency-aware cascading
selection.

  • componentGroups.ts — adds dependencies graph (Harbor → cnpg+seaweedfs
    +valkey, Keycloak/Gitea → cnpg, Velero/Loki/Mimir/Tempo → seaweedfs,
    External Secrets → openbao, cert-manager/k8gb → external-dns, …)
    plus catalog helpers (findComponent, resolveTransitiveDependencies,
    resolveTransitiveDependents, computeDefaultSelection).

  • StepComponents.tsx — flat grid mirroring AppsStep.svelte. Tier
    badges (MANDATORY/RECOMMENDED/OPTIONAL), "Includes:" hints, toast
    notifications, cascade-aware confirm modal on destructive removes.

  • Wizard store — selectedComponents repurposed as sorted, deduped
    string[], with addComponent/removeComponent walking the dependency
    graph; mandatory ids cannot be removed; persist.merge() seeds
    defaults on first run and force-keeps mandatory ids.

  • 38 new vitest tests covering catalog sanity, search, category
    filter, sort, mandatory rejection, cascading add/remove, store
    invariants, reverse-graph helpers (54 total in StepComponents.test).

  • catalog.generated.ts untouched — StepProvisioning timeline still
    reads the same 10 platform-infra blueprint entries for SSE phase
    labelling.
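
The re-hydration rules listed for the wizard store (seed defaults on first run, keep the list sorted and de-duplicated, always keep mandatory ids) can be sketched as a pure function. The name mergeSelection and its parameters are hypothetical; the store's actual persist.merge() takes whole state objects rather than id lists. Dropping ids that are no longer in the catalog follows the store description in the phase 2+3 commit.

```typescript
// Sorts and removes duplicates without mutating the input.
function sortedUnique(ids: string[]): string[] {
  return ids.slice().sort().filter(function (id, i, all) {
    return i === 0 || all[i - 1] !== id;
  });
}

// Hypothetical merge step for a persisted selection:
//  - nothing persisted yet: seed with the default selection
//  - otherwise: drop ids the catalog no longer knows
//  - in both cases: force-include every mandatory id
function mergeSelection(
  persisted: string[] | undefined,
  catalogIds: string[],
  mandatoryIds: string[],
  defaultSelection: string[]
): string[] {
  if (persisted === undefined) {
    return sortedUnique(defaultSelection.concat(mandatoryIds));
  }
  const known = persisted.filter(function (id) {
    return catalogIds.indexOf(id) !== -1;
  });
  return sortedUnique(known.concat(mandatoryIds));
}
```

Keeping this step pure makes the store invariants (sorted, unique, mandatory always present) straightforward to assert in a unit test.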

Refs: GitHub issue #161
2026-04-28 23:13:00 +02:00
hatiyildiz
e8b095db34 test(wizard-step5): #161 phase 4 — vitest coverage for component grid
38 new tests across 9 describe blocks covering every behaviour the
corporate component grid promises:

  catalog sanity (6)
    - 60+ components, every tier valid, every dep edge points at a real
      catalog id, Harbor → cnpg+seaweedfs+valkey, OpenSearch has no deps,
      and Reloader / KEDA / VPA / Cilium / Crossplane / Flux are dep-free.

  card grid (4)
    - one card rendered per catalog entry, "Selected (N)" counter live,
      "Includes:" hint visible for components with deps, MANDATORY tier
      badge shown for mandatory cards.

  search filter (3)
    - narrows by name, by group name, shows empty state on no match.

  category filter (3)
    - 9 group chips + "All", chip click narrows the grid, second click
      clears the filter.

  sort: selected first (1)
    - newly-selected components float to the top of their group.

  mandatory cards (2)
    - clicking them never deselects, emits a "mandatory" toast.

  cascading add (4)
    - addComponent('milvus') pulls in seaweedfs, addComponent('harbor')
      pulls in cnpg+seaweedfs+valkey, the UI emits a single toast naming
      every cascaded dep, action is idempotent.

  cascading remove (4)
    - confirm dialog opens with the impact set listed, cancel keeps the
      component selected, confirm cascades through the entire impact
      set, mandatory ids stay even when their dep is removed.

  store invariants (5)
    - selectedComponents always sorted, de-duplicated by
      setSelectedComponents, legacy SelectedComponent[] is normalised to
      ids by setComponents, resetSelectedComponentsToDefault restores
      mandatory + recommended + their deps, every mandatory id is in the
      default selection along with its transitive deps.

  reset to defaults (1) + reverse-graph helpers (3)

Test results: 54 passed, 0 failed (37 new + 17 from StepSuccess.test.tsx).
typecheck clean, build succeeds (10 platform-infra blueprints in
catalog.generated.ts as expected).

Refs: GitHub issue #161

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-28 23:11:54 +02:00
hatiyildiz
554c914161 feat(wizard-step5): #161 phase 2+3 — corporate platform component grid
Rewrite the wizard's StepComponents page so it renders the SME marketplace
UX pattern with the corporate platform catalog (60+ components from
componentGroups.ts) instead of the bp-<x> Blueprint card grid which is
empty today (no published Blueprint has visibility:listed).

UX (mirrors core/marketplace/AppsStep.svelte):
  - Top: search input + 9 product-group category chips (PILOT, SPINE,
    SURGE, SILO, GUARDIAN, INSIGHTS, FABRIC, CORTEX, RELAY) + "All".
  - Body: flat 3-column card grid sorted selected-first then alphabetical.
  - Each card: name + group pill + description + tier badge
    (MANDATORY locked / RECOMMENDED / OPTIONAL) + "Includes: X, Y, Z" hint
    when the component has dependencies + circle add/remove button.
  - Selected counter top-right: "Selected (N) of M" with progress bar.

Dependency-aware selection (cascade graph from componentGroups.ts):
  - Adding Harbor cascades → cnpg + seaweedfs + valkey are auto-added,
    one toast announces all three.
  - Removing cnpg shows a confirm modal listing every dependent that will
    also be removed (Harbor, Keycloak, Gitea, OpenMeter, Temporal, …).
    User clicks "Remove all" or "Keep" — no silent destruction.
  - Mandatory cards are locked: clicking emits a one-shot toast
    "X is mandatory" and the toggle is a no-op.

Store (entities/deployment/store.ts):
  - selectedComponents: string[] — new sorted, de-duplicated id list,
    persisted via Zustand. Replaces the previously-unused
    SelectedComponent[] shape; legacy toggleComponent / setComponents
    still accept the old record form and normalise to ids on the way in.
  - addComponent(id) walks resolveTransitiveDependencies and adds every
    reachable component, idempotent.
  - removeComponent(id) walks resolveTransitiveDependents (reverse
    edges) and removes the impact set; mandatory components are skipped.
    UI is responsible for confirming with the user before calling.
  - resetSelectedComponentsToDefault() — restores the
    "every mandatory + recommended + their deps" baseline.
  - persist.merge() seeds selectedComponents with the default selection
    on first run, drops orphan ids, and force-includes every mandatory
    component on re-hydration.
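
A minimal sketch of the re-hydration rules listed above (seed on first run, drop orphans, force-include mandatory, keep the list sorted and de-duplicated). The helper names, the Catalog shape and the ids used in the usage note are assumptions for illustration, not the actual store.ts code:

```typescript
// Hypothetical shape; the real data is read from componentGroups.ts.
interface Catalog {
  knownIds: string[];         // every id that exists in the catalog
  mandatoryIds: string[];     // ids that can never be deselected
  defaultSelection: string[]; // mandatory + recommended + their deps
}

// Sorted, de-duplicated id list: the invariant the store maintains.
function normaliseSelection(ids: string[]): string[] {
  return Array.from(new Set(ids)).sort();
}

// Assumed equivalent of the persist.merge() step for selectedComponents.
function rehydrateSelection(
  persisted: string[] | undefined,
  catalog: Catalog,
): string[] {
  // First run: nothing persisted yet, so seed with the default selection.
  if (persisted === undefined) {
    return normaliseSelection(catalog.defaultSelection);
  }
  const known = new Set(catalog.knownIds);
  // Drop orphan ids that no longer exist in the catalog.
  const kept = persisted.filter((id) => known.has(id));
  // Force-include every mandatory component, then restore the invariant.
  return normaliseSelection([...kept, ...catalog.mandatoryIds]);
}
```

For example, with a hypothetical catalog whose mandatory ids are cilium and flux, a persisted list of ["harbor", "ghost", "harbor"] would re-hydrate to ["cilium", "flux", "harbor"]: the unknown id is dropped, the duplicate collapses, and both mandatory ids are added back.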

Model (entities/deployment/model.ts):
  - INITIAL_WIZARD_STATE.selectedComponents = computeDefaultSelection()
    so the wizard opens with mandatory + recommended already on. The
    legacy SelectedComponent record type is retained for back-compat.

Per docs/INVIOLABLE-PRINCIPLES.md #4 every dependency edge, tier and
label is read from componentGroups.ts — no hardcoded mappings in the
component or store.

Refs: GitHub issue #161

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-28 23:11:54 +02:00
hatiyildiz
9d852dceab feat(wizard-step5): #161 phase 1 — add dependency graph to componentGroups
Extend ComponentDef with `dependencies: string[]` so the corporate wizard
StepComponents grid can cascade-add and cascade-remove platform components
based on real-world platform-engineering knowledge. Mandatory tier still
locks components on; recommended/optional cards now know their pre-reqs.

Dependency-aware mappings (representative):
  Harbor              → cnpg, seaweedfs, valkey
  Keycloak / Gitea    → cnpg
  Velero / Iceberg    → seaweedfs
  Grafana / Loki /
  Mimir / Tempo       → seaweedfs
  External Secrets    → openbao
  cert-manager / k8gb → external-dns
  FerretDB / OpenMeter
  / Temporal / LangFuse
  / LibreChat / Matrix
  / Superset          → cnpg
  Milvus              → seaweedfs
  Debezium            → strimzi
  OpenSearch          → none (own storage)
  OpenBao / KEDA / VPA
  / Reloader / Cilium
  / Crossplane / Flux → none

Adds catalog helpers — findComponent, resolveTransitiveDependencies,
findDependents, resolveTransitiveDependents, isMandatory,
computeDefaultSelection — that the wizard store and StepComponents.tsx
will consume in phase 2 to power the cascade-aware selection model.

Per docs/INVIOLABLE-PRINCIPLES.md #4 the dependency table is the single
source of truth — no app-side knowledge of which component implies which.
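
A minimal sketch of how the forward and reverse walks could look. The helper names match the ones listed above, but the bodies and the trimmed-down catalog are assumptions for illustration; the real definitions live in componentGroups.ts:

```typescript
// Hypothetical, trimmed-down catalog using a few edges from the table above.
interface ComponentDef {
  id: string;
  dependencies: string[];
}

const CATALOG: ComponentDef[] = [
  { id: "harbor", dependencies: ["cnpg", "seaweedfs", "valkey"] },
  { id: "keycloak", dependencies: ["cnpg"] },
  { id: "milvus", dependencies: ["seaweedfs"] },
  { id: "cnpg", dependencies: [] },
  { id: "seaweedfs", dependencies: [] },
  { id: "valkey", dependencies: [] },
];

function findComponent(id: string): ComponentDef | undefined {
  return CATALOG.find((c) => c.id === id);
}

// Every component reachable by following dependency edges from `id`.
function resolveTransitiveDependencies(id: string): string[] {
  const seen = new Set<string>();
  const stack = [...(findComponent(id)?.dependencies ?? [])];
  while (stack.length > 0) {
    const next = stack.pop() as string;
    if (seen.has(next)) continue; // already visited, also guards cycles
    seen.add(next);
    stack.push(...(findComponent(next)?.dependencies ?? []));
  }
  return Array.from(seen).sort();
}

// Direct dependents: components that list `id` as a dependency.
function findDependents(id: string): string[] {
  return CATALOG.filter((c) => c.dependencies.indexOf(id) !== -1).map(
    (c) => c.id,
  );
}

// Every component affected if `id` is removed (walks reverse edges).
function resolveTransitiveDependents(id: string): string[] {
  const seen = new Set<string>();
  const stack = findDependents(id);
  while (stack.length > 0) {
    const next = stack.pop() as string;
    if (seen.has(next)) continue;
    seen.add(next);
    stack.push(...findDependents(next));
  }
  return Array.from(seen).sort();
}
```

With this sample catalog, resolving the dependencies of harbor yields cnpg, seaweedfs and valkey, and resolving the dependents of cnpg yields harbor and keycloak, which is the impact set a cascade-remove would need to confirm.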

Refs: GitHub issue #161

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-28 23:11:12 +02:00
github-actions[bot]
678f5dc604 deploy: update catalyst images to 1036f68 2026-04-28 21:03:27 +00:00
Emrah Baysal
1036f68522 merge: #160 — SSH keypair UX in wizard (auto-generate + paste-existing)
Brings in feat/wizard-ssh-key-ux:
  • POST /api/v1/sshkey/generate (Ed25519 + OpenSSH wire format,
    fingerprint-only logging, no on-disk persistence)
  • StepCredentials two-mode SSH section (Mode A generate / Mode B paste)
    with one-time private-key download + warning banner
  • Wizard store: sshPublicKey + private-blob held in memory only,
    stripped from localStorage by partialize()
  • StepReview now wires store.sshPublicKey into the deployment payload —
    fixes the previous TODO that submitted an empty key
  • RFC algorithm allow-list mirrors infra/hetzner/variables.tf regex
  • UI tests: 27 vitest tests pass (typecheck + build clean)
  • Go tests: sshkey_test.go covers PEM/wire-format/fingerprint shape

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-28 23:01:01 +02:00
hatiyildiz
e7caa0696f feat(catalyst-wizard): SSH keypair UX in StepCredentials — auto-generate + paste-existing
Closes #160 ([I] ux: SSH keypair UX in wizard).

Backend (Go):
  - Add POST /api/v1/sshkey/generate handler at
    products/catalyst/bootstrap/api/internal/handler/sshkey.go.
  - Generates an Ed25519 keypair via crypto/ed25519 + rand.Reader,
    encodes the public half to OpenSSH authorized_keys wire format
    and the private half to PEM-armoured openssh-key-v1 (no passphrase),
    returns SHA256 fingerprint matching `ssh-keygen -lf`.
  - Logs ONLY the fingerprint per credential-hygiene principle #10 —
    private key never written to disk; comment derived from caller-
    supplied FQDN, never hardcoded.
  - Wire into chi router in cmd/api/main.go.
  - sshkey_test.go covers response shape, authorized_keys format, PEM
    decode + openssh-key-v1 magic header, fingerprint length/format,
    two-call uniqueness, default comment fallback.

Frontend (React + Zustand):
  - Extend StepCredentials with an SSHKeySection — two-mode UX:
      Mode A (Generate keypair) — POST /api/v1/sshkey/generate, capture
        public key + fingerprint into store, trigger Blob-URL download
        of the private key as `<fqdn-or-catalyst>.pem`, show one-time
        warning banner ("Private key shown once. Save it now or you
        lose access.") with re-download + re-generate buttons.
      Mode B (Paste existing public key) — textarea, RFC validation
        regex matching infra/hetzner/variables.tf (ssh-ed25519 / ssh-rsa
        / ecdsa-sha2-nistp256/384/521), inline error on malformed input.
  - Wizard's Continue button is now gated on isValidSSHPublicKey(store.sshPublicKey).
  - Wire store.sshPublicKey into the StepReview deployment payload —
    replaces the previous `sshPublicKey: ''` TODO.
  - Store extension: sshPublicKey, sshKeyGeneratedThisSession,
    sshPrivateKeyOnce, sshFingerprint + setSshPublicKey,
    setSshGenerated, clearSshPrivateKey actions; partialize() strips
    the private blob + session flag from localStorage so a fresh
    tab always re-prompts (credential hygiene #10).
  - Vitest (StepCredentials.test.tsx) covers both modes:
    request shape, store population, download trigger (URL.createObjectURL
    + anchor.click spies), one-time-warning render, HTTP-500 path leaves
    store empty, paste validation accepts/rejects per algorithm whitelist.
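
A minimal sketch of the Mode B paste validation. The function name isValidSSHPublicKey comes from this commit; the pattern below is an assumption (algorithm from the allow-list, a base64 blob, an optional comment) and may differ from the actual regex in infra/hetzner/variables.tf:

```typescript
// Assumed pattern: "<algorithm> <base64 blob> [comment]".
// Only the algorithm allow-list is taken from the commit message.
const SSH_PUBLIC_KEY_PATTERN =
  /^(ssh-ed25519|ssh-rsa|ecdsa-sha2-nistp(256|384|521)) [A-Za-z0-9+\/]+={0,2}( .+)?$/;

function isValidSSHPublicKey(value: string): boolean {
  // Pasted keys often carry a trailing newline, so trim before matching.
  return SSH_PUBLIC_KEY_PATTERN.test(value.trim());
}
```

Under this assumed pattern an `ssh-ed25519` line with a blob and comment is accepted, while an empty string, a bare algorithm name, or an algorithm outside the allow-list (for example `ssh-dss`) is rejected. The check is shape-only; it does not decode the blob.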

OpenTofu integration:
  - provisioner.Request.SSHPublicKey was already declared from group J;
    StepReview now feeds it the captured public half so `tofu apply`
    receives a non-empty value and the variables.tf regex validator
    accepts the run.

Tests:
  - npm run typecheck PASS (zero errors).
  - npm run test PASS (27/27 tests across 2 files).
  - npm run build PASS (vite production bundle 862 kB).
  - Go unit tests run in CI (no Go toolchain on the build host).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-28 23:00:37 +02:00
hatiyildiz
2ee2e8d24d fix(catalyst-platform): unblock cutover Kustomization — revert Helm templating
919514c added Helm template expressions (`{{ .Values.* }}`) into
products/catalyst/chart/templates/ingress.yaml + ui-deployment.yaml +
ui-configmap.yaml + values.yaml. These files are consumed by the
catalyst-platform Flux Kustomization on Catalyst-Zero (Contabo), which
goes through kustomize-controller — not helm-controller — so the
template expressions are NOT rendered.

Failure observed in production:
  catalyst-platform kustomize build failed: updating name reference in
  spec/ingressClassName field of Ingress.networking/console-sovereign:
  path config error; no name field in node

The ingressClassName template expression broke kustomize's name-reference
resolver. The ConfigMap with Helm expressions in nginx config strings
would have left nginx unable to resolve upstreams at runtime.

Surgical revert:
- ingress.yaml, ui-deployment.yaml: back to pre-919514c plain YAML
- ui-configmap.yaml, values.yaml: deleted (had no plain-YAML predecessor)

The values-driven /sovereign nginx routing remains the right target
state — but the path forward is to convert catalyst-platform to a Flux
HelmRelease (helm-controller renders templates), not to mix Helm
templates into a kustomize-applied directory. Tracking ticket follows.
2026-04-28 22:48:02 +02:00
hatiyildiz
2323e74048 merge: Group L — Playwright UI smoke tests (#142, #143, #144) 2026-04-28 19:54:28 +02:00
hatiyildiz
55b8a18b32 test(e2e): #142, #143, #144 — Playwright UI smoke tests for sovereign wizard, admin vouchers, marketplace bp-<x> grid
Group L closes the three UI smoke-test gaps the verify-sweep flagged:

  #142 sovereign wizard       — tests/e2e/playwright/tests/sovereign-wizard.spec.ts
  #143 admin voucher UI       — tests/e2e/playwright/tests/admin-vouchers.spec.ts
  #144 unified bp-<x> grid    — tests/e2e/playwright/tests/marketplace-cards.spec.ts

Tests target the actual shipped UI shape (Pass 105+):

* Wizard step model is StepOrg → StepTopology → StepProvider →
  StepCredentials → StepComponents → StepReview, not the original ticket's
  StepDomain/StepHetzner draft from before the unified-Blueprints refactor.
* Admin voucher model uses an `active` toggle, not ISSUED/REVOKED status.
* "Marketplace card grid" = the Catalyst wizard's StepComponents (bp-<x>
  Blueprints), NOT the SME marketplace at core/marketplace (which is for
  SaaS Apps). Today every Blueprint is `visibility: unlisted`, so the test
  asserts the data layer (catalog.generated.ts) plus the documented
  EmptyState; once `visibility: listed` lands, the third assertion
  auto-extends to the rendered card grid.

Per principle #4 ("never hardcode"), all URLs come from env vars with
sensible local-dev defaults. Per principle #1 ("never speculate"), tests
self-skip with explicit reasons when their target app isn't reachable
instead of failing noisily.

CI: .github/workflows/playwright-smoke.yaml boots the Catalyst UI in the
background and runs the suite on PRs touching UI sources or tests; admin
and marketplace specs self-skip in that workflow because spinning up all
three Astro apps + catalyst-api + Postgres is the full E2E pipeline's
job, not this smoke.

Local run (Catalyst UI on :4399, admin on :4398): 5 passed, 2 skipped
(skip reasons: marketplace #3 needs StepComponents reachable past
required-field gating; admin #2 needs ADMIN_TEST_COOKIE for an
authenticated session).

Refs: #142, #143, #144
2026-04-28 19:54:04 +02:00
hatiyildiz
919514ca78 merge: /sovereign nginx routing — values-driven /sovereign + /api/v1 (a35da92) 2026-04-28 19:50:39 +02:00
hatiyildiz
a35da929f1 feat(sovereign-route): values-driven /sovereign + /api/v1 routing
Per docs/INVIOLABLE-PRINCIPLES.md #4 (never hardcode), the catalyst-ui
nginx config now flows from values.yaml at chart-render time:

- routing.basePath (/sovereign) — also drives ingress strip-prefix
- routing.catalystApi.serviceDNS — in-cluster reverse-proxy target
- routing.catalystApi.port — upstream port
- dns.resolverIP — CoreDNS for proxy-time resolution (avoids stale
  ClusterIP after catalyst-api restarts)
- ingress.host / ingress.priority / ingress.className

Files:
- products/catalyst/chart/values.yaml — new, documents every default
- products/catalyst/chart/templates/ui-configmap.yaml — new, nginx
  reverse-proxies /api/* to catalyst-api Service DNS
- products/catalyst/chart/templates/ui-deployment.yaml — mounts the
  ConfigMap at /etc/nginx/conf.d/default.conf
- products/catalyst/chart/templates/ingress.yaml — values-driven host
  + path + priority + class
- tests/e2e/sovereign-routing/* — Playwright smoke for the routing

Captured from stalled agent /tmp/agent-sovereign-route-finish — agent
stream watchdog timed out after the work was authored but before commit.
2026-04-28 19:48:40 +02:00
hatiyildiz
8886eff708 Merge branch 'feat/group-g-dns-finish-v3'
Group G DNS finish (v3): #110 (Dynadot multi-domain table-driven tests),
#112 (catalyst-dns httptest-mocked Dynadot coverage), #113 (cert-manager
LE DNS-01 + HTTP-01 ClusterIssuer templates with operator runbook for
the cert-manager-dynadot-webhook gap).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-28 19:45:35 +02:00
hatiyildiz
dd8b16f0c5 merge: feat/group-i-success-state-126-v2 — Group I StepSuccess (#126)
Adds wizard StepSuccess terminal step with sovereign console URL,
first-time admin login flow, kubeconfig download, voucher CTA, SSE log
tail, and docs link. All URLs derived from wizard state — never
hardcoded. 16 / 16 vitest tests green; tsc -b --noEmit clean.

Closes #126.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-28 19:44:59 +02:00
hatiyildiz
97e942e0bc feat(cert-manager): #113 — Let's Encrypt DNS-01 + HTTP-01 ClusterIssuers
Adds platform/cert-manager/chart/templates/clusterissuer-letsencrypt-dns01.yaml
with two ClusterIssuers, both Catalyst-curated, rendered conditionally
from values.yaml:

- letsencrypt-dns01-prod (TARGET STATE, default disabled) — ACME DNS-01
  via the cert-manager webhook solver, pointing at a future
  `cert-manager-dynadot-webhook` Catalyst binary that will implement the
  webhook.acme.cert-manager.io/v1alpha1 contract against the existing
  internal/dynadot/ package. Shipping the issuer template ahead of the
  webhook so cluster overlays only need a values flip + secret ref —
  no template edits — once the webhook lands.

- letsencrypt-http01-prod (INTERIM, default enabled) — ACME HTTP-01
  via the cilium ingress class. Issues certs for the explicit hostnames
  (console, gitea, harbor, admin, api) but NOT for wildcards; the
  canonical *.<sub>.<domain> record needs DNS-01.

Header comment explains the gap: the Catalyst external-dns webhook
(products/catalyst/bootstrap/api/cmd/external-dns-dynadot-webhook/)
implements a DIFFERENT RPC contract (records.list/add/delete) than what
cert-manager DNS-01 expects (Present/CleanUp on ChallengeRequest CRD),
so it cannot be reused; a dedicated cmd/cert-manager-dynadot-webhook/
must be built. Operator runbook for cutover is in the file header.

values.yaml gains a `certManager.issuers.{email,acmeServer,dns01,http01}`
section so all knobs are runtime-configurable per
docs/INVIOLABLE-PRINCIPLES.md #4 (never hardcode); cluster overlays in
clusters/<sovereign>/ can flip dns01.enabled via the bp-catalyst-platform
umbrella's values without rebuilding the Blueprint OCI artifact.

blueprint.yaml gains a spec.outputs section advertising:
- issuerName: letsencrypt-http01-prod (default)
- wildcardIssuerName: letsencrypt-dns01-prod (target state)
- issuerKind: ClusterIssuer

so dependent Blueprints (cilium-gateway, harbor, gitea) can consume the
issuer name without hardcoding it.

Closes #113.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-28 19:44:56 +02:00
hatiyildiz
7af848e2bd feat(catalyst-bootstrap): #126 add wizard StepSuccess terminal state
Group I — closes the success-state UX gap after the 11-phase
bootstrap kit finishes green:

  - Primary CTA opens https://console.<sovereign-fqdn>/ — domain
    derived from wizard state (resolveSovereignDomain) or from the
    catalyst-api `done` event payload (lastProvisionResult.consoleURL).
    No hardcoded URLs (Inviolable-Principle #4).
  - First-time admin login: username = admin@<sovereign-fqdn>; the
    "Mint one-time login URL" button calls
    GET /api/v1/deployments/<id>/admin-login-url and falls back to a
    documented Keycloak realm-master + reset-password flow when the
    endpoint returns 404/501 (RUNBOOK-PROVISIONING.md §First login).
  - kubeconfig download fetches /api/v1/deployments/<id>/kubeconfig,
    falls back to "Coming soon — fetch via SSH" + runbook link when
    the endpoint isn't implemented.
  - Voucher-issuance shortcut (secondary CTA) →
    https://admin.<sovereign-fqdn>/billing/vouchers/new
  - SSE final-state log tail (last 20 lines) collapsed/expandable.
  - Sovereign /docs link as second tile next to voucher CTA.
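
A minimal sketch of deriving the CTA targets above from a single FQDN so that no hostname is hardcoded. The function name sovereignLinks and its return shape are assumptions; only the URL and username patterns come from this commit message:

```typescript
// Hypothetical helper; StepSuccess may structure this differently.
interface SovereignLinks {
  consoleUrl: string;
  voucherUrl: string;
  adminUsername: string;
}

function sovereignLinks(sovereignFqdn: string): SovereignLinks {
  return {
    // Primary CTA: https://console.<sovereign-fqdn>/
    consoleUrl: `https://console.${sovereignFqdn}/`,
    // Secondary CTA: https://admin.<sovereign-fqdn>/billing/vouchers/new
    voucherUrl: `https://admin.${sovereignFqdn}/billing/vouchers/new`,
    // First-time admin login: admin@<sovereign-fqdn>
    adminUsername: `admin@${sovereignFqdn}`,
  };
}
```

Switching the input to a different FQDN changes every link together, which is the property the href assertions in the test suite check for.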

Wires StepSuccess as the 7th step in WizardPage.STEPS so the wizard's
existing currentStep navigation can land on it once provisioning
completes (lastProvisionResult populated by StepProvisioning's `done`
SSE event handler — to be wired in a separate ticket).

Test coverage (vitest + @testing-library/react, 16 cases): every CTA's
href is asserted against a fixture FQDN, including a BYO domain switch
to prove no hardcoded hostname leaks. Adds devDeps vitest, jsdom,
@testing-library/react, @testing-library/jest-dom, plus npm scripts
test/test:watch/typecheck.

Files:
  products/catalyst/bootstrap/ui/src/pages/wizard/steps/StepSuccess.tsx (new)
  products/catalyst/bootstrap/ui/src/pages/wizard/steps/StepSuccess.test.tsx (new)
  products/catalyst/bootstrap/ui/src/pages/wizard/WizardPage.tsx
  products/catalyst/bootstrap/ui/vite.config.ts (vitest config)
  products/catalyst/bootstrap/ui/package.json (test scripts + devDeps)

Verification:
  npm run typecheck  → green
  npm run test       → 16 / 16 pass

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-28 19:44:31 +02:00
hatiyildiz
77a3014f74 fix(workflow): blueprint-release supports products/ tree on workflow_dispatch
Adds a `tree` input (default `platform`) so manual triggers can build
umbrella charts under products/ — e.g.
  gh workflow run blueprint-release.yaml -f blueprint=catalyst -f tree=products
will dispatch a build of products/catalyst/chart.

Push-triggered builds already detect both platform/* and products/* via
the diff filter; this only fixes the workflow_dispatch path which was
hardcoded to platform/.
2026-04-28 19:43:47 +02:00
hatiyildiz
8643b0fb9e Merge branch 'feat/bp-external-dns-leaf-chart'
Authors the bp-external-dns leaf chart so the umbrella bp-catalyst-platform's
dependency block (11 leaves) resolves — closes the Group F gap that surfaced
in workflow run 25068433765.
2026-04-28 19:42:30 +02:00
hatiyildiz
c07e0ad1ee feat(external-dns): #109 — author bp-external-dns leaf chart for OCI publish
The bp-catalyst-platform umbrella (issue #104) declares a dependency on
bp-external-dns:1.0.0 — but the chart didn't exist; only README + Dynadot
multi-domain policy lived under platform/external-dns/. Without this leaf
the umbrella's `helm dependency build` fails (verified in run 25068433765).

This commit authors the minimal target-state leaf:
- Chart.yaml: name=bp-external-dns, version=1.0.0
- values.yaml: catalystBlueprint.upstream metadata (external-dns 1.15.0
  from kubernetes-sigs/external-dns Helm repo) + Catalyst-curated values
  overlay (sources, txtOwnerId, ServiceMonitor, RBAC, resources)

Per BLUEPRINT-AUTHORING.md §3, leaf charts are pure values-overlay wrappers:
no templates dir, just Chart.yaml + values.yaml with the catalystBlueprint
metadata block read by the bootstrap-kit installer at helm-install time.

Per-Sovereign provider/zone/credential overrides are overlaid by the
Crossplane Composition that materializes the HelmRelease — keeping this
chart provider-agnostic (no hardcoded Cloudflare/Dynadot/Hetzner choice
per INVIOLABLE-PRINCIPLES.md §4).

After this lands, blueprint-release.yaml will publish
ghcr.io/openova-io/bp-external-dns:1.0.0 and the next umbrella push will
resolve all 11 leaf deps successfully.
2026-04-28 19:42:23 +02:00
hatiyildiz
dc3a2b306e test(catalyst-dns): #112 — provisioning DNS write coverage with mocked Dynadot
Refactors catalyst-dns/main.go to expose a testable run() core (validate +
AddSovereignRecords loop) so the binary can be exercised against an
httptest.Server without touching api.dynadot.com.

Adds main_test.go with six scenarios:

- TestRun_WritesSixCanonicalARecords — the headline assertion: a single
  invocation produces exactly six POSTs against the mocked Dynadot
  endpoint, one per canonical subdomain (*.<sub>, console, gitea, harbor,
  admin, api), all A records pointing at the LB IP, all carrying
  add_dns_to_current_setting=yes.
- TestRun_NeverWipesZone — strict regression guard for the cardinal
  rule from feedback_dynadot_dns.md (a single missing flag wipes the
  zone). Asserted on every iteration of the loop.
- TestRun_ValidationErrors — table-driven coverage of every input
  contract failure (missing key/secret/domain/subdomain/lb-ip,
  unmanaged-domain rejection); zero Dynadot calls happen on validation
  failure so the OpenTofu module gets a deterministic fast-fail.
- TestRun_FailsFastOnDynadotError — when Dynadot rejects the first
  record, run() returns immediately rather than leaving a partial zone.
- TestRun_NeverHitsRealDynadot — paranoia guard proving the rewrite
  transport is in place; a guarded transport refuses any non-loopback
  host so a regression in the rewrite would surface immediately.
- TestReadInputsFromEnv — env-var contract coverage.

Per docs/INVIOLABLE-PRINCIPLES.md #2 (no compromise on quality), the HTTP
client, URL encoding, and JSON parsing are real package code paths;
only the upstream Dynadot endpoint is substituted with httptest.Server.
Hitting the real api.dynadot.com would write real records and burn real
quota every CI run, which is exactly the failure the never-mock
principle is designed to prevent in this case.

Closes #112.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-28 19:41:31 +02:00
hatiyildiz
a1673af401 Merge branch 'feat/group-f-umbrella-chart-fix-v2'
Group F: bp-catalyst-platform umbrella chart (#104) + 11th OCI artifact (#107).
Renames products/catalyst/chart from `catalyst-platform` to `bp-catalyst-platform`,
bumps to 1.0.1, declares dependencies on the 11 leaf Blueprints.
Workflow blueprint-release.yaml now reads chart name from Chart.yaml instead of
deriving from folder basename, and adds helm registry login for OCI deps.

Disclosed in commit 497643a: bp-external-dns:1.0.0 dep is declared but not
yet published — gates on issue #109.
2026-04-28 19:40:20 +02:00
hatiyildiz
497643a4bf fix(catalyst): #104 #107 — bp-catalyst-platform umbrella chart with 11 leaf deps
Issue #104: products/catalyst/chart/Chart.yaml had `name: catalyst-platform`
(missing the `bp-` prefix required by BLUEPRINT-AUTHORING.md §3) and no
`dependencies:` block. The Catalyst umbrella must depend on the 11 bootstrap-kit
leaf Blueprints so a single Flux HelmRelease at the umbrella OCI ref pulls in
the full Catalyst-Zero control plane.

Issue #107: bp-catalyst-platform was the missing 11th OCI artifact at
ghcr.io/openova-io. With this fix, blueprint-release.yaml will publish
ghcr.io/openova-io/bp-catalyst-platform:1.0.1 on push.

Changes:
- Rename chart to `bp-catalyst-platform`, bump version 1.0.0 -> 1.0.1
- Add `dependencies:` block listing all 11 leaves
  (cilium, cert-manager, flux, crossplane, sealed-secrets, spire,
   nats-jetstream, openbao, keycloak, gitea, external-dns), each
  pinned to 1.0.0 at oci://ghcr.io/openova-io
- Workflow blueprint-release.yaml: read chart name from Chart.yaml `name:`
  field instead of deriving `bp-<basename>` from the folder. The umbrella
  folder is `catalyst` but the chart name is `bp-catalyst-platform` —
  basename-derivation is wrong for any chart whose name doesn't equal
  `bp-<folder>`. Removes the implicit `bp-` prefix in the push step;
  Chart.yaml carries the full canonical name.
- Workflow: add `helm registry login ghcr.io` step before `helm dependency
  build` so OCI-hosted leaf deps resolve. The pre-existing docker login
  is for cosign/syft only; helm has its own auth store.

Disclosure (per INVIOLABLE-PRINCIPLES.md §8):
- bp-external-dns:1.0.0 is listed as a dependency but is not yet published;
  platform/external-dns/ has README + policies but no chart/ dir (issue #109
  scope). The umbrella build will fail on `helm dependency build` until #109
  authors the chart and publishes bp-external-dns:1.0.0. The dependency is
  declared anyway because the target-state contract per #104 is exactly 11
  leaves — partial declaration would be a quality compromise (principle #2).

Verified leaf chart names (platform/<x>/chart/Chart.yaml, all `bp-<x>`):
  cilium, cert-manager, flux, crossplane, sealed-secrets, spire,
  nats-jetstream, openbao, keycloak, gitea — all match.
Verified published OCI tags (10/11 at ghcr.io/openova-io/bp-<name>:1.0.0).
2026-04-28 19:39:48 +02:00
hatiyildiz
7fd24fb1c1 test(dynadot): #110 — add table-driven multi-domain ManagedDomains test matrix
Augments the existing #108-landed test suite with:
- TestManagedDomains_TableDriven — a single matrix asserting all seven
  resolution-order scenarios (canonical multi, whitespace-separated,
  case-insensitive, whitespace-trimmed query, legacy fallback, defaults
  fallback, canonical-precedence-over-legacy) in one place.
- TestAddSovereignRecords_AllUseAddDNSToCurrentSetting — explicit
  regression guard that EVERY one of the six AddSovereignRecords loop
  iterations carries add_dns_to_current_setting=yes (per
  feedback_dynadot_dns.md: a single missing flag wipes the zone).

The dynadot.go client itself was already complete after #108/921eabd —
ManagedDomains() reads DYNADOT_MANAGED_DOMAINS canonical, falls back to
DYNADOT_DOMAIN legacy single-value, then to built-in defaults. This
commit adds the consolidated table-driven coverage requested for #110.

Closes #110.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-28 19:37:49 +02:00
hatiyildiz
4554bd6d5d feat(dod): #149-#157 — Group M DoD scaffolding (DEMO-RUNBOOK + dod_test.go + dod.yaml)
Manual-dispatch-only DoD scaffolding for the omantel.omani.works
end-to-end test. Operator-gated; the test t.Skip()s when
HETZNER_TEST_TOKEN env var is missing so CI stays green.

- docs/DEMO-RUNBOOK.md: 9-step operator runbook covering Group C
  cutover, wizard provision, voucher issuance, tenant redemption.
- tests/dod/dod_test.go: HTTP-driven E2E that streams SSE through
  all 11 phases, asserts cert + DNS + voucher + redemption flow.
- .github/workflows/dod.yaml: workflow_dispatch only — never
  on-push (Hetzner cost gating).

Cherry-picked additive files from /tmp/agent-group-m-dod (a40b495);
the agent's branch had stale-base deletions of #108/#109/Pass-107
that we drop.
2026-04-28 19:34:46 +02:00
e3mrah
c3d6385974 provision: deploy tenant bakkal (plan: m, apps: 5) 2026-04-28 21:20:56 +04:00
hatiyildiz
e5bf5baab1 Merge branch 'docs/validation-log-pass-107' — Pass 107 audit-log entry 2026-04-28 14:52:38 +02:00
hatiyildiz
628b6a6bff docs(validation-log): pass 107 — Lessons #24/#25/#26 closures + waterfall completion snapshot
13 acceptance greps re-run on 14ff252; verdict NIRVANA. Cross-attests
Lesson #24 (bespoke Hetzner+helm-exec replaced with OpenTofu→Crossplane→Flux),
Lesson #25 (catalystBlueprint.upstream metadata block in all 10 G2 wrappers),
Lesson #26 (INVIOLABLE-PRINCIPLES.md anchored in 3 places). Records live
waterfall progress (~88%): A/B/D/F/H/I/J/L closed; C ready; E mostly closed;
K 7/8; G in-flight; M scaffolding. No new violations; no new lessons.
2026-04-28 14:51:50 +02:00
hatiyildiz
7d359668b3 fix(catalyst-api): #148 — eliminate race in CreateDeployment status read
Race detector caught a write/read race between the response writer's
read of dep.Status (line 101) and the runProvisioning goroutine's
mu-locked write at line 166. The reader doesn't take dep.mu, so
even though the goroutine writes under the lock the read isn't
synchronised. Capturing the status into a local before launching
the goroutine eliminates the race — the response carries the
known-just-set "provisioning" value verbatim.

Closes the recurring TestLoad_TenConcurrentDeploymentsAreIsolated
failure on cf60bd7, 333b859, f0fe300.
2026-04-28 14:49:02 +02:00