The universal `rgba(255,255,255,0.96)` tile from 691467b4 dropped
white-on-transparent brand marks (Temporal, LiveKit, Mimir, Tempo,
Velero, OpenBao …) into a blinding white pill — the user's "almost
nothing is visible" complaint.
Mirrors the SME marketplace's per-asset PNG approach
(https://marketplace.openova.io/apps/) with metadata-driven
backplates instead of universal chrome:
- new `logoTone.ts` classifies every vendored component logo as
`light` (white-glyph, needs slate-900 backplate) or `color`
(full-colour or dark-glyph, reads on slate-100). Both tones are
theme-independent, just as marketplace PNGs ship the same surface
regardless of card theme. Tones were validated empirically against
every asset under public/component-logos/ on five candidate
surfaces.
- StepComponents.tsx — `.corp-comp-logo` tile + IconFallback now
consume `getLogoToneStyle(entry.id)`.
- StepReview.tsx — ComponentMiniCard 40×40 tile + LetterFallback
same.
- MarketplaceFamilyPage.tsx — `.mp-related-logo` / `.mp-related-icon`
CSS rules now own geometry only; surface is per-asset inline
style.
- MarketplaceProductPage.tsx — `.mp-product-logo` /
`.mp-product-icon` same pattern on the 80×80 hero tile.
Per-component verification (dark + light wizard themes):
Temporal — light tone → slate-900 backplate, white logo crisp
Cilium — color tone → slate-100, full hexagon visible
Cert-manager — color tone → slate-100, blue badge readable
Grafana — color tone → slate-100, orange G readable
Strimzi — color tone → slate-100, dark mark visible
Keycloak — color tone → slate-100, color badge readable
FerretDB — color tone → slate-100, wordmark + glyph visible
Gates: tsc --noEmit clean · 149/149 vitest tests pass · vite build OK.
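A minimal sketch of the tone lookup described above, for illustration only. The id list is limited to the white-on-transparent marks named in this message, the hex values are the assumed Tailwind swatches for slate-900 and slate-100, and the helper shape is hypothetical rather than the shipped `logoTone.ts`.

```typescript
// Hypothetical sketch of a logoTone.ts-style lookup; not the shipped module.
type LogoTone = "light" | "color";

interface LogoToneStyle {
  background: string;
}

// Assumed ids: only the white-on-transparent marks named in the message.
const LIGHT_LOGO_IDS: ReadonlySet<string> = new Set([
  "temporal",
  "livekit",
  "mimir",
  "tempo",
  "velero",
  "openbao",
]);

// Assumed hex values for the slate-900 and slate-100 swatches.
const BACKPLATE: Record<LogoTone, string> = {
  light: "#0f172a",
  color: "#f1f5f9",
};

// Anything not explicitly listed as a white glyph falls back to "color".
function getLogoTone(id: string): LogoTone {
  return LIGHT_LOGO_IDS.has(id) ? "light" : "color";
}

// The style depends on the asset only, never on the active theme.
function getLogoToneStyle(id: string): LogoToneStyle {
  return { background: BACKPLATE[getLogoTone(id)] };
}
```

Because the result is keyed by asset id alone, the same tile surface comes back under both wizard themes, which is the theme-independence the commit relies on.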
The previous image bundled the infra/hetzner/ .tf sources but not the tofu
binary itself, so every Launch failed with:
tofu init: exec: "tofu": executable file not found in $PATH
Add a dedicated builder stage that downloads OpenTofu v1.11.6 from the
canonical GitHub release, verifies the SHA256 against the upstream
SHA256SUMS file before extraction, and ships the binary into the runtime
image at /usr/local/bin/tofu (mode 0755 so UID 65534 can exec it). The
stage branches on $TARGETARCH (amd64 / arm64) to keep multi-arch buildx
correct; both arch checksums are pinned as build args so version bumps
are an explicit two-line change.
Add a CI smoke step in catalyst-build.yaml's build-api job that runs
`tofu version` inside the freshly-built image and asserts the output
matches EXPECTED_TOFU_VERSION; failure fails the build. Also re-run with
`--user 65534:65534` to gate exec-as-non-root at build time. The prior
infra/hetzner/ presence smoke step is preserved unchanged.
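The version assertion in the smoke step can be sketched as below. This assumes the first line of `tofu version` has the form `OpenTofu v<major>.<minor>.<patch>`; the function names are hypothetical and the real workflow step may well be a shell one-liner instead.

```typescript
// Sketch only: assumes the first output line looks like "OpenTofu v1.11.6".
function parseTofuVersion(output: string): string | null {
  const firstLine = (output.split("\n")[0] ?? "").trim();
  const match = /^OpenTofu v(\d+\.\d+\.\d+)/.exec(firstLine);
  return match?.[1] ?? null;
}

// Accepts the expected version with or without a leading "v".
function tofuVersionMatches(output: string, expected: string): boolean {
  const actual = parseTofuVersion(output);
  return actual !== null && actual === expected.replace(/^v/, "");
}

// Throws so that a CI wrapper calling it exits non-zero on mismatch.
function assertTofuVersion(output: string, expected: string): void {
  if (!tofuVersionMatches(output, expected)) {
    throw new Error(`tofu version mismatch: expected ${expected}`);
  }
}
```

Running the same check a second time under `--user 65534:65534` exercises the exec bit as well as the version string.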
Sibling fix in ProvisionPage's FailureCard: the kubectl hint pointed at
namespace `catalyst-system`, but catalyst-api actually runs in namespace
`catalyst` (per chart/templates/api-deployment.yaml + live cluster).
Replace the namespace literal so the diagnostic command copy-pastes
correctly.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Provision page styled three surfaces with hardcoded
rgba(255,255,255,...) literals rather than the page's theme tokens.
The theme tokens (--s1, --md, --lo) already flip correctly under
.provision-shell[data-theme="light"], so any element painted with
the raw rgba was theme-locked to dark and washed out / invisible
against the light radial-gradient page background.
Three surfaces switched to tokens that already exist on the same
page and flip per-theme:
• DAG bubble label fill (pending state) — colour
rgba(255,255,255,0.45) → var(--lo)
Dark: --lo = rgba(255,255,255,0.40) (≈ same)
Light: --lo = #475569 (slate-600, readable on light bg)
• Live-log info-line text — color rgba(255,255,255,.78)
→ var(--md)
Dark: --md = rgba(255,255,255,0.65)
Light: --md = #334155 (readable on light log panel)
• Live-log meta pill + failure-card hint <code> background —
rgba(255,255,255,.04) → var(--s1)
Dark: --s1 = rgba(255,255,255,0.04) (unchanged)
Light: --s1 = #fff (lifted pill on slate page bg)
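The token values listed above can be written out as a small lookup to make the per-theme flip explicit. The values are copied from this message; the `resolveToken` helper is hypothetical, since the page itself resolves these through CSS custom properties.

```typescript
type Theme = "dark" | "light";
type Token = "--lo" | "--md" | "--s1";

// Values as listed in the commit message above.
const THEME_TOKENS: Record<Theme, Record<Token, string>> = {
  dark: {
    "--lo": "rgba(255,255,255,0.40)",
    "--md": "rgba(255,255,255,0.65)",
    "--s1": "rgba(255,255,255,0.04)",
  },
  light: {
    "--lo": "#475569",
    "--md": "#334155",
    "--s1": "#fff",
  },
};

// Hypothetical helper: what var(--token) would resolve to under a theme.
function resolveToken(theme: Theme, token: Token): string {
  return THEME_TOKENS[theme][token];
}
```

A hardcoded `rgba(255,255,255,…)` literal corresponds to always reading the `dark` row, which is why those surfaces washed out on the light page background.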
The wizard StepReview surfaces (Section / Field / RegionCard /
ComponentMiniCard) and the marketplace family/product pages were
already migrated off raw rgba in 4f6dd10a; logo TILES intentionally
keep rgba(255,255,255,0.96) per the documented contract in
StepComponents.tsx LOGO_TILE_BG (vendored brand marks render in
mixed treatments — dark glyphs designed for white backdrops, white
glyphs on transparent — and a near-white pill keeps every glyph
legible regardless of theme).
Verification:
• npx tsc --noEmit ✓
• npm run build ✓
• ./node_modules/.bin/vitest run — 149 passed (149) ✓
• Live wizard at /sovereign/wizard — every step's section
surfaces and card surfaces render with proper contrast in
BOTH dark and light themes; logo tiles still readable.
• Live marketplace at /sovereign/marketplace/family/cortex
and /sovereign/marketplace/product/axon — flat-section
layout intact, logo tiles crisp.
No layout, no test selectors, no router, no componentGroups.ts,
no providerSizes.ts changes.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The Components section on StepReview rendered both a family-summary
mini-card grid (PILOT M5 / SPINE M5 R1 O1 / …) AND a per-component
card grid below. The summary was a duplicate read of the same data —
each per-component card already shows its family chip, so the strip
above counted what the cards already display. Drop it.
The per-component cards themselves were tiny `auto-fill,
minmax(180px, 1fr)` chips with logo + name + tier letter + family
chip. Replace with a pixel-mirror of the canonical `.stack-card` on
https://marketplace.openova.io/review/ — same horizontal flex
layout, 40×40 logo tile, semibold name, low-key category pill, and
single-line description. Tokens map 1:1 (light theme):
marketplace `--color-bg` → wizard `--wiz-bg-input`
marketplace `--color-border` → wizard `--wiz-border`
marketplace `--color-text-strong` → wizard `--wiz-text-hi`
marketplace `--color-text-dim` → wizard `--wiz-text-md` (desc),
`--wiz-text-sub` (cat)
Card geometry verified pixel-identical to marketplace at 1440px
width: padding 10.4px, gap 10.4px, border-radius 8px, card height
66.078125px, 2-column grid with 8px gap collapsing to 1 column under
700px. Tier (M/R/O) intentionally dropped — not on the canonical
card; the Components step before review already enforces tier
semantics. The legend below the grid goes with it.
Section + Field shells switched from `--wiz-bg-xs` to `--wiz-bg-sub`
so the card surfaces lift visibly off the section background in
light mode — the previous near-white tint was the same colour as the
cards, so cards visually melted into the section ("white-on-white").
Verification:
• npx tsc --noEmit ✓
• npm run build ✓
• ./node_modules/.bin/vitest run — 149 passed (149) ✓
• Live wizard at /sovereign/wizard step 7 — components section
renders 2-col grid of stack-card-shaped components, no family
summary, no tier legend, computed CSS matches marketplace.
POST body to /v1/deployments unchanged. componentGroups.ts,
provider/topology cards, router.tsx untouched.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The catalyst-api Pod is the OpenTofu runner — provisioner.New() reads
CATALYST_TOFU_MODULE_PATH (default /infra/hetzner) and stageModule()
copies the canonical .tf / .tftpl files into a per-deployment workdir
on every Launch. The previous Containerfile did not COPY the module
in, so every Launch failed:
{"level":"ERROR","msg":"provision failed",
"err":"stage tofu module: open /infra/hetzner: no such file or directory"}
Containerfile changes
- Build context is now the public openova repo root (Containerfile
paths COPY from products/catalyst/bootstrap/api/ explicitly).
- New `COPY infra/hetzner/ /infra/hetzner/` brings the FULL tree
(main.tf, variables.tf, outputs.tf, versions.tf, cloudinit-*.tftpl,
README.md) into the runtime image. The path /infra/hetzner/ matches
provisioner.New()'s default and the catalyst-platform Helm chart's
CATALYST_TOFU_MODULE_PATH override.
Workflow changes (.github/workflows/catalyst-build.yaml, build-api job)
- context: openova-src/products/catalyst/bootstrap/api -> openova-src
(the repo root is needed so infra/hetzner/ is in the build context).
- Split build into Build (load: true) + Smoke + Push, mirroring the UI
job pattern. The smoke step runs `ls -la /infra/hetzner/` inside the
built image and asserts main.tf, variables.tf, outputs.tf, versions.tf,
and both cloudinit-*.tftpl files are present. Failure fails the build
— broken images can no longer ship.
Verification (local)
- go vet ./... + go test ./... in products/catalyst/bootstrap/api: clean
- docker build -f products/catalyst/bootstrap/api/Containerfile . at the
repo root succeeds; `docker run --rm --entrypoint sh catalyst-api:test
-c 'ls -la /infra/hetzner/'` lists main.tf, variables.tf, outputs.tf,
versions.tf, cloudinit-control-plane.tftpl, cloudinit-worker.tftpl.
provisioner.go business logic untouched. catalyst-platform Helm chart
api-deployment.yaml untouched (CATALYST_TOFU_MODULE_PATH already aligns
with /infra/hetzner).
The /provision/ route is registered against the router's
internal path; '/sovereign' is the basepath, stripped before matching.
The 'from: "/sovereign/provision/$deploymentId"' lookup matched no
route at runtime — TanStack Router throws 'Invariant failed' for any
useParams call against an unknown route id. A cast was hiding the type
error.
This unblocks the SPA route — /sovereign/provision/<id> now renders the
ProvisionPage without throwing.
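To illustrate why the lookup id must omit the basepath, here is a hypothetical helper that derives the internal route path from a browser-visible path. It is not TanStack Router code, only a sketch of the stripping described above.

```typescript
// Hypothetical helper: derive the router-internal path from a full path.
function stripBasepath(fullPath: string, basepath: string): string {
  const base = basepath.replace(/\/+$/, ""); // tolerate a trailing slash
  if (base !== "" && (fullPath === base || fullPath.startsWith(base + "/"))) {
    const rest = fullPath.slice(base.length);
    return rest === "" ? "/" : rest;
  }
  return fullPath; // not under the basepath: leave untouched
}

// The id to pass as `from` is the internal path, not the visible URL.
const routeId = stripBasepath("/sovereign/provision/$deploymentId", "/sovereign");
```

Here `routeId` is `/provision/$deploymentId`, the form the route was registered under.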
The provision page was a 1198-line static public/provision.html artefact
plus sibling provision.js and catalog.js files. The .html URL was the
visible give-away that the page wasn't first-class — it was rendered
outside the React app, did not share design tokens, did not get bundled,
and could not consume the wizard's zustand store directly. The result
was a page that displayed "omantel.omani-works · SOLO · 0 components ·
Failed" with no actionable detail when something went wrong.
This commit deletes all three static artefacts and ships a real SPA
route at `/sovereign/provision/$deploymentId` instead. Same DAG visual,
same EventSource wiring, same phase→bubble state machine — but as a
React component that:
- reads the deploymentId from URL params (deep-linkable, refresh-safe)
- reads selectedComponents + topology from useWizardStore directly
- resolves the FQDN via resolveSovereignDomain(store) — fixes the
"omantel.omani-works" hyphen bug; the page now shows "omantel.omani.works"
- renders a real FailureCard when SSE surfaces status="failed", carrying
the deployment's actual error message + Retry / Back-to-wizard CTAs
- handles 404 / EventSource error with a clean retry surface
Wiring:
- New /sovereign/provision/$deploymentId route in router.tsx
- StepReview's provision() callback now navigates via router.navigate
instead of window.location.href = path('provision.html')
- BOOTSTRAP_KIT export added to catalog.generated.ts (read from
clusters/_template/bootstrap-kit/ at build time, ordered by NN- prefix)
so the React route can import the same source-of-truth the deleted
catalog.js used to surface as window.CATALYST_CATALOG
- emitPublicCatalog() removed from build-catalog.mjs — no static page
consumes it any more
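The "ordered by NN- prefix" rule for BOOTSTRAP_KIT can be sketched as a numeric sort on the leading digits. The entry names in the usage line are made up for illustration, and the real build script may order entries differently in edge cases such as missing prefixes.

```typescript
// Sketch: order bootstrap-kit entries by their leading numeric "NN-" prefix.
function orderByNumericPrefix(names: readonly string[]): string[] {
  const prefix = (name: string): number => {
    const m = /^(\d+)-/.exec(name);
    // Entries without a prefix sort last (an assumption, not from the source).
    return m ? Number(m[1]) : Number.POSITIVE_INFINITY;
  };
  return [...names].sort(
    (a, b) => prefix(a) - prefix(b) || a.localeCompare(b),
  );
}

// Hypothetical entry names, for illustration only.
const ordered = orderByNumericPrefix(["10-third", "02-second", "01-first"]);
```

Comparing the parsed number rather than the raw string keeps the order stable even if a prefix is ever written without zero padding.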
Files deleted:
- public/provision.html
- public/provision.js
- public/catalog.js
Files added:
- src/pages/provision/ProvisionPage.tsx (1300+ lines: catalog read,
expandWithDependencies, buildNodes, buildEdges, computeLayout,
applyEvent state machine, sidebar, log panel, failure card, status
pill)
Verified: tsc clean, 149/149 vitest tests pass.
Review page packs small fields/cards in horizontal rows instead of stacking
them top-to-bottom. The Components section now renders every selected
component as its own mini-card (logo + name + family chip + tier) so the
operator sees exactly what will be installed, not just family-level
counts. Reduced section padding and dropped redundant whitespace between
rows so the review fits a typical viewport without scrolling.
The provision()-to-/v1/deployments POST body is unchanged — visual only.
Component-logos vendored under public/component-logos/ are upstream brand
marks rendered as-shipped — some are dark glyphs designed for white
backdrops, some are white glyphs on transparent (designed for dark
surfaces), some are full-colour. The previous tile (rgba(255,255,255,0.04)
with the icon-fallback using oklch hue rotation) made dark glyphs invisible
in dark mode and white glyphs invisible against the dim tile. Worse, the
contrast story was inconsistent across surfaces — the wizard cards, the
review page, and the marketplace family/product pages each picked their
own background.
This commit pins ONE tile contract used in every place a component logo
renders:
- background: rgba(255,255,255,0.96) (near-white pill, theme-independent)
- border-radius: 10px
- 1px outer border in --wiz-border-sub so the tile doesn't fight the card
- 6px internal padding so tight square SVGs aren't cropped
- IconFallback letter colour pinned to fixed slate (#0f172a) so the letter
reads against the white tile in BOTH dark- and light-mode themes
(--wiz-text-hi flips with the theme and would white-out in dark mode)
Files updated:
- StepComponents.tsx — .corp-comp-logo + IconFallback
- MarketplaceFamilyPage.tsx — .mp-related-logo + .mp-related-icon
- MarketplaceProductPage.tsx — .mp-product-logo + .mp-product-icon
Verified by toggling dark/light theme and walking the wizard +
marketplace pages — every brand mark legible regardless of glyph palette
or theme.
The wizard component cards were copying the SME marketplace's
`app-body { padding-right: 72px }` pattern, which reserves the right
quarter of every card for an absolute-positioned hover-only round Add
button. Combined with one- to three-word `desc` strings, every card
showed a name, a chip line, a single half-line of description, and a
visually empty right column — a quarter of valuable space wasted.
This change restructures the cards around a rigid 4-line grid that
spans the FULL body width:
Line 1 — name (left, flex) + family chip + inline toggle (right)
Line 2 — description line 1 (full width)
Line 3 — description line 2 (full width, two-line clamp)
Line 4 — tier chip + dependency chips + SELECTED dot (right)
Chips appear ONLY on line 1 or line 4, never on lines 2-3. The
`.corp-comp-body` no longer reserves any horizontal padding for
overlay buttons; descriptions use the entire body column.
The toggle affordance is relocated from an absolute-positioned 32×32
overlay (top-right of the card, opacity-0 until hover) to an inline
22×22 round button at the trailing edge of line 1, sharing the chip
row with the family chip. It still fades in on card hover and stays
visible when in-cart, but it occupies a single inline cell instead of
reserving a vertical column.
The bottom-right SELECTED text pill is replaced by a compact green
dot anchored to the right end of line 4. The card already conveys
selection through its green border, green-tinted background, and the
green ✓ toggle button on line 1; the loud text pill duplicated those
signals while crowding the dependency chips on cards with deps.
Every component description in `componentGroups.ts` is rewritten as a
6-10 word professional sentence-fragment distilled from the long-form
`COMPONENT_COPY.positioning` text in `marketplaceCopy.ts`. Same voice:
factual, technical, terse — no hype, no forbidden vocabulary.
Five before/after samples:
flux: "GitOps delivery engine" → "GitOps reconciler driving every Sovereign cluster from Git"
cilium: "CNI & eBPF service mesh" → "eBPF CNI and service mesh with kernel-level policy"
cert-manager:"TLS certificate automation" → "Automated TLS issuance and rotation for every ingress"
grafana: "Dashboards & alerting" → "Curated dashboards across metrics, logs, and traces"
langfuse: "LLM observability & tracing" → "Prompt, completion, and cost tracing for the AI plane"
All 63 component descriptions verified within 6-10 words; no
forbidden vocabulary ("MVP", "for now", "stub", "iterative", "demo");
no marketing fluff. CSS changes preserve the canonical 108px resting
height; tablet/mobile responsive floor unchanged. All 149 vitest
specs continue to pass; existing data-testid selectors
(`toggle-<id>`, `family-chip-<id>`, `tier-<id>`, `selected-<id>`,
`deps-<id>-<dep>`, `includes-<id>`, `component-card-<id>`) are
preserved unchanged.
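The description check (6-10 words, no forbidden vocabulary) could look roughly like the sketch below. The word-splitting rule and case-insensitive whole-word matching are assumptions; the commit does not say how the verification was actually performed.

```typescript
// Forbidden terms as listed in the commit message.
const FORBIDDEN_TERMS = ["MVP", "for now", "stub", "iterative", "demo"];

// Words are whitespace-separated tokens; "kernel-level" counts as one word.
function wordCount(text: string): number {
  return text.trim().split(/\s+/).filter((w) => w.length > 0).length;
}

function escapeRegExp(text: string): string {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Returns a list of problems; an empty list means the description passes.
function descriptionProblems(desc: string): string[] {
  const problems: string[] = [];
  const count = wordCount(desc);
  if (count < 6 || count > 10) {
    problems.push(`word count ${count} outside 6-10`);
  }
  for (const term of FORBIDDEN_TERMS) {
    const pattern = new RegExp("\\b" + escapeRegExp(term) + "\\b", "i");
    if (pattern.test(desc)) {
      problems.push(`forbidden term "${term}"`);
    }
  }
  return problems;
}
```

Under these rules the old three-word `flux` description fails on length, while the five rewritten samples above pass.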
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The wizard step order was inverted: it asked for the provider before the
topology, then put hetzner-only SKUs inside the topology step. Topology
decides how many regions exist; provider is a per-region property; SKU
vocabulary is per-provider (cx32 means nothing on Azure). This commit
fixes all three.
New step order (WIZARD_STEPS + WizardPage STEPS): Org -> Topology ->
Provider -> Credentials -> Components -> Domain -> Review.
Per-provider SKU catalog at products/catalyst/bootstrap/ui/src/shared/
constants/providerSizes.ts replaces the legacy hetzner-only HETZNER_NODE_SIZES.
Five providers (hetzner, huawei, oci, aws, azure), each with realistic SKU
options drawn from that vendor's native instance-type vocabulary. Every
SKU read in the wizard goes through PROVIDER_NODE_SIZES[provider] -- no
SKU literal lives anywhere else.
StepProvider now renders one card per topology slot. Each card carries:
provider chooser, that provider's region picker, that provider's
control-plane SKU, that provider's worker SKU + count. Cost rollup sums
each region's (cp + worker*count) at its OWN provider's pricing, so a
mixed-cloud topology computes correctly.
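The rollup rule, each region priced as (cp + worker × count) against its own provider, can be sketched as follows. The prices and SKU ids in the usage example are placeholders rather than real vendor pricing, and the type names are hypothetical.

```typescript
// provider -> SKU id -> monthly price. Placeholder numbers only.
type PriceCatalog = Record<string, Record<string, number>>;

interface RegionSelection {
  provider: string;
  controlPlaneSize: string;
  workerSize: string | null; // null when workerCount is 0
  workerCount: number;
}

function skuPrice(catalog: PriceCatalog, provider: string, sku: string): number {
  const price = catalog[provider]?.[sku];
  if (price === undefined) {
    throw new Error(`unknown SKU ${sku} for provider ${provider}`);
  }
  return price;
}

// One region: control plane plus workers, priced by that region's provider.
function regionCost(catalog: PriceCatalog, r: RegionSelection): number {
  const cp = skuPrice(catalog, r.provider, r.controlPlaneSize);
  const workers =
    r.workerCount > 0 && r.workerSize !== null
      ? skuPrice(catalog, r.provider, r.workerSize) * r.workerCount
      : 0;
  return cp + workers;
}

function totalCost(catalog: PriceCatalog, regions: readonly RegionSelection[]): number {
  return regions.reduce((sum, r) => sum + regionCost(catalog, r), 0);
}

// Illustrative mixed-cloud topology with made-up prices.
const demoCatalog: PriceCatalog = {
  hetzner: { cx32: 10, cx42: 20 },
  aws: { "t3.large": 60 },
};
const demoTotal = totalCost(demoCatalog, [
  { provider: "hetzner", controlPlaneSize: "cx32", workerSize: "cx42", workerCount: 3 },
  { provider: "aws", controlPlaneSize: "t3.large", workerSize: "t3.large", workerCount: 2 },
]);
```

Looking each SKU up under its own provider means a hetzner id used in an aws region raises an error instead of silently pricing at zero.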
StepTopology drops the SkuCard + NodeSizingPanel; it now captures only
the topology template, HA flag, and AIR-GAP add-on.
Per-region store fields (regionControlPlaneSizes, regionWorkerSizes,
regionWorkerCounts) replace the singular controlPlaneSize/workerSize/
workerCount as the canonical shape. Migration in store.merge() hydrates
the arrays from any persisted singular fields; the cx22 legacy default
is treated as "no selection" so a hetzner-only id never leaks into a
non-hetzner region.
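A rough sketch of the hydration step, under two assumptions not stated in the message: the singular values are copied into every region slot, and a missing worker count defaults to 0. The interface and function names are hypothetical; only the per-region field names come from the commit.

```typescript
interface LegacySizing {
  controlPlaneSize?: string;
  workerSize?: string;
  workerCount?: number;
}

interface RegionSizing {
  regionControlPlaneSizes: (string | null)[];
  regionWorkerSizes: (string | null)[];
  regionWorkerCounts: number[];
}

// The old hetzner-only default is treated as "no selection".
const LEGACY_DEFAULT_SKU = "cx22";

function hydrateRegionSizing(persisted: LegacySizing, regionCount: number): RegionSizing {
  const normalise = (sku: string | undefined): string | null =>
    sku === undefined || sku === LEGACY_DEFAULT_SKU ? null : sku;

  const cp = normalise(persisted.controlPlaneSize);
  const worker = normalise(persisted.workerSize);
  const count = persisted.workerCount ?? 0; // assumed default

  return {
    regionControlPlaneSizes: Array.from({ length: regionCount }, () => cp),
    regionWorkerSizes: Array.from({ length: regionCount }, () => worker),
    regionWorkerCounts: Array.from({ length: regionCount }, () => count),
  };
}
```

Mapping `cx22` to `null` forces an explicit choice in the Provider step instead of carrying a hetzner id into a region that may end up on another provider.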
Backend Request gains an optional Regions []RegionSpec field. Validate
mirrors Regions[0] into the legacy singular fields for the existing
solo-Hetzner writeTfvars path. infra/hetzner/variables.tf accepts the
list-of-objects shape; the for_each iteration that activates the rest
of the regions is the multi-region tofu wiring follow-up. The structure
is in place for that follow-up; no data shape was compromised to get here.
Dead code removed: StepInfrastructure and shared/constants/hetzner.ts
(both orphaned, contained the only HETZNER_NODE_SIZES reference outside
the catalog).
Gates: tsc --noEmit, vite build, vitest (149 tests), go vet, go test
(provisioner + handler).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The d3346441 family-chip refactor bumped the wizard component-card
height from 108px to 130px and added an always-visible "Add / Selected"
pill button at bottom-right with a 1.85rem padding-bottom carved into
the card body. That broke the documented "pixel-match SME marketplace"
contract — the corporate Catalyst wizard cards no longer matched
https://marketplace.openova.io/apps/.
Restore the canonical SME marketplace card surface:
- card height: 130px → 108px (read-only path stays 108px, no special-case)
- body padding: 0.4rem right + 1.85rem bottom → 4.5rem right (no bottom)
- replace the bottom-right "Select / Selected" pill button with the
canonical 32×32 round icon button at top-right (Plus → Check),
opacity 0 by default, opacity 1 on card hover, always visible
when in-cart (mirrors AppsStep.svelte .app-add-btn 1:1)
- re-introduce the bottom-right SELECTED status pill (only when
in-cart) — mirrors AppsStep.svelte .status-corner / .s-selected
- render dependencies as one chip per dep ("+ DepName"), matching
AppsStep's chip-dep pattern (replaces the single deps-count chip
+ extra paragraph that forced the height bloat)
- keep the test/a11y `includes-<id>` paragraph but absolute-position
it off-screen (sr-only) so layout stays at 108px
Affordance reconciliation (no card-height growth):
- the entire card is now an anchor to /marketplace/product/<id>,
matching SME's `<a href="/app?slug=X" class="app-card">` wrapper
- the family chip nested inside is a `<Link>` to
/marketplace/family/<id> with stopPropagation
- the round +/✓ button stops propagation and toggles selection via
the wizard store (data-testid=`toggle-<id>` preserved for tests)
- all three navigation surfaces preserved: family chip → family
portfolio, card body → product detail, +/✓ button → wizard store
Read-only Tab 2 ("Always Included") path unchanged behaviourally —
renders as a plain `<div>` (not a `<Link>`) so it stays inert.
All 145 vitest cases pass (including the 89 in StepComponents.test.tsx).
TypeScript clean. Production vite build clean.
Refs: docs/INVIOLABLE-PRINCIPLES.md #2 (never compromise quality —
the SME marketplace IS the proven shape; do not diverge from it).
Agent 1 (#176 logos) sourced each component's official upstream brand
mark in whatever format the project itself publishes — most projects
ship SVG, but Grafana docs (loki/mimir/tempo), Aqua (trivy), Anchore
(syft-grype), the LangFuse repo, vLLM, Ntfy, FerretDB, OpenMeter,
Coraza, External-DNS, NetBird, and StrongSwan only publish PNG. The
old smoke test hard-asserted every spot-checked id resolved as
.svg, so the langfuse PNG broke the build.
Replaced the hardcoded extension loop with an explicit list of full
paths matching componentGroups.ts. Every entry mirrors the actual
logoUrl the wizard renders, so a missing or mis-named asset still
fails the build — but in lockstep with the data file, not against
a stale extension assumption.
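The explicit-path check can be sketched as a set difference between the paths the data file declares and the files actually present. The function name and the example paths are illustrative assumptions; the real smoke test reads from the filesystem and `componentGroups.ts`.

```typescript
// Sketch: report every declared logo path that has no matching file.
function missingLogoAssets(
  declaredPaths: readonly string[],
  presentFiles: ReadonlySet<string>,
): string[] {
  return declaredPaths.filter((path) => !presentFiles.has(path));
}

// Illustrative paths only: extensions are part of the declared path,
// so a PNG-only upstream mark is checked as .png, not assumed to be .svg.
const declared = ["component-logos/cilium.svg", "component-logos/langfuse.png"];
const present = new Set(["component-logos/cilium.svg", "component-logos/langfuse.png"]);
const missing = missingLogoAssets(declared, present);
```

An empty result means every declared `logoUrl` resolves; any entry in the list should fail the build.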
Every platform-component card now renders the OFFICIAL upstream brand
mark instead of a stylized OpenOva placeholder. Logos are sourced from
the CNCF artwork repo and each project's own repository:
Source          Components
────────────────────────────────────────────────────────────────────
cncf/artwork    cert-manager, cilium, cnpg
                (cloudnativepg), crossplane, envoy,
                external-secrets (eso), falco,
                flux, harbor, keda, keycloak,
                knative, kserve, kyverno, litmus,
                opentelemetry, opentofu, sigstore,
                strimzi, vpa (kubernetes)
Project repo    alloy, clickhouse, debezium,
                ferretdb, frpc, gitea, grafana,
                iceberg, kserve, langfuse,
                librechat, livekit, loki, matrix,
                milvus, mimir, neo4j, netbird,
                ntfy, openbao, openmeter,
                opensearch, reloader, seaweedfs,
                stalwart, strongswan, stunner,
                superset, syft-grype, temporal,
                tempo, trivy, valkey, vcluster,
                velero, vllm, flink, coraza
44 components ship as SVG; 14 components whose upstream publishes only
PNG marks (Loki, Mimir, Tempo, Trivy, NetBird, ntfy, OpenMeter, vLLM,
Coraza, FerretDB, Syft+Grype, External-DNS, strongSwan, LangFuse) ship
as `<id>.png` with an explicit `logoUrl` override.
Five components retain `logoUrl: null` (letter-mark fallback): the
existing PowerDNS plus BGE (a model-family identifier rather than a
branded product) and the OpenOva-internal Axon, Continuum, Specter
components whose brand marks are not yet finalized.
Card markup, `depends:`, and family flags are intentionally not
touched in this commit (handled by parallel agents).
Quality gates:
- npx tsc --noEmit green
- npm run build green
- vitest StepComponents.test 90/90 passed
Drop the 1100-line static-mock provision.html in favour of a runtime-
generated DAG keyed off the wizard's persisted localStorage state and the
build-time blueprint catalog. Bubbles, edges, sub-progress, log routing
and final CTA are all computed from real backend data.
What is now dynamic:
- Hardcoded NODES/TOPO/EDGES/LOGS arrays gone — DAG is built from
window.CATALYST_CATALOG (components + bootstrap-kit) and the wizard
selection at page load.
- One Hetzner-infra supernode and one Flux-bootstrap supernode anchor the
graph; bootstrap-kit Blueprints render in numeric install order; user
selection from selectedComponents (with transitive HARD deps expanded
via blueprint.depends) makes up the rest.
- EventSource wired to <BASE>api/v1/deployments/<id>/logs. Phase events
drive bubble state transitions (tofu-init|tofu-plan run Hetzner-infra
through 0→.30 progress; raw `tofu` lines parse hcloud_network/
hcloud_firewall/hcloud_server/hcloud_load_balancer markers to advance
the supernode's sub-progress; tofu-output finishes it; flux-bootstrap
opens the second supernode).
- Raw stdout/stderr lines stream into the live-log panel and the active
bubble's expandable detail (per-node accumulation, click any bubble to
read its slice).
- On `event: done` with status=ready, surface "Open Console →" CTA
pointing at result.consoleURL from the snapshot.
- Empty-state path renders a clean "no active deployment" view when the
page is hit without a wizard session or deploymentId in localStorage.
Build-catalog change:
- scripts/build-catalog.mjs now also emits public/catalog.js setting
window.CATALYST_CATALOG = { components, bootstrapKit }. bootstrapKit is
read from clusters/_template/bootstrap-kit/ (numbered prefix → install
order). Same scan as the typed catalog.generated.ts so both surfaces
stay in lock-step.
Per-component states beyond flux-bootstrap are not yet emitted by
catalyst-api; nodeForPhase() already routes phase=bp-<slug> events onto
the matching bubble so wiring the Flux Kustomization watcher on the
backend lights up the rest with no further page work.
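The phase-to-bubble routing described above might be sketched like this. The node ids `hetzner-infra` and `flux-bootstrap` are assumed names for the two supernodes, and treating every `tofu`-prefixed phase as infra is a simplification; only the phase names themselves come from this message.

```typescript
// Assumed supernode ids; the real page may use different identifiers.
const INFRA_NODE = "hetzner-infra";
const FLUX_NODE = "flux-bootstrap";

// Sketch: map an SSE phase name to the id of the bubble it should update.
function nodeForPhase(phase: string): string | null {
  if (phase === "tofu" || phase.startsWith("tofu-")) {
    return INFRA_NODE; // tofu-init, tofu-plan, tofu-output, raw tofu lines
  }
  if (phase === "flux-bootstrap") {
    return FLUX_NODE;
  }
  if (phase.startsWith("bp-") && phase.length > 3) {
    return phase.slice(3); // bp-<slug> routes to the blueprint's own bubble
  }
  return null; // unknown phase: no bubble to update
}
```

Because `bp-<slug>` already resolves to a bubble id, new per-component events from the backend need no change on the page side.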
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Order changes:
- StepOrg is now scoped to name, industry, size, HQ, and compliance only
(email + domain inputs removed)
- New step sequence: Org -> Provider -> Credentials -> Topology ->
Components -> Domain -> Review
- StepDomain captures the admin contact email alongside the Sovereign
FQDN; the email pairs naturally with the deployment's external
surface (Let's Encrypt registration, completion notifications)
- WIZARD_STEPS labels updated to match the new flow
StepReview revamp — single source of truth for the POST body:
- Sections in order: Organisation, Cloud Provider, SSH Access,
Topology (incl. workerSize / workerCount with defensive null-guard),
Components, Domain (admin email lives here)
- Hetzner token + registrar token rendered as fixed-length mask plus
character count, never plaintext (INVIOLABLE-PRINCIPLES.md #10)
- SSH source row distinguishes auto-generated vs. pasted; fingerprint
truncated for readability
- Domain section explicitly shows the resolved FQDN and the chosen mode
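The "fixed-length mask plus character count" rendering for tokens can be sketched as below. The mask glyph, mask width, and label format are assumptions; the point is only that the output never varies with the secret's content.

```typescript
// Assumed mask: twelve bullets regardless of the secret's real length.
const SECRET_MASK = "\u2022".repeat(12);

// Renders a secret for review without echoing any of its characters.
function maskSecret(secret: string): string {
  if (secret.length === 0) {
    return "(not set)";
  }
  return `${SECRET_MASK} (${secret.length} chars)`;
}
```

Keeping the mask width fixed avoids leaking the length through the rendered width, while the explicit count still lets the operator spot a truncated paste.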
Gates: tsc --noEmit clean, vite build green, vitest 146/146 pass,
dev server boots cleanly on /sovereign/.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Operator-reported defect: "if I select spectoer it is brining the entire
fabric family as well, I dont think there is such depenency in reality".
The user is right — the cascade was over-broad.
Root cause: PRODUCTS['cortex'].familyDependencies = ['fabric']. With
CORTEX.cascadeOnMemberSelection = true, selecting any CORTEX member (or
Specter, whose component-level deps include CORTEX members) walked the
family graph and pulled every FABRIC à-la-carte member — Strimzi/Kafka,
Debezium/CDC, Flink, Temporal, ClickHouse, Iceberg, Superset — onto the
selection. Specter and the rest of CORTEX have no runtime dependency on
any of those workloads. The only real cross-family need is cnpg (for
LangFuse) and a Mongo-compatible store (for LibreChat). cnpg is already
mandatory via transitive promotion; the Mongo backend is satisfied by
FerretDB, which is now reached via the corrected component-level dep
(librechat → ferretdb → cnpg).
Changes (one line per change):
- componentGroups.ts: PRODUCTS.cortex.familyDependencies: ['fabric'] → []
- componentGroups.ts: librechat.dependencies: ['cnpg'] → ['ferretdb']
(LibreChat speaks MongoDB, not PG; FerretDB cascades cnpg transitively)
- componentGroups.ts: grafana.dependencies: ['seaweedfs'] → []
(Grafana the dashboard server uses SQLite/PG; only its companion stores
Loki/Mimir/Tempo need object storage)
- StepComponents.test.tsx: regression test "selecting Specter does NOT
auto-select the FABRIC family" + companion tests asserting CORTEX
familyDependencies is empty, librechat → ferretdb, grafana has no deps,
and addProduct(cortex) does not drag à-la-carte FABRIC members.
Verification:
- npm run test (vitest run): 150/150 pass on the worktree
- npx tsc --noEmit: clean
- npm run build: clean
- Live wizard probe (vite dev): addComponent('specter') yields 34 ids,
zero of {strimzi, debezium, flink, temporal, clickhouse, iceberg,
superset} present, full CORTEX family present, librechat → ferretdb
cascade fires.
The CORTEX cascadeOnMemberSelection flag remains true (per issue #175
operator intent: "BGE alone doesn't have much meaning unless we have
Cortex"). FABRIC stays cascadeOnMemberSelection: false (à-la-carte). The
wizard now mirrors real-world component coupling: Specter brings only
the CORTEX runtime members it actually needs.
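The corrected component-level cascade can be sketched as a plain transitive closure over a dependency map. The map below contains only the edges stated in this message (librechat → ferretdb → cnpg, langfuse → cnpg, grafana with no deps); the function name is hypothetical and family-level cascading is deliberately left out.

```typescript
type DependencyMap = Readonly<Record<string, readonly string[]>>;

// Only the edges named in the commit message; everything else is omitted.
const COMPONENT_DEPS: DependencyMap = {
  librechat: ["ferretdb"],
  ferretdb: ["cnpg"],
  langfuse: ["cnpg"],
  grafana: [],
};

// Sketch: expand a selection with its transitive component dependencies.
function expandDeps(selected: readonly string[], deps: DependencyMap): string[] {
  const seen = new Set<string>();
  const stack = [...selected];
  while (stack.length > 0) {
    const id = stack.pop() as string;
    if (seen.has(id)) continue; // also guards against dependency cycles
    seen.add(id);
    for (const dep of deps[id] ?? []) {
      stack.push(dep);
    }
  }
  return Array.from(seen).sort();
}
```

With these edges, selecting `librechat` yields `cnpg`, `ferretdb`, and `librechat`, and selecting `grafana` yields only itself, with no object-store dependency pulled in.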
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Closes the wizard polish gap "Selecting the shapes of the worker
nodes should be there." StepInfrastructure had a worker SKU + count
selector but was never wired into WizardPage.STEPS — the user walks
through StepTopology, where no sizing controls existed.
Adds a NodeSizingPanel inside StepTopology that:
• Renders control-plane and worker SKU cards from
HETZNER_NODE_SIZES (single source of truth — no SKU duplication).
• Exposes a worker-count stepper and editable spinbutton, clamped
to the topology-aware floor (0 for solo, 3 for multi-region) and
a ceiling of 6 to stay inside Hetzner's default project quota.
• Shows the worker SKU grid only when count > 0.
• Surfaces a hard validation error when count > 0 but workerSize
is unset; gates the Topology step's Continue button on the same.
Updates the store's setTopology to seed the worker-count default at
topology-pick time (solo → 0, multi-region → max(current, 3)) so
users land on a sensible default and the existing partialize() rules
keep persisting controlPlaneSize / workerSize / workerCount across
sessions unchanged.
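The clamping and seeding rules above can be sketched as two small functions. The floor, ceiling, and seed values are taken from this message; the two-value topology type is a simplification and the function names are hypothetical.

```typescript
// Simplified: the real wizard may offer more topology templates.
type Topology = "solo" | "multi-region";

const WORKER_CEILING = 6; // stays inside Hetzner's default project quota

function workerFloor(topology: Topology): number {
  return topology === "solo" ? 0 : 3;
}

// Clamp a requested worker count into [floor, ceiling] for the topology.
function clampWorkerCount(topology: Topology, requested: number): number {
  const floor = workerFloor(topology);
  if (!Number.isFinite(requested)) {
    return floor; // e.g. an emptied spinbutton
  }
  return Math.min(Math.max(Math.round(requested), floor), WORKER_CEILING);
}

// Default applied when the topology is picked: solo -> 0, multi-region -> max(current, 3).
function seedWorkerCount(topology: Topology, current: number): number {
  return topology === "solo" ? 0 : Math.max(current, 3);
}
```

Seeding at topology-pick time means a switch from solo to multi-region lands on 3 workers rather than an invalid 0.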
StepReview now renders three chips inside the Infrastructure section
(control plane, workers, compute-total cost rollup) so the SKU + count
choice is visible at launch time, alongside the per-region cards
that were already there.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Launch failed instantly with "create workdir: mkdir /var/lib/catalyst:
permission denied". The catalyst-api Pod runs as UID 65534 with emptyDir
mounts only at /tmp and /home/nonroot — /var/lib was never writable, so
the provisioner.New() default for CATALYST_TOFU_WORKDIR
(/var/lib/catalyst/tofu) failed on the very first MkdirAll call.
Three coupled fixes:
- Set CATALYST_TOFU_WORKDIR=/tmp/catalyst/tofu so the per-deployment
workdir tree lands in the existing /tmp emptyDir.
- Bump cpu limit 100m → 1000m, memory limit 64Mi → 1Gi. tofu init pulls
~80MB hcloud + ~30MB dynadot provider plugins; tofu plan/apply hold
the state file in memory; 64Mi was always going to OOM on first init.
- Grow /tmp emptyDir sizeLimit 256Mi → 2Gi to fit the per-Sovereign
subdirectory tree (provider binaries + state + plan output).
Manifest-only change — Flux reconciles, kubectl rollout swaps the Pod,
no image rebuild required.