Cilium with --set k8sServiceHost=10.0.1.2 (the cp1 private NIC IP) sat
in init phase forever — the agent's API client kept logging
"Establishing connection to apiserver host=https://10.0.1.2:6443" and
never got a response, even though `curl https://10.0.1.2:6443/healthz`
from the host returned 401 (TLS+auth challenge = endpoint reachable).
Switching to k8sServiceHost=127.0.0.1 brought the DaemonSet up
immediately. Verified end-to-end on the live cluster:
    $ kubectl get nodes
    catalyst-omantel-omani-works-cp1 Ready ... 32m v1.31.4+k3s1
The node's local apiserver always binds 127.0.0.1:6443; using that as
the bootstrap apiserver endpoint sidesteps whatever was rejecting the
private-NIC IP route during Cilium's pre-CNI bring-up. Once Cilium is
the CNI and the cluster has real Service VIPs, every other component
reaches the apiserver via the kubernetes.default service as usual.
omantel.omani.works deployment 5cd1bceaaacb71f6 reached Phase 0 success
(10 Hetzner resources up, LB IP 49.12.16.160, DNS committed via PDM)
but stayed silent for 25 minutes — `https://console.omantel.omani.works`
returned no response, every Flux pod was Pending, and the node was
NotReady. SSH'd into the cp1 box (firewall opened temporarily for the
operator IP) and found the canonical CNI bootstrap deadlock:
    Ready: False (KubeletNotReady)
    message: container runtime network not ready: NetworkReady=false
    reason:NetworkPluginNotReady cni plugin not initialized
cloud-init started k3s with --flannel-backend=none + --disable-network-policy
(the right Cilium-ready posture), then immediately applied the Flux
install.yaml. Flux pods are Pending because there is no CNI yet, so
Flux never starts → never reconciles bp-cilium → CNI never installs →
deadlock. The "wait for deployment Available --timeout=300s" line
silently times out and cloud-init proceeds anyway with the Flux
GitRepository + Kustomization that nothing reconciles.
Resolution: install Cilium ONCE in cloud-init via the canonical Helm
chart at the SAME version (1.16.5) that platform/cilium/blueprint.yaml
declares for bp-cilium. When Flux later reconciles
clusters/<sovereign_fqdn>/bootstrap-kit/01-cilium.yaml it adopts the
existing Helm release (release name + namespace match), so the wizard's
ownership model stays single-source-of-truth (Flux + Blueprints) after
the bootstrap exception.
Per INVIOLABLE-PRINCIPLES.md #3, this Helm install is the one-shot
bootstrap exception authorised by "the GitOps engine is Flux —
everything ELSE gets installed by Flux". Cilium IS the CNI Flux needs,
so it cannot be installed by Flux without bootstrapping itself first.
Every other component still flows through the Blueprint pipeline.
Verified: ssh'd into the running omantel cp1 (firewall opened for the
operator IP), ran the same `helm install cilium ...` command this
patch encodes, and the cluster recovered — node Ready, Flux pods
scheduling, GitRepository pulling. Will redeploy from scratch with
the patched cloud-init to validate the full unattended path.
Cloud-init is the Phase-0 OpenTofu artifact baked into the Hetzner
server's user_data, so this change activates on the NEXT `tofu apply`
that creates a new control-plane server. The existing omantel cp1 is
already manually unblocked; new Sovereigns provisioned after the
catalyst-api image carrying this template is rolled out will not hit
the deadlock.
Closes the user-reported regression "this is empty are you sure this is
progressing?" — `/sovereign/provision/<id>` rendered `0 events · done`
even when the deployment succeeded with 10 Hetzner resources, because a
browser that connected after `event: done` arrived at an already-closed
channel with nothing to replay.
API:
- Add `eventsBuf` durable slice (mutex-guarded) on `Deployment`, capped
at 10,000 events with FIFO eviction so a runaway producer cannot OOM.
- Tee every emit through `recordEvent` — single source of truth for the
buffer + the live channel, so they cannot diverge.
- StreamLogs replays the buffer on connect; if the deployment is already
done, replays + emits `event: done` and closes.
- New `GET /api/v1/deployments/{id}/events` returns slice + state JSON
for stateless reconnect / fast-path render.
- `Deployment.State()` includes `numEvents` summary.
- New tests prove buffer fill, replay-on-completed, GET endpoint shape,
and FIFO eviction at cap.
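For illustration, the capped buffer and the tee through `recordEvent`
could be sketched as below. This is a TypeScript rendering of the idea,
not the Go code in catalyst-api: the `ProvisionEvent` shape, the
`EventBuffer` class, and the `live` callback are hypothetical names, and
the mutex is omitted because the sketch is single-threaded.

```typescript
// Hypothetical event shape; the real event type in catalyst-api is not shown here.
interface ProvisionEvent {
  seq: number;
  message: string;
}

// Bounded buffer: keeps at most `cap` events, evicting the oldest first.
class EventBuffer {
  private events: ProvisionEvent[] = [];
  private readonly cap: number;

  constructor(cap: number) {
    this.cap = cap;
  }

  // Single entry point for every emit: store the event, then forward it to
  // the live channel (if any), so buffer and stream cannot diverge.
  recordEvent(ev: ProvisionEvent, live?: (ev: ProvisionEvent) => void): void {
    this.events.push(ev);
    if (this.events.length > this.cap) {
      this.events.shift(); // FIFO eviction once the cap is exceeded
    }
    if (live) {
      live(ev);
    }
  }

  // Snapshot used for replay-on-connect; callers cannot mutate the buffer.
  replay(): ProvisionEvent[] {
    return this.events.slice();
  }
}
```

With a cap of 3 and five recorded events, `replay()` returns only the
last three, while the live callback still sees all five.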
UI:
- ProvisionPage fetches GET /events on mount BEFORE attaching the SSE
stream; replays through `applyEventToContext()` so a deep-link to a
completed deployment renders the FULL history of bubbles + log
entries instead of an empty shell.
- Live SSE `seen` counter de-duplicates the SSE replay-on-connect
against the GET fetch we already applied.
- Elapsed clock anchors on first event time for completed deployments.
- 4 new vitest tests (153 total) cover the GET fetch, completed-state
bubble flip, 404 graceful handling, and elapsed-clock anchor.
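A minimal sketch of the `seen` counter idea, under stated assumptions:
the SSE replay-on-connect delivers the same events in the same order as
the GET fetch, and nothing is evicted between the two. The helper name
`makeReplayFilter` is hypothetical and is not the ProvisionPage code.

```typescript
// Wraps an event handler so the first `alreadyApplied` SSE events are skipped.
function makeReplayFilter<T>(
  alreadyApplied: number,
  apply: (ev: T) => void,
): (ev: T) => void {
  let seen = 0; // number of SSE events received so far
  return (ev: T): void => {
    seen += 1;
    if (seen <= alreadyApplied) {
      return; // already rendered from the GET /events fetch
    }
    apply(ev);
  };
}
```

If the GET fetch applied two events and the stream then replays four,
only the third and fourth reach `apply`.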
Closes #180.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The wizard's /sovereign/provision/<id> page rendered only 2 supernodes
(Hetzner-infra + Flux-bootstrap) instead of the 11 bootstrap-kit
Blueprints + the user's selected components. Verified by grepping the
deployed bundle:
    $ kubectl exec -n catalyst <ui-pod> -- \
        grep -c "bp-cilium\|bp-cert-manager" /usr/share/nginx/html/assets/index-*.js
    0
Root cause: scripts/build-catalog.mjs computes REPO_ROOT relative to the
script's own location and walks platform/<name>/blueprint.yaml,
products/<name>/blueprint.yaml, clusters/_template/bootstrap-kit/. The
docker build context for catalyst-ui was set to
products/catalyst/bootstrap/ui/, so REPO_ROOT in the container resolved
to a directory ABOVE the build context that holds nothing. The script
silently emitted catalog.generated.ts with BOOTSTRAP_KIT = [] and
ALL_BLUEPRINTS = [], shipping an empty bundle.
Three coupled fixes (no bandaid):
1. scripts/build-catalog.mjs — accept OPENOVA_REPO_ROOT env override AND
fail loudly with a clear message if any of platform/, products/,
clusters/_template/bootstrap-kit/ is missing. A future
misconfigured context cannot silently regress the bundle.
2. products/catalyst/bootstrap/ui/Containerfile — build context is now
/repo (the OpenOva repo root). Containerfile COPYs the four needed
subtrees explicitly (platform/, products/, clusters/_template/
bootstrap-kit/, products/catalyst/bootstrap/ui/) and exports
OPENOVA_REPO_ROOT=/repo so the prebuild script picks them up.
3. .github/workflows/catalyst-build.yaml — UI build context flipped from
openova-src/products/catalyst/bootstrap/ui to openova-src. Plus a new
bootstrap-kit smoke test that asserts every bp-* id (cilium,
cert-manager, flux, crossplane, sealed-secrets, spire, nats-jetstream,
openbao, keycloak, gitea) is present in the built bundle. Failure of
this step fails the build — the regression is now caught in CI, not
by the user staring at an empty progress page.
Verified locally: `node scripts/build-catalog.mjs` still emits 11
blueprints when run from the dev path (with no env override set, the
script falls back to the relative-resolve mode).
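The override-plus-fail-loudly behaviour of fix 1 can be sketched roughly
as follows. This is an illustration only, not the contents of
build-catalog.mjs: `resolveRepoRoot`, the injected `dirExists` check, and
the error wording are assumptions; the env variable name and the three
directories come from this message.

```typescript
// Directories the catalog build walks, per the root-cause description above.
const REQUIRED_DIRS = ["platform", "products", "clusters/_template/bootstrap-kit"];

// Picks the repo root (env override first, relative fallback second) and
// throws if any required directory is absent, instead of emitting an
// empty catalog.
function resolveRepoRoot(
  env: Record<string, string | undefined>,
  fallbackRoot: string,
  dirExists: (path: string) => boolean, // injected so the sketch needs no fs access
): string {
  const override = env["OPENOVA_REPO_ROOT"];
  const root = override !== undefined && override !== "" ? override : fallbackRoot;
  const missing = REQUIRED_DIRS.filter((dir) => !dirExists(`${root}/${dir}`));
  if (missing.length > 0) {
    throw new Error(`build-catalog: ${root} is missing: ${missing.join(", ")}`);
  }
  return root;
}
```

Injecting `dirExists` keeps the function pure; a real script would pass
a filesystem check and the process environment.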
Replaces the synthetic 2-tone classification (light=slate-900,
color=slate-100) with a per-brand surface map keyed by each project's
canonical homepage / press-kit colour. Every component's logo tile now
renders against its own brand surface — exactly how each project
displays its mark on its own homepage:
- Alloy → Grafana orange (#FF671D), white wordmark crisp
- FerretDB → navy (#042B41), fawn glyph clearly visible
- Temporal → signature blue (#127ED1), white logo crisp
- Cilium → navy (#1A2236), hexagon mosaic visible
- Grafana → dark navy (#0B0F19), orange-yellow gradient pops
- Cert-manager / OpenSearch → white tile (matches their on-white brand)
- Stalwart → navy (#100E42), coral red wordmark
- Strimzi → navy (#192C47), cyan accent visible
Per-brand surface is theme-INDEPENDENT — homepage logos look the same
regardless of viewer theme, and the wizard mirrors that. The card
BODY surrounding the tile still flips with the wizard theme; only the
LOGO TILE is brand-locked.
Internal letter-mark components without a finalized upstream brand
mark (axon, bge, continuum, specter, powerdns) are assigned distinct
slate / navy tones from the OpenOva platform palette so the letter
reads cleanly and the tile doesn't visually clash with neighbouring
brand tiles in the same family.
Backwards-compatibility shim retained: `getLogoToneStyle` aliases
`getLogoSurface`, so the four call sites (StepComponents, StepReview,
MarketplaceFamilyPage, MarketplaceProductPage) work unchanged. Their
descriptive comments are updated to reflect the per-brand semantics.
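A rough sketch of the surface map and its compatibility alias. The hex
values are the ones listed above; the lowercase component ids, the
`LogoSurface` shape, and the fallback colour are assumptions for
illustration, and the white-tile entries are left out because no hex is
given for them here.

```typescript
interface LogoSurface {
  background: string;
}

// Per-brand tile backgrounds; theme-independent by design.
const BRAND_SURFACES: Record<string, LogoSurface> = {
  alloy: { background: "#FF671D" },
  ferretdb: { background: "#042B41" },
  temporal: { background: "#127ED1" },
  cilium: { background: "#1A2236" },
  grafana: { background: "#0B0F19" },
  stalwart: { background: "#100E42" },
  strimzi: { background: "#192C47" },
};

// Hypothetical fallback for ids without an entry in the map.
const DEFAULT_SURFACE: LogoSurface = { background: "#1E293B" };

function getLogoSurface(id: string): LogoSurface {
  return BRAND_SURFACES[id] ?? DEFAULT_SURFACE;
}

// Backwards-compatibility alias so existing call sites keep working.
const getLogoToneStyle = getLogoSurface;
```

Because the alias is the same function object, call sites that still
import `getLogoToneStyle` get identical results.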
Refs #179
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Every tofu apply on a pool deployment was hitting:
    null_resource.dns_pool[0]: Provisioning with 'local-exec'...
    null_resource.dns_pool[0] (local-exec): (output suppressed due to sensitive value in config)
    Error: Invalid field in API request
    catalyst-dns: write DNS: add *.omantel record: dynadot api error: code=
Two separate code paths were both writing Dynadot records for the same
deployment:
1. The OpenTofu module's null_resource.dns_pool — a local-exec that
shells out to /usr/local/bin/catalyst-dns inside the catalyst-api
container. The binary's request payload is rejected by Dynadot.
2. catalyst-api's pool-domain-manager call — pdm.Commit() at
handler/deployments.go:247 writes the canonical record set with the
LB IP after tofu apply returns. This path works.
Per #168, PDM is the single owner of all pool-domain Dynadot writes.
The null_resource path is a pre-#168 artifact that should have been
removed when PDM took ownership; leaving it in place wrote DNS records
twice (when it worked) and broke the entire provision flow (when it didn't).
Verified end-to-end against the live catalyst-api at
console.openova.io: tofu apply created 7 of 11 Hetzner resources
(network, firewall, subnet, LB, 2 LB services, ssh_key) before
failing at null_resource.dns_pool[0]. With this commit the DNS-write
step disappears from the plan, and PDM /commit handles record
creation after the LB IP is known.
The dynadot_key + dynadot_secret variables in variables.tf remain
declared (provisioner.go still passes them through tfvars.json) but
are no longer referenced by any resource. Removing them is a separate
sweep — left for a follow-up to keep this commit narrowly scoped to
the failure path.
The wizard's recommended Hetzner SKU is CPX32 (4 vCPU AMD / 8 GB / €0.0232/hr)
but the module's variables.tf validation rule only accepted the cx / ccx /
cax families — CPX (AMD shared) was missing entirely. Every Launch through
the wizard hit:
    Error: Invalid value for variable
    on variables.tf line 68: variable "control_plane_size" {
    var.control_plane_size is "cpx32"
    control_plane_size must match Hetzner server-type naming (cxNN | ccxNN | caxNN)
Solo Sovereigns (worker_count = 0) also legitimately have an empty
worker_size — the validation rejected that too:
    Error: Invalid value for variable
    on variables.tf line 91: variable "worker_size" {
    var.worker_size is ""
Both are fixed by extending the regex with the cpx* family AND permitting
the empty string on worker_size when the operator runs a solo Sovereign.
Reproduced end-to-end against the deployed catalyst-api before the fix:
the SSE stream surfaced exactly these two validation errors. With the
regex updated they no longer fire; the run now gets past module
validation and fails only where a real Hetzner token is required.
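The intended acceptance rules can be illustrated in TypeScript as below.
This is a sketch, not the HCL in variables.tf: the exact regex in the
module is not quoted in this message, so the pattern and the function
names are assumptions, as is tying the empty worker_size to
worker_count = 0 inside one check.

```typescript
// Assumed pattern: the previously accepted cx / ccx / cax families plus cpx.
const SERVER_TYPE = /^(cx|cpx|ccx|cax)[0-9]+$/;

function isValidControlPlaneSize(size: string): boolean {
  return SERVER_TYPE.test(size);
}

// An empty worker size is only legitimate for a solo Sovereign (no workers).
function isValidWorkerSize(size: string, workerCount: number): boolean {
  if (size === "") {
    return workerCount === 0;
  }
  return SERVER_TYPE.test(size);
}
```

Under these rules `cpx32` passes as a control-plane size, and an empty
worker size passes only when the worker count is zero.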
The universal `rgba(255,255,255,0.96)` tile from 691467b4 dropped
white-on-transparent brand marks (Temporal, LiveKit, Mimir, Tempo,
Velero, OpenBao …) into a blinding white pill — the user's "almost
nothing is visible" complaint.
Mirrors the SME marketplace's per-asset PNG approach
(https://marketplace.openova.io/apps/) with metadata-driven
backplates instead of universal chrome:
- new `logoTone.ts` classifies every vendored component logo as
`light` (white-glyph, needs slate-900 backplate) or `color`
(full-colour or dark-glyph, reads on slate-100). Both tones are
theme-independent — exactly like marketplace PNGs ship the same
surface regardless of card theme. Empirically validated against
every asset under public/component-logos/ on five candidate
surfaces.
- StepComponents.tsx — `.corp-comp-logo` tile + IconFallback now
consume `getLogoToneStyle(entry.id)`.
- StepReview.tsx — ComponentMiniCard 40×40 tile + LetterFallback
same.
- MarketplaceFamilyPage.tsx — `.mp-related-logo` / `.mp-related-icon`
CSS rules now own geometry only; surface is per-asset inline
style.
- MarketplaceProductPage.tsx — `.mp-product-logo` /
`.mp-product-icon` same pattern on the 80×80 hero tile.
Per-component verification (dark + light wizard themes):
Temporal — light tone → slate-900 backplate, white logo crisp
Cilium — color tone → slate-100, full hexagon visible
Cert-manager — color tone → slate-100, blue badge readable
Grafana — color tone → slate-100, orange G readable
Strimzi — color tone → slate-100, dark mark visible
Keycloak — color tone → slate-100, color badge readable
FerretDB — color tone → slate-100, wordmark + glyph visible
Gates: tsc --noEmit clean · 149/149 vitest tests pass · vite build OK.
The previous image bundled the infra/hetzner/ .tf sources but not the tofu
binary itself, so every Launch failed with:
    tofu init: exec: "tofu": executable file not found in $PATH
Add a dedicated builder stage that downloads OpenTofu v1.11.6 from the
canonical GitHub release, verifies the SHA256 against the upstream
SHA256SUMS file before extraction, and ships the binary into the runtime
image at /usr/local/bin/tofu (mode 0755 so UID 65534 can exec it). The
stage branches on $TARGETARCH (amd64 / arm64) to keep multi-arch buildx
correct; both arch checksums are pinned as build args so version bumps
are an explicit two-line change.
Add a CI smoke step in catalyst-build.yaml's build-api job that runs
`tofu version` inside the freshly-built image and asserts the output
matches EXPECTED_TOFU_VERSION; failure fails the build. Also re-run with
`--user 65534:65534` to gate exec-as-non-root at build time. The prior
infra/hetzner/ presence smoke step is preserved unchanged.
Sibling fix in ProvisionPage's FailureCard: the kubectl hint pointed at
namespace `catalyst-system`, but catalyst-api actually runs in namespace
`catalyst` (per chart/templates/api-deployment.yaml + live cluster).
Replace the namespace literal so the diagnostic command copy-pastes
correctly.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Provision page styled three surfaces with hardcoded
rgba(255,255,255,...) literals rather than the page's theme tokens.
The theme tokens (--s1, --md, --lo) already flip correctly under
.provision-shell[data-theme="light"], so any element painted with
the raw rgba was theme-locked to dark and washed out / invisible
against the light radial-gradient page background.
Three surfaces switched to tokens that already exist on the same
page and flip per-theme:
• DAG bubble label fill (pending state) — colour
rgba(255,255,255,0.45) → var(--lo)
Dark: --lo = rgba(255,255,255,0.40) (≈ same)
Light: --lo = #475569 (slate-600, readable on light bg)
• Live-log info-line text — color rgba(255,255,255,.78)
→ var(--md)
Dark: --md = rgba(255,255,255,0.65)
Light: --md = #334155 (readable on light log panel)
• Live-log meta pill + failure-card hint <code> background —
rgba(255,255,255,.04) → var(--s1)
Dark: --s1 = rgba(255,255,255,0.04) (unchanged)
Light: --s1 = #fff (lifted pill on slate page bg)
The wizard StepReview surfaces (Section / Field / RegionCard /
ComponentMiniCard) and the marketplace family/product pages were
already migrated off raw rgba in 4f6dd10a; logo TILES intentionally
keep rgba(255,255,255,0.96) per the documented contract in
StepComponents.tsx LOGO_TILE_BG (vendored brand marks render in
mixed treatments — dark glyphs designed for white backdrops, white
glyphs on transparent — and a near-white pill keeps every glyph
legible regardless of theme).
Verification:
• npx tsc --noEmit ✓
• npm run build ✓
• ./node_modules/.bin/vitest run — 149 passed (149) ✓
• Live wizard at /sovereign/wizard — every step's section
surfaces and card surfaces render with proper contrast in
BOTH dark and light themes; logo tiles still readable.
• Live marketplace at /sovereign/marketplace/family/cortex
and /sovereign/marketplace/product/axon — flat-section
layout intact, logo tiles crisp.
No layout, no test selectors, no router, no componentGroups.ts,
no providerSizes.ts changes.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The Components section on StepReview rendered both a family-summary
mini-card grid (PILOT M5 / SPINE M5 R1 O1 / …) AND a per-component
card grid below. The summary was a duplicate read of the same data —
each per-component card already shows its family chip, so the strip
above counted what the cards already display. Drop it.
The per-component cards themselves were tiny `auto-fill,
minmax(180px, 1fr)` chips with logo + name + tier letter + family
chip. Replace with a pixel-mirror of the canonical `.stack-card` on
https://marketplace.openova.io/review/ — same horizontal flex
layout, 40×40 logo tile, semibold name, low-key category pill, and
single-line description. Tokens map 1:1 (light theme):
marketplace `--color-bg` → wizard `--wiz-bg-input`
marketplace `--color-border` → wizard `--wiz-border`
marketplace `--color-text-strong` → wizard `--wiz-text-hi`
marketplace `--color-text-dim` → wizard `--wiz-text-md` (desc),
`--wiz-text-sub` (cat)
Card geometry verified pixel-identical to marketplace at 1440px
width: padding 10.4px, gap 10.4px, border-radius 8px, card height
66.078125px, 2-column grid with 8px gap collapsing to 1 column under
700px. Tier (M/R/O) intentionally dropped — not on the canonical
card; the Components step before review already enforces tier
semantics. The legend below the grid goes with it.
Section + Field shells switched from `--wiz-bg-xs` to `--wiz-bg-sub`
so the card surfaces lift visibly off the section background in
light mode — the previous near-white tint was the same colour as the
cards, so cards visually melted into the section ("white-on-white").
Verification:
• npx tsc --noEmit ✓
• npm run build ✓
• ./node_modules/.bin/vitest run — 149 passed (149) ✓
• Live wizard at /sovereign/wizard step 7 — components section
renders 2-col grid of stack-card-shaped components, no family
summary, no tier legend, computed CSS matches marketplace.
POST body to /v1/deployments unchanged. componentGroups.ts,
provider/topology cards, router.tsx untouched.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The catalyst-api Pod is the OpenTofu runner — provisioner.New() reads
CATALYST_TOFU_MODULE_PATH (default /infra/hetzner) and stageModule()
copies the canonical .tf / .tftpl files into a per-deployment workdir
on every Launch. The previous Containerfile did not COPY the module
in, so every Launch failed:
{"level":"ERROR","msg":"provision failed",
"err":"stage tofu module: open /infra/hetzner: no such file or directory"}
Containerfile changes
- Build context is now the public openova repo root (Containerfile
paths COPY from products/catalyst/bootstrap/api/ explicitly).
- New `COPY infra/hetzner/ /infra/hetzner/` brings the FULL tree
(main.tf, variables.tf, outputs.tf, versions.tf, cloudinit-*.tftpl,
README.md) into the runtime image. The path /infra/hetzner/ matches
provisioner.New()'s default and the catalyst-platform Helm chart's
CATALYST_TOFU_MODULE_PATH override.
Workflow changes (.github/workflows/catalyst-build.yaml, build-api job)
- context: openova-src/products/catalyst/bootstrap/api -> openova-src
(the repo root is needed so infra/hetzner/ is in the build context).
- Split build into Build (load: true) + Smoke + Push, mirroring the UI
job pattern. The smoke step runs `ls -la /infra/hetzner/` inside the
built image and asserts main.tf, variables.tf, outputs.tf, versions.tf,
and both cloudinit-*.tftpl files are present. Failure fails the build
— broken images can no longer ship.
Verification (local)
- go vet ./... + go test ./... in products/catalyst/bootstrap/api: clean
- docker build -f products/catalyst/bootstrap/api/Containerfile . at the
repo root succeeds; `docker run --rm --entrypoint sh catalyst-api:test
-c 'ls -la /infra/hetzner/'` lists main.tf, variables.tf, outputs.tf,
versions.tf, cloudinit-control-plane.tftpl, cloudinit-worker.tftpl.
provisioner.go business logic untouched. catalyst-platform Helm chart
api-deployment.yaml untouched (CATALYST_TOFU_MODULE_PATH already aligns
with /infra/hetzner).
The /provision/ route is registered against the router's
internal path; '/sovereign' is the basepath, stripped before matching.
The 'from: "/sovereign/provision/$deploymentId"' lookup matched no
route at runtime — TanStack Router throws 'Invariant failed' for any
useParams call against an unknown route id. A cast was hiding the type
error.
This unblocks the SPA route — /sovereign/provision/<id> now renders the
ProvisionPage without throwing.
The provision page was a 1198-line static public/provision.html artefact
plus two sibling scripts, provision.js and catalog.js. The .html URL was
the visible give-away that the page wasn't first-class: it was rendered
outside the React app, did not share design tokens, did not get bundled,
and could not consume the wizard's zustand store directly. The result
was a page that displayed "omantel.omani-works · SOLO · 0 components ·
Failed" with no actionable detail when something went wrong.
This commit deletes all three static artefacts and ships a real SPA
route at `/sovereign/provision/$deploymentId` instead. Same DAG visual,
same EventSource wiring, same phase→bubble state machine — but as a
React component that:
- reads the deploymentId from URL params (deep-linkable, refresh-safe)
- reads selectedComponents + topology from useWizardStore directly
- resolves the FQDN via resolveSovereignDomain(store) — fixes the
"omantel.omani-works" hyphen bug; the page now shows "omantel.omani.works"
- renders a real FailureCard when SSE surfaces status="failed", carrying
the deployment's actual error message + Retry / Back-to-wizard CTAs
- handles 404 / EventSource error with a clean retry surface
Wiring:
- New /sovereign/provision/$deploymentId route in router.tsx
- StepReview's provision() callback now navigates via router.navigate
instead of window.location.href = path('provision.html')
- BOOTSTRAP_KIT export added to catalog.generated.ts (read from
clusters/_template/bootstrap-kit/ at build time, ordered by NN- prefix)
so the React route can import the same source-of-truth the deleted
catalog.js used to surface as window.CATALYST_CATALOG
- emitPublicCatalog() removed from build-catalog.mjs — no static page
consumes it any more
Files deleted:
- public/provision.html
- public/provision.js
- public/catalog.js
Files added:
- src/pages/provision/ProvisionPage.tsx (1300+ lines: catalog read,
expandWithDependencies, buildNodes, buildEdges, computeLayout,
applyEvent state machine, sidebar, log panel, failure card, status
pill)
Verified: tsc clean, 149/149 vitest tests pass.
Review page packs small fields/cards in horizontal rows instead of stacking
them top-to-bottom. The Components section now renders every selected
component as its own mini-card (logo + name + family chip + tier) so the
operator sees exactly what will be installed, not just family-level
counts. Reduced section padding and dropped redundant whitespace between
rows so the review fits a typical viewport without scrolling.
The provision()-to-/v1/deployments POST body is unchanged — visual only.
Component logos vendored under public/component-logos/ are upstream brand
marks rendered as-shipped — some are dark glyphs designed for white
backdrops, some are white glyphs on transparent (designed for dark
surfaces), some are full-colour. The previous tile (rgba(255,255,255,0.04)
with the icon-fallback using oklch hue rotation) made dark glyphs invisible
in dark mode and white glyphs invisible against the dim tile. Worse, the
contrast story was inconsistent across surfaces — the wizard cards, the
review page, and the marketplace family/product pages each picked their
own background.
This commit pins ONE tile contract used in every place a component logo
renders:
- background: rgba(255,255,255,0.96) (near-white pill, theme-independent)
- border-radius: 10px
- 1px outer border in --wiz-border-sub so the tile doesn't fight the card
- 6px internal padding so tight square SVGs aren't cropped
- IconFallback letter colour pinned to fixed slate (#0f172a) so the letter
reads against the white tile in BOTH dark- and light-mode themes
(--wiz-text-hi flips with the theme and would white-out in dark mode)
Files updated:
- StepComponents.tsx — .corp-comp-logo + IconFallback
- MarketplaceFamilyPage.tsx — .mp-related-logo + .mp-related-icon
- MarketplaceProductPage.tsx — .mp-product-logo + .mp-product-icon
Verified by toggling dark/light theme and walking the wizard +
marketplace pages — every brand mark legible regardless of glyph palette
or theme.
The wizard component cards were copying the SME marketplace's
`app-body { padding-right: 72px }` pattern, which reserves the right
quarter of every card for an absolute-positioned hover-only round Add
button. Combined with one- to three-word `desc` strings, every card
showed a name, a chip line, a single half-line of description, and a
visually empty right column — a quarter of valuable space wasted.
This change restructures the cards around a rigid 4-line grid that
spans the FULL body width:
Line 1 — name (left, flex) + family chip + inline toggle (right)
Line 2 — description line 1 (full width)
Line 3 — description line 2 (full width, two-line clamp)
Line 4 — tier chip + dependency chips + SELECTED dot (right)
Chips appear ONLY on line 1 or line 4, never on lines 2-3. The
`.corp-comp-body` no longer reserves any horizontal padding for
overlay buttons; descriptions use the entire body column.
The toggle affordance is relocated from an absolute-positioned 32×32
overlay (top-right of the card, opacity-0 until hover) to an inline
22×22 round button at the trailing edge of line 1, sharing the chip
row with the family chip. It still fades in on card hover and stays
visible when in-cart, but it occupies a single inline cell instead of
reserving a vertical column.
The bottom-right SELECTED text pill is replaced by a compact green
dot anchored to the right end of line 4. The card already conveys
selection through its green border, green-tinted background, and the
green ✓ toggle button on line 1; the loud text pill duplicated those
signals while crowding the dependency chips on cards with deps.
Every component description in `componentGroups.ts` is rewritten as a
6-10 word professional sentence-fragment distilled from the long-form
`COMPONENT_COPY.positioning` text in `marketplaceCopy.ts`. Same voice:
factual, technical, terse — no hype, no forbidden vocabulary.
Five before/after samples:
flux: "GitOps delivery engine" → "GitOps reconciler driving every Sovereign cluster from Git"
cilium: "CNI & eBPF service mesh" → "eBPF CNI and service mesh with kernel-level policy"
cert-manager:"TLS certificate automation" → "Automated TLS issuance and rotation for every ingress"
grafana: "Dashboards & alerting" → "Curated dashboards across metrics, logs, and traces"
langfuse: "LLM observability & tracing" → "Prompt, completion, and cost tracing for the AI plane"
All 63 component descriptions verified within 6-10 words; no
forbidden vocabulary ("MVP", "for now", "stub", "iterative", "demo");
no marketing fluff. CSS changes preserve the canonical 108px resting
height; tablet/mobile responsive floor unchanged. All 149 vitest
specs continue to pass; existing data-testid selectors
(`toggle-<id>`, `family-chip-<id>`, `tier-<id>`, `selected-<id>`,
`deps-<id>-<dep>`, `includes-<id>`, `component-card-<id>`) are
preserved unchanged.
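The description rules above (6-10 words, no forbidden vocabulary) could
be checked mechanically along these lines. This is a sketch, not part of
the change: `checkDescription` and `wordCount` are hypothetical helpers,
and whole-word, case-insensitive matching is an assumed interpretation
of "forbidden vocabulary".

```typescript
const FORBIDDEN = ["MVP", "for now", "stub", "iterative", "demo"];

// Counts whitespace-separated words; hyphenated terms count as one word.
function wordCount(text: string): number {
  return text.trim().split(/\s+/).filter((w) => w.length > 0).length;
}

// Returns a list of problems; an empty list means the description passes.
function checkDescription(desc: string): string[] {
  const problems: string[] = [];
  const words = wordCount(desc);
  if (words < 6 || words > 10) {
    problems.push(`expected 6-10 words, got ${words}`);
  }
  for (const term of FORBIDDEN) {
    // Whole-word, case-insensitive match so "demo" does not flag "democratic".
    if (new RegExp(`\\b${term}\\b`, "i").test(desc)) {
      problems.push(`forbidden term: ${term}`);
    }
  }
  return problems;
}
```

Run over the samples above, the new flux description passes while the
old three-word one is flagged for length.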
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>