Commit Graph

440 Commits

Author SHA1 Message Date
github-actions[bot]
a646afa041 deploy: update catalyst images to dc07b0d 2026-04-29 12:24:02 +00:00
hatiyildiz
dc07b0d68e merge: logo tile mirrors canonical marketplace treatment (theme-aware, Temporal visible) 2026-04-29 14:21:56 +02:00
hatiyildiz
5ba0c1c53b fix(wizard): logo tile mirrors canonical marketplace treatment (theme-aware, Temporal visible)
The universal `rgba(255,255,255,0.96)` tile from 691467b4 dropped
white-on-transparent brand marks (Temporal, LiveKit, Mimir, Tempo,
Velero, OpenBao …) into a blinding white pill — the user's "almost
nothing is visible" complaint.

Mirrors the SME marketplace's per-asset PNG approach
(https://marketplace.openova.io/apps/) with metadata-driven
backplates instead of universal chrome:

  - new `logoTone.ts` classifies every vendored component logo as
    `light` (white-glyph, needs slate-900 backplate) or `color`
    (full-colour or dark-glyph, reads on slate-100). Both tones are
    theme-independent — exactly like marketplace PNGs ship the same
    surface regardless of card theme. Empirically validated against
    every asset under public/component-logos/ on five candidate
    surfaces.
  - StepComponents.tsx — `.corp-comp-logo` tile + IconFallback now
    consume `getLogoToneStyle(entry.id)`.
  - StepReview.tsx — ComponentMiniCard 40×40 tile + LetterFallback
    same.
  - MarketplaceFamilyPage.tsx — `.mp-related-logo` / `.mp-related-icon`
    CSS rules now own geometry only; surface is per-asset inline
    style.
  - MarketplaceProductPage.tsx — `.mp-product-logo` /
    `.mp-product-icon` same pattern on the 80×80 hero tile.
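
A minimal sketch of the tone lookup described above. Only `getLogoToneStyle` and the `light` / `color` split come from this message; the `LogoTone` type, the map entries, the fallback to `color`, and the hex values (slate-900 and slate-100 from the Tailwind palette) are assumptions for illustration.

```typescript
// Hypothetical sketch of logoTone.ts; the real file may differ.
type LogoTone = "light" | "color";

// Assumed classification for a few ids named in the verification list.
const LOGO_TONE: Record<string, LogoTone> = {
  temporal: "light", // white glyph, needs a dark backplate
  cilium: "color",
  grafana: "color",
};

// Assumed backplate colours: slate-900 and slate-100.
const BACKPLATE: Record<LogoTone, string> = {
  light: "#0f172a",
  color: "#f1f5f9",
};

// Returns an inline style object; unknown ids fall back to `color` (assumption).
function getLogoToneStyle(id: string): { background: string } {
  const tone: LogoTone = LOGO_TONE[id] ?? "color";
  return { background: BACKPLATE[tone] };
}
```

Because the surface is chosen per asset and not per theme, the same call returns the same backplate in dark and light wizard themes.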

Per-component verification (dark + light wizard themes):
  Temporal       — light tone → slate-900 backplate, white logo crisp
  Cilium         — color tone → slate-100, full hexagon visible
  Cert-manager   — color tone → slate-100, blue badge readable
  Grafana        — color tone → slate-100, orange G readable
  Strimzi        — color tone → slate-100, dark mark visible
  Keycloak       — color tone → slate-100, color badge readable
  FerretDB       — color tone → slate-100, wordmark + glyph visible

Gates: tsc --noEmit clean · 149/149 vitest tests pass · vite build OK.
2026-04-29 14:21:12 +02:00
hatiyildiz
cea9621072 merge: bundle OpenTofu CLI in catalyst-api image; fix catalyst-system → catalyst namespace string 2026-04-29 14:08:36 +02:00
hatiyildiz
9b6c297dd8 fix(catalyst-api): bundle OpenTofu CLI in runtime image (pinned + checksum verified)
The previous image bundled the infra/hetzner/ .tf sources but not the tofu
binary itself, so every Launch failed with:

  tofu init: exec: "tofu": executable file not found in $PATH

Add a dedicated builder stage that downloads OpenTofu v1.11.6 from the
canonical GitHub release, verifies the SHA256 against the upstream
SHA256SUMS file before extraction, and ships the binary into the runtime
image at /usr/local/bin/tofu (mode 0755 so UID 65534 can exec it). The
stage branches on $TARGETARCH (amd64 / arm64) to keep multi-arch buildx
correct; both arch checksums are pinned as build args so version bumps
are an explicit two-line change.

Add a CI smoke step in catalyst-build.yaml's build-api job that runs
`tofu version` inside the freshly-built image and asserts the output
matches EXPECTED_TOFU_VERSION; failure fails the build. Also re-run with
`--user 65534:65534` to gate exec-as-non-root at build time. The prior
infra/hetzner/ presence smoke step is preserved unchanged.
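
The version assertion can be sketched as a pure function. This assumes the first line of `tofu version` output has the form `OpenTofu v<semver>` and that the expected version is passed without a leading `v`; both function names are hypothetical, and the real gate is a workflow step, not TypeScript.

```typescript
// Hypothetical helper: extract the version from `tofu version` output.
// Assumes the first line looks like "OpenTofu v1.11.6".
function parseTofuVersion(output: string): string | null {
  const firstLine = output.split("\n")[0] ?? "";
  const match = /^OpenTofu v(\d+\.\d+\.\d+)/.exec(firstLine.trim());
  return match?.[1] ?? null;
}

// Mirrors the smoke step: throw when the bundled binary is not the pinned version.
function assertTofuVersion(output: string, expected: string): void {
  const actual = parseTofuVersion(output);
  if (actual !== expected) {
    throw new Error(`tofu version mismatch: expected ${expected}, got ${actual}`);
  }
}
```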

Sibling fix in ProvisionPage's FailureCard: the kubectl hint pointed at
namespace `catalyst-system`, but catalyst-api actually runs in namespace
`catalyst` (per chart/templates/api-deployment.yaml + live cluster).
Replace the namespace literal so the diagnostic command copy-pastes
correctly.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 14:08:03 +02:00
github-actions[bot]
5e3cd1efbe deploy: update catalyst images to 80db0da 2026-04-29 11:56:13 +00:00
hatiyildiz
80db0da908 merge: contrast audit — restore theme tokens on ProvisionPage non-logo surfaces 2026-04-29 13:54:47 +02:00
hatiyildiz
6327d8db8b fix(wizard): contrast audit — restore theme tokens on non-logo surfaces
Provision page styled three surfaces with hardcoded
rgba(255,255,255,...) literals rather than the page's theme tokens.
The theme tokens (--s1, --md, --lo) already flip correctly under
.provision-shell[data-theme="light"], so any element painted with
the raw rgba was theme-locked to dark and washed out / invisible
against the light radial-gradient page background.

Three surfaces switched to tokens that already exist on the same
page and flip per-theme:

  • DAG bubble label fill (pending state) — colour
    rgba(255,255,255,0.45) → var(--lo)
    Dark: --lo = rgba(255,255,255,0.40) (≈ same)
    Light: --lo = #475569 (slate-600, readable on light bg)

  • Live-log info-line text — color rgba(255,255,255,.78)
    → var(--md)
    Dark: --md = rgba(255,255,255,0.65)
    Light: --md = #334155 (readable on light log panel)

  • Live-log meta pill + failure-card hint <code> background —
    rgba(255,255,255,.04) → var(--s1)
    Dark: --s1 = rgba(255,255,255,0.04) (unchanged)
    Light: --s1 = #fff (lifted pill on slate page bg)
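
The three token flips above, modelled as a lookup table. The values are copied from this message; `resolveToken` is a hypothetical stand-in for the CSS cascade under `.provision-shell[data-theme="..."]`, shown only to make the per-theme behaviour explicit.

```typescript
type Theme = "dark" | "light";
type Token = "--lo" | "--md" | "--s1";

// Values copied from the commit message above.
const TOKENS: Record<Theme, Record<Token, string>> = {
  dark: {
    "--lo": "rgba(255,255,255,0.40)",
    "--md": "rgba(255,255,255,0.65)",
    "--s1": "rgba(255,255,255,0.04)",
  },
  light: {
    "--lo": "#475569",
    "--md": "#334155",
    "--s1": "#fff",
  },
};

// Hypothetical resolver: a raw rgba literal skips this lookup and stays theme-locked.
function resolveToken(theme: Theme, token: Token): string {
  return TOKENS[theme][token];
}
```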

The wizard StepReview surfaces (Section / Field / RegionCard /
ComponentMiniCard) and the marketplace family/product pages were
already migrated off raw rgba in 4f6dd10a; logo TILES intentionally
keep rgba(255,255,255,0.96) per the documented contract in
StepComponents.tsx LOGO_TILE_BG (vendored brand marks render in
mixed treatments — dark glyphs designed for white backdrops, white
glyphs on transparent — and a near-white pill keeps every glyph
legible regardless of theme).

Verification:
  • npx tsc --noEmit                                       ✓
  • npm run build                                          ✓
  • ./node_modules/.bin/vitest run — 149 passed (149)      ✓
  • Live wizard at /sovereign/wizard — every step's section
    surfaces and card surfaces render with proper contrast in
    BOTH dark and light themes; logo tiles still readable.
  • Live marketplace at /sovereign/marketplace/family/cortex
    and /sovereign/marketplace/product/axon — flat-section
    layout intact, logo tiles crisp.

No layout, no test selectors, no router, no componentGroups.ts,
no providerSizes.ts changes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 13:54:03 +02:00
hatiyildiz
5520e91443 merge: review components — drop family summary, pixel-match marketplace.openova.io/review 2026-04-29 13:49:07 +02:00
hatiyildiz
4f6dd10a20 fix(wizard): review components — drop family summary, pixel-match marketplace.openova.io/review
The Components section on StepReview rendered both a family-summary
mini-card grid (PILOT M5 / SPINE M5 R1 O1 / …) AND a per-component
card grid below. The summary was a duplicate read of the same data —
each per-component card already shows its family chip, so the strip
above counted what the cards already display. Drop it.

The per-component cards themselves were tiny `auto-fill,
minmax(180px, 1fr)` chips with logo + name + tier letter + family
chip. Replace with a pixel-mirror of the canonical `.stack-card` on
https://marketplace.openova.io/review/ — same horizontal flex
layout, 40×40 logo tile, semibold name, low-key category pill, and
single-line description. Tokens map 1:1 (light theme):

  marketplace `--color-bg`            → wizard `--wiz-bg-input`
  marketplace `--color-border`        → wizard `--wiz-border`
  marketplace `--color-text-strong`   → wizard `--wiz-text-hi`
  marketplace `--color-text-dim`      → wizard `--wiz-text-md` (desc),
                                                 `--wiz-text-sub` (cat)

Card geometry verified pixel-identical to marketplace at 1440px
width: padding 10.4px, gap 10.4px, border-radius 8px, card height
66.078125px, 2-column grid with 8px gap collapsing to 1 column under
700px. Tier (M/R/O) intentionally dropped — not on the canonical
card; the Components step before review already enforces tier
semantics. The legend below the grid goes with it.

Section + Field shells switched from `--wiz-bg-xs` to `--wiz-bg-sub`
so the card surfaces lift visibly off the section background in
light mode — the previous near-white tint was the same colour as the
cards, so cards visually melted into the section ("white-on-white").

Verification:
  • npx tsc --noEmit                                       ✓
  • npm run build                                          ✓
  • ./node_modules/.bin/vitest run — 149 passed (149)      ✓
  • Live wizard at /sovereign/wizard step 7 — components section
    renders 2-col grid of stack-card-shaped components, no family
    summary, no tier legend, computed CSS matches marketplace.

POST body to /v1/deployments unchanged. componentGroups.ts,
provider/topology cards, router.tsx untouched.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 13:48:28 +02:00
github-actions[bot]
e62fd5f3eb deploy: update catalyst images to 7931f79 2026-04-29 11:46:08 +00:00
hatiyildiz
7931f79ac4 merge: bundle infra/hetzner/ tofu module into catalyst-api image 2026-04-29 13:44:50 +02:00
hatiyildiz
61c6122633 fix(catalyst-api): bundle infra/hetzner/ tofu module into the image
The catalyst-api Pod is the OpenTofu runner — provisioner.New() reads
CATALYST_TOFU_MODULE_PATH (default /infra/hetzner) and stageModule()
copies the canonical .tf / .tftpl files into a per-deployment workdir
on every Launch. The previous Containerfile did not COPY the module
in, so every Launch failed:

    {"level":"ERROR","msg":"provision failed",
     "err":"stage tofu module: open /infra/hetzner: no such file or directory"}

Containerfile changes
- Build context is now the public openova repo root (Containerfile
  paths COPY from products/catalyst/bootstrap/api/ explicitly).
- New `COPY infra/hetzner/ /infra/hetzner/` brings the FULL tree
  (main.tf, variables.tf, outputs.tf, versions.tf, cloudinit-*.tftpl,
  README.md) into the runtime image. The path /infra/hetzner/ matches
  provisioner.New()'s default and the catalyst-platform Helm chart's
  CATALYST_TOFU_MODULE_PATH override.

Workflow changes (.github/workflows/catalyst-build.yaml, build-api job)
- context: openova-src/products/catalyst/bootstrap/api -> openova-src
  (the repo root is needed so infra/hetzner/ is in the build context).
- Split build into Build (load: true) + Smoke + Push, mirroring the UI
  job pattern. The smoke step runs `ls -la /infra/hetzner/` inside the
  built image and asserts main.tf, variables.tf, outputs.tf, versions.tf,
  and both cloudinit-*.tftpl files are present. Failure fails the build
  — broken images can no longer ship.

Verification (local)
- go vet ./... + go test ./... in products/catalyst/bootstrap/api: clean
- docker build -f products/catalyst/bootstrap/api/Containerfile . at the
  repo root succeeds; `docker run --rm --entrypoint sh catalyst-api:test
  -c 'ls -la /infra/hetzner/'` lists main.tf, variables.tf, outputs.tf,
  versions.tf, cloudinit-control-plane.tftpl, cloudinit-worker.tftpl.

provisioner.go business logic untouched. catalyst-platform Helm chart
api-deployment.yaml untouched (CATALYST_TOFU_MODULE_PATH already aligns
with /infra/hetzner).
2026-04-29 13:44:11 +02:00
github-actions[bot]
127398e969 deploy: update catalyst images to 36747a3 2026-04-29 11:39:01 +00:00
hatiyildiz
36747a3b26 merge: provision route invariant fix (use internal route id) 2026-04-29 13:38:00 +02:00
hatiyildiz
18d56ab8b8 fix(provision): use internal route id for useParams (basepath stripped)
The /provision/ route is registered against the router's
internal path; '/sovereign' is the basepath, stripped before matching.
The 'from: "/sovereign/provision/$deploymentId"' lookup matched no
route at runtime — TanStack Router throws 'Invariant failed' for any
useParams call against an unknown route id. Cast was hiding the type
error.
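
A toy model of why the prefixed lookup failed. This is not TanStack Router's implementation; the route id `/provision/$deploymentId` and both helper names are assumptions used to illustrate basepath stripping.

```typescript
// Hypothetical model of basepath stripping; not TanStack Router's actual code.
const BASEPATH = "/sovereign";

// Route ids are registered without the basepath (assumed id).
const ROUTE_IDS = new Set<string>(["/provision/$deploymentId"]);

// Strip the basepath from a browser path before matching.
function stripBasepath(pathname: string, basepath: string = BASEPATH): string {
  if (pathname === basepath) return "/";
  return pathname.startsWith(basepath + "/") ? pathname.slice(basepath.length) : pathname;
}

// A `from` value must be an internal route id, so a basepath-prefixed id is unknown.
function isKnownRouteId(from: string): boolean {
  return ROUTE_IDS.has(from);
}
```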

This unblocks the SPA route — /sovereign/provision/<id> now renders the
ProvisionPage without throwing.
2026-04-29 13:36:34 +02:00
github-actions[bot]
0745945eb8 deploy: update catalyst images to 4e5c75e 2026-04-29 11:17:59 +00:00
hatiyildiz
4e5c75e05c merge: provision as SPA route /sovereign/provision/:deploymentId; fix FQDN, components count, failure UX
# Conflicts:
#	products/catalyst/bootstrap/ui/src/pages/wizard/steps/StepReview.tsx
2026-04-29 13:16:53 +02:00
hatiyildiz
8f8d9c0d8a merge: dense multi-card review rows; per-component cards in Components 2026-04-29 13:15:41 +02:00
hatiyildiz
6a54782c7f merge: neutral high-contrast logo tile across cards, review, marketplace 2026-04-29 13:14:41 +02:00
hatiyildiz
08cd438762 fix(wizard): provision as SPA route /sovereign/provision/:deploymentId; fix FQDN, components count, failure UX
The provision page was a 1198-line static public/provision.html artefact,
one of a triple with sibling provision.js and catalog.js. The .html URL was the
visible give-away that the page wasn't first-class — it was rendered
outside the React app, did not share design tokens, did not get bundled,
and could not consume the wizard's zustand store directly. The result
was a page that displayed "omantel.omani-works · SOLO · 0 components ·
Failed" with no actionable detail when something went wrong.

This commit deletes all three static artefacts and ships a real SPA
route at `/sovereign/provision/$deploymentId` instead. Same DAG visual,
same EventSource wiring, same phase→bubble state machine — but as a
React component that:

- reads the deploymentId from URL params (deep-linkable, refresh-safe)
- reads selectedComponents + topology from useWizardStore directly
- resolves the FQDN via resolveSovereignDomain(store) — fixes the
  "omantel.omani-works" hyphen bug; the page now shows "omantel.omani.works"
- renders a real FailureCard when SSE surfaces status="failed", carrying
  the deployment's actual error message + Retry / Back-to-wizard CTAs
- handles 404 / EventSource error with a clean retry surface

Wiring:
- New /sovereign/provision/$deploymentId route in router.tsx
- StepReview's provision() callback now navigates via router.navigate
  instead of window.location.href = path('provision.html')
- BOOTSTRAP_KIT export added to catalog.generated.ts (read from
  clusters/_template/bootstrap-kit/ at build time, ordered by NN- prefix)
  so the React route can import the same source-of-truth the deleted
  catalog.js used to surface as window.CATALYST_CATALOG
- emitPublicCatalog() removed from build-catalog.mjs — no static page
  consumes it any more
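
The NN- prefix ordering used for BOOTSTRAP_KIT could look roughly like this. The entry names are hypothetical (the real contents of clusters/_template/bootstrap-kit/ are not listed here), and sorting unprefixed names last is an assumption.

```typescript
// Hypothetical entry names, for illustration only.
const entries = ["10-second.yaml", "02-first.yaml", "11-third.yaml"];

// Parse the leading "NN-" prefix; names without one sort last (assumption).
function prefixOf(name: string): number {
  const match = /^(\d+)-/.exec(name);
  return match ? Number(match[1]) : Number.MAX_SAFE_INTEGER;
}

// Order by numeric prefix without mutating the input.
function orderByPrefix(names: readonly string[]): string[] {
  return [...names].sort((a, b) => prefixOf(a) - prefixOf(b));
}
```

Comparing the parsed number, not the raw string, keeps the order stable if a prefix ever drops its leading zero.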

Files deleted:
- public/provision.html
- public/provision.js
- public/catalog.js

Files added:
- src/pages/provision/ProvisionPage.tsx (1300+ lines: catalog read,
  expandWithDependencies, buildNodes, buildEdges, computeLayout,
  applyEvent state machine, sidebar, log panel, failure card, status
  pill)

Verified: tsc clean, 149/149 vitest tests pass.
2026-04-29 13:14:31 +02:00
hatiyildiz
9280cd4a4b fix(wizard): dense multi-card review rows; per-component cards in Components
Review page packs small fields/cards in horizontal rows instead of stacking
them top-to-bottom. The Components section now renders every selected
component as its own mini-card (logo + name + family chip + tier) so the
operator sees exactly what will be installed, not just family-level
counts. Reduced section padding and dropped redundant whitespace between
rows so the review fits a typical viewport without scrolling.

The provision()-to-/v1/deployments POST body is unchanged — visual only.
2026-04-29 13:10:41 +02:00
hatiyildiz
691467b486 fix(wizard): neutral high-contrast logo tile across cards, review, marketplace
Component-logos vendored under public/component-logos/ are upstream brand
marks rendered as-shipped — some are dark glyphs designed for white
backdrops, some are white glyphs on transparent (designed for dark
surfaces), some are full-colour. The previous tile (rgba(255,255,255,0.04)
with the icon-fallback using oklch hue rotation) made dark glyphs invisible
in dark mode and white glyphs invisible against the dim tile. Worse, the
contrast story was inconsistent across surfaces — the wizard cards, the
review page, and the marketplace family/product pages each picked their
own background.

This commit pins ONE tile contract used in every place a component logo
renders:

- background: rgba(255,255,255,0.96) (near-white pill, theme-independent)
- border-radius: 10px
- 1px outer border in --wiz-border-sub so the tile doesn't fight the card
- 6px internal padding so tight square SVGs aren't cropped
- IconFallback letter colour pinned to fixed slate (#0f172a) so the letter
  reads against the white tile in BOTH dark- and light-mode themes
  (--wiz-text-hi flips with the theme and would white-out in dark mode)

Files updated:
- StepComponents.tsx — .corp-comp-logo + IconFallback
- MarketplaceFamilyPage.tsx — .mp-related-logo + .mp-related-icon
- MarketplaceProductPage.tsx — .mp-product-logo + .mp-product-icon

Verified by toggling dark/light theme and walking the wizard +
marketplace pages — every brand mark legible regardless of glyph palette
or theme.
2026-04-29 13:09:37 +02:00
github-actions[bot]
676889d67c deploy: update catalyst images to 4149c44 2026-04-29 10:38:14 +00:00
hatiyildiz
4149c443e4 merge: 4-line card grid; 6-10 word professional descs; full-width text body 2026-04-29 12:36:38 +02:00
hatiyildiz
9af51d980e fix(wizard): 4-line card grid; 6-10 word descs; full-width text body
The wizard component cards were copying the SME marketplace's
`app-body { padding-right: 72px }` pattern, which reserves the right
quarter of every card for an absolute-positioned hover-only round Add
button. Combined with one- to three-word `desc` strings, every card
showed a name, a chip line, a single half-line of description, and a
visually empty right column — a quarter of valuable space wasted.

This change restructures the cards around a rigid 4-line grid that
spans the FULL body width:

  Line 1 — name (left, flex) + family chip + inline toggle (right)
  Line 2 — description line 1 (full width)
  Line 3 — description line 2 (full width, two-line clamp)
  Line 4 — tier chip + dependency chips + SELECTED dot (right)

Chips appear ONLY on line 1 or line 4, never on lines 2-3. The
`.corp-comp-body` no longer reserves any horizontal padding for
overlay buttons; descriptions use the entire body column.

The toggle affordance is relocated from an absolute-positioned 32×32
overlay (top-right of the card, opacity-0 until hover) to an inline
22×22 round button at the trailing edge of line 1, sharing the chip
row with the family chip. It still fades in on card hover and stays
visible when in-cart, but it occupies a single inline cell instead of
reserving a vertical column.

The bottom-right SELECTED text pill is replaced by a compact green
dot anchored to the right end of line 4. The card already conveys
selection through its green border, green-tinted background, and the
green ✓ toggle button on line 1; the loud text pill duplicated those
signals while crowding the dependency chips on cards with deps.

Every component description in `componentGroups.ts` is rewritten as a
6-10 word professional sentence-fragment distilled from the long-form
`COMPONENT_COPY.positioning` text in `marketplaceCopy.ts`. Same voice:
factual, technical, terse — no hype, no forbidden vocabulary.

Five before/after samples:
  flux:        "GitOps delivery engine"           → "GitOps reconciler driving every Sovereign cluster from Git"
  cilium:      "CNI & eBPF service mesh"          → "eBPF CNI and service mesh with kernel-level policy"
  cert-manager:"TLS certificate automation"       → "Automated TLS issuance and rotation for every ingress"
  grafana:     "Dashboards & alerting"            → "Curated dashboards across metrics, logs, and traces"
  langfuse:    "LLM observability & tracing"      → "Prompt, completion, and cost tracing for the AI plane"

All 63 component descriptions verified within 6-10 words; no
forbidden vocabulary ("MVP", "for now", "stub", "iterative", "demo");
no marketing fluff. CSS changes preserve the canonical 108px resting
height; tablet/mobile responsive floor unchanged. All 149 vitest
specs continue to pass; existing data-testid selectors
(`toggle-<id>`, `family-chip-<id>`, `tier-<id>`, `selected-<id>`,
`deps-<id>-<dep>`, `includes-<id>`, `component-card-<id>`) are
preserved unchanged.
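
A sketch of the 6-10 word and forbidden-vocabulary check. The forbidden list is quoted from this message; the function names, whitespace-based word counting, and case-insensitive substring matching are assumptions, and the substring match is deliberately crude.

```typescript
// Forbidden terms quoted in the commit message.
const FORBIDDEN = ["MVP", "for now", "stub", "iterative", "demo"];

// Count whitespace-separated words; "kernel-level" counts as one word (assumption).
function wordCount(desc: string): number {
  return desc.trim().split(/\s+/).filter(Boolean).length;
}

// Hypothetical check: 6-10 words and no forbidden term.
function isValidDescription(desc: string): boolean {
  const n = wordCount(desc);
  const lower = desc.toLowerCase();
  const hasForbidden = FORBIDDEN.some((term) => lower.includes(term.toLowerCase()));
  return n >= 6 && n <= 10 && !hasForbidden;
}
```

Run against the samples above, the rewritten flux description passes while the original three-word "GitOps delivery engine" fails on length.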

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 12:35:57 +02:00
hatiyildiz
570147cd8f merge: canonical SKU catalogs (Hetzner CPX32 recommended; Huawei c7n.xlarge.2; OCI E5.Flex.2.16; AWS m6i.xlarge; Azure D4s_v5) 2026-04-29 12:32:08 +02:00
hatiyildiz
183c3066f2 fix(wizard): canonical SKU catalog from each provider's pricing page (Hetzner, Huawei, OCI, AWS, Azure)
Replaces the guessed per-provider SKU catalog with values that match what
each cloud provider publishes on its canonical pricing page today
(snapshot 2026-04-29). The confused CX (Intel) vs CPX (AMD) vs CAX (ARM) vs
CCX (dedicated) labels are gone — each id, label, vCPU/RAM/disk spec, and
EUR price now comes from the source pricing page directly.

Hetzner   (19 SKUs): full CX23/33/43/53 (Intel), CPX22/32/42/52/62 (AMD),
                     CAX11/21/31/41 (ARM), CCX13/23/33/43/53/63 (dedicated).
                     Recommended: CPX32 — 4 vCPU AMD / 8 GB / 160 GB SSD,
                     €0.0232/hr €14.49/mo (founder-stated EU starter).
                     Sources: hetzner.com/cloud/regular-performance,
                     /cost-optimized, /general-purpose.
Huawei    (11 SKUs): s7 / c7n / m7 families across 2/4/8/16 vCPU sizes.
                     Recommended: c7n.xlarge.2 (4 vCPU / 8 GB).
                     Source: huaweicloud.com/intl/en-us/product/ecs/pricing.html
                     (specs cross-checked on Cloud Mercato).
OCI       (11 SKUs): VM.Standard.E5.Flex (AMD Genoa), .E4.Flex (Milan),
                     .Standard3.Flex (Intel), .A1.Flex (Ampere ARM).
                     Recommended: VM.Standard.E5.Flex (2 OCPU / 16 GB).
                     Source: oracle.com/cloud/compute/pricing/
                     ($0.030/OCPU + $0.002/GB AMD; $0.010/OCPU ARM).
AWS       (15 SKUs): m6i / c6i / r6i (Intel Ice Lake) plus m7g (Graviton3
                     ARM) at .large/.xlarge/.2xlarge/.4xlarge.
                     Recommended: m6i.xlarge (4 vCPU / 16 GB).
                     Source: aws.amazon.com/ec2/pricing/on-demand/
                     (us-east-1 Linux on-demand, verified on Vantage).
Azure     (10 SKUs): Dsv5 / Esv5 / Dpsv5 v5 generation (Intel + Ampere ARM)
                     at 2/4/8/16 vCPU sizes.
                     Recommended: Standard_D4s_v5 (4 vCPU / 16 GB).
                     Source: azure.microsoft.com/en-us/pricing/details/
                     virtual-machines/linux/ (West Europe, verified on Vantage).

NodeSize interface gains `disk: number | string` (local SSD GB or
"EBS-only"/"Variable") and `priceMonth: number` (Hetzner cap; hyperscaler
hour×730). USD list prices converted to EUR at 1 USD = 0.92 EUR (snapshot
2026-04, applied once at table-build time via priceUSDtoEUR helper).
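
A sketch of the conversion and monthly-cap arithmetic. `priceUSDtoEUR`, the 0.92 rate, the ×730 factor, and the `disk` / `priceMonth` fields come from this message; the remaining fields, the helper signature, and `hyperscalerSize` are assumptions.

```typescript
// Shape described in the message; fields other than `disk` and `priceMonth` are assumed.
interface NodeSize {
  id: string;
  disk: number | string; // local SSD GB, or "EBS-only" / "Variable"
  priceHour: number; // EUR per hour (assumed field name)
  priceMonth: number; // EUR per month
}

const USD_TO_EUR = 0.92; // snapshot rate stated in the message
const HOURS_PER_MONTH = 730; // hyperscaler month approximation stated in the message

// Name taken from the message; the signature is an assumption.
function priceUSDtoEUR(usd: number): number {
  return usd * USD_TO_EUR;
}

// Hypothetical builder for a hyperscaler entry from a USD hourly list price.
function hyperscalerSize(id: string, disk: number | string, usdPerHour: number): NodeSize {
  const priceHour = priceUSDtoEUR(usdPerHour);
  return { id, disk, priceHour, priceMonth: priceHour * HOURS_PER_MONTH };
}
```

Hetzner entries would carry the published monthly cap directly and skip the ×730 step.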

StepProvider sublabel now renders disk + monthly cap alongside vCPU/RAM/
hourly. Stale comment references to "cx32"/"cx42" updated to "CPX32" (the
canonical Hetzner page calls it CPX32, never "CX32 — Standard").

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 12:31:32 +02:00
hatiyildiz
3864eef4e7 docs(reconcile-pass-2): align docs with ground truth at 6afdb303
- Wizard step canonical order updated to Org → Topology → Provider →
  Credentials → Components → Domain → Review (RUNBOOK-PROVISIONING,
  DEMO-RUNBOOK, IMPLEMENTATION-STATUS); SKU pickers cross-ref the
  PROVIDER_NODE_SIZES per-provider catalog (#176).
- StepComponents UX rewritten: single flat marketplace card grid with
  family chips + product/family routes, two tabs (Choose Your Stack +
  Always Included) — replaces the prior "two-tab Mandatory infra/Apps"
  + "grouped by product header" prose (PRODUCT-FAMILIES, RUNBOOK-
  PROVISIONING, DEMO-RUNBOOK, COMPONENT-LOGOS).
- CORTEX familyDependencies = [] reflected in PRODUCT-FAMILIES; the
  Specter / BGE cascade narratives rewritten to component-level-only
  resolution (langfuse → cnpg, librechat → ferretdb → cnpg) — fixes
  the "selecting Specter pulls entire FABRIC" over-broad claim.
- catalyst-api OpenTofu workdir realigned from /var/lib/catalyst/...
  to /tmp/catalyst/tofu/<fqdn>/ via CATALYST_TOFU_WORKDIR env var
  (commit 27527e4c) — fixes runtime drift in RUNBOOK-PROVISIONING,
  SOVEREIGN-PROVISIONING, DEMO-RUNBOOK; DEMO-RUNBOOK kubectl exec
  ns corrected from catalyst-system to catalyst.
- Logo asset story rewritten: 58 logos (44 SVG + 14 PNG) sourced from
  CNCF artwork + project repos at #169b1d1c/#30ff318d, replacing the
  prior 62 stylised in-house marks; CI smoke-test (#6a7d2dd8)
  cross-referenced.
- 12 G2 bootstrap-kit charts (original 11 + bp-powerdns #167) aligned
  in PROVISIONING-PLAN Group F + blueprint-release.yaml comment +
  SOVEREIGN-PROVISIONING header; previously stale at 11.
- README repo-structure note updated: 12-component bootstrap kit +
  axon + external-dns leaf chart are built; 45 platform / 4 product
  folders remain README-only (was: "every folder except axon").
- ORCHESTRATOR-STATE main-tip SHA advanced from dd578d1c to 6afdb303
  with one-line summary of the post-Pass-1 batch.
- VALIDATION-LOG: Reconcile Pass 2 entry appended (drift fixed across
  10 files; six-category rubric).

Reconcile Pass 2 against main @ 6afdb303 — 10 files patched plus
VALIDATION-LOG entry. Doc patches are landing first so the in-flight
wizard step-reorder branch will merge into a doc set that already
names the canonical order, avoiding a second drift round.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 11:48:57 +02:00
github-actions[bot]
c7ca789b7e deploy: update catalyst images to 8aee179 2026-04-29 09:48:20 +00:00
hatiyildiz
8aee17994b merge: topology before provider; per-provider SKU catalog; per-region sizing 2026-04-29 11:46:29 +02:00
hatiyildiz
4ee9e7dd6f fix(wizard): topology before provider; per-provider SKU catalog; per-region sizing
The wizard step order was inverted: it asked for the provider before the
topology, then put hetzner-only SKUs inside the topology step. Topology
decides how many regions exist; provider is a per-region property; SKU
vocabulary is per-provider (cx32 means nothing on Azure). Fixes all three.

New step order (WIZARD_STEPS + WizardPage STEPS): Org -> Topology ->
Provider -> Credentials -> Components -> Domain -> Review.

Per-provider SKU catalog at products/catalyst/bootstrap/ui/src/shared/
constants/providerSizes.ts replaces the legacy hetzner-only HETZNER_NODE_SIZES.
Five providers (hetzner, huawei, oci, aws, azure), each with realistic SKU
options drawn from that vendor's native instance-type vocabulary. Every
SKU read in the wizard goes through PROVIDER_NODE_SIZES[provider] -- no
SKU literal lives anywhere else.

StepProvider now renders one card per topology slot. Each card carries:
provider chooser, that provider's region picker, that provider's
control-plane SKU, that provider's worker SKU + count. Cost rollup sums
each region's (cp + worker*count) at its OWN provider's pricing, so a
mixed-cloud topology computes correctly.
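
The rollup rule, cp + worker*count per region at that region's own provider pricing, sketched with made-up numbers. The `RegionSelection` shape, the `HOURLY` table, and its prices are hypothetical; the real lookup goes through PROVIDER_NODE_SIZES.

```typescript
interface RegionSelection {
  provider: string;
  controlPlaneSize: string;
  workerSize: string;
  workerCount: number;
}

// Hypothetical hourly prices in EUR, keyed by provider then SKU id.
const HOURLY: Record<string, Record<string, number>> = {
  hetzner: { small: 0.01, large: 0.02 },
  aws: { small: 0.1, large: 0.2 },
};

// One region: control plane plus workers, priced by that region's provider.
function regionCost(r: RegionSelection): number {
  const prices = HOURLY[r.provider] ?? {};
  const cp = prices[r.controlPlaneSize] ?? 0;
  const worker = prices[r.workerSize] ?? 0;
  return cp + worker * r.workerCount;
}

// Mixed-cloud total: sum of per-region costs.
function totalCost(regions: readonly RegionSelection[]): number {
  return regions.reduce((sum, r) => sum + regionCost(r), 0);
}
```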

StepTopology drops the SkuCard + NodeSizingPanel; it now captures only
the topology template, HA flag, and AIR-GAP add-on.

Per-region store fields (regionControlPlaneSizes, regionWorkerSizes,
regionWorkerCounts) replace the singular controlPlaneSize/workerSize/
workerCount as the canonical shape. Migration in store.merge() hydrates
the arrays from any persisted singular fields; the cx22 legacy default
is treated as "no selection" so a hetzner-only id never leaks into a
non-hetzner region.
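
One way the hydration step could look. The field names come from this message; the `PersistedState` type, the empty string as "no selection", and copying the singular value to every region are assumptions, not the actual store.merge() code.

```typescript
interface PersistedState {
  controlPlaneSize?: string;
  workerSize?: string;
  workerCount?: number;
  regionControlPlaneSizes?: string[];
  regionWorkerSizes?: string[];
  regionWorkerCounts?: number[];
}

const LEGACY_DEFAULT = "cx22"; // treated as "no selection" per the message

// Hypothetical: "" stands for "no selection".
function normalizeSku(id: string | undefined): string {
  return !id || id === LEGACY_DEFAULT ? "" : id;
}

// Build an array holding `value` once per region.
function fill<T>(value: T, count: number): T[] {
  return Array.from({ length: count }, () => value);
}

// Hydrate per-region arrays from singular fields when the arrays are absent.
function migrate(p: PersistedState, regionCount: number) {
  return {
    regionControlPlaneSizes:
      p.regionControlPlaneSizes ?? fill(normalizeSku(p.controlPlaneSize), regionCount),
    regionWorkerSizes: p.regionWorkerSizes ?? fill(normalizeSku(p.workerSize), regionCount),
    regionWorkerCounts: p.regionWorkerCounts ?? fill(p.workerCount ?? 0, regionCount),
  };
}
```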

Backend Request gains an optional Regions []RegionSpec field. Validate
mirrors Regions[0] into the legacy singular fields for the existing
solo-Hetzner writeTfvars path. infra/hetzner/variables.tf accepts the
list-of-objects shape; the for_each iteration that activates the rest
of the regions is the multi-region tofu wiring follow-up. Door open
structurally; no shape compromised.

Dead code removed: StepInfrastructure and shared/constants/hetzner.ts
(both orphaned, contained the only HETZNER_NODE_SIZES reference outside
the catalog).

Gates: tsc --noEmit, vite build, vitest (149 tests), go vet, go test
(provisioner + handler).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 11:44:33 +02:00
hatiyildiz
6afdb3038c merge: align marketplace family/product pages with canonical marketplace tokens 2026-04-29 11:39:07 +02:00
hatiyildiz
a44408e095 fix(wizard): align marketplace family/product pages with canonical marketplace tokens
Replaces the bespoke hero-card + ad-hoc spacing pattern in
MarketplaceFamilyPage and MarketplaceProductPage with a layout that
mirrors the canonical marketplace at https://marketplace.openova.io/apps/
(source: core/marketplace/src/components/AppDetail.svelte).

Tokens aligned:
  - H1 hero            1.5rem / 700 / --wiz-text-hi (canonical 24px / 700)
  - Section H2         1rem / 600 / --wiz-text-hi (canonical 16px / 600)
  - Subtitle           0.9rem / --wiz-text-sub (canonical 14.4px / dim)
  - Body paragraph     0.9rem / line-height 1.7 (canonical 14.4px / 1.7)
  - Bullets            0.85rem with 6px green dot bullet (canonical match)
  - Tier pills         0.62rem uppercase, family-tinted bg (mirrors
                       .detail-meta span, with light-mode WCAG override)
  - Member tile        36×36 logo + name + 2-line tagline + tier pill
                       (mirrors canonical .related-card)
  - Product hero       80×80 logo / centred body / right-aligned CTA
                       (mirrors canonical .detail-hero)
  - CTA buttons        0.6rem 1.4rem / radius 8 / 0.88rem / 600 (mirrors
                       canonical .detail-add)
  - Family chips       4px-radius accent-tinted, low-opacity (mirrors
                       canonical .detail-cat)
  - Dependency tiles   surface chip with mono name + dim group label
                       (mirrors canonical .detail-dependencies li)
  - Sections           flat, divided by 1px subtle border (mirrors
                       canonical .detail-section)
  - Hover state        border → accent + 1px lift (canonical match)

Removed:
  - Custom rounded-14px hero card with full background fill
  - Inline-style "made-up" right-arrow on member rows (replaced with
    actual component logos)
  - Stacked tier pill + button column (replaced with canonical's
    horizontal meta-row + right-aligned CTA pattern)
  - 1.05rem section h2 (canonical is 1rem)
  - 1.6 paragraph line-height (canonical is 1.7)

Forbidden words audit clean: no "MVP", "for now", "stub", "iterative",
"demo" in copy. Family palette colours preserved (sky/violet/amber/
emerald/rose/pink/indigo/cyan) — they are the canonical
brand-identification tier and align with the marketplace's role of
distinguishing platform families.

Tests: all 145 vitest cases pass; tsc --noEmit clean; vite build clean.
componentGroups.ts and StepComponents.tsx untouched per parallel-agent
ownership.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 11:38:28 +02:00
hatiyildiz
0b432dd711 fix(wizard): pixel-match component cards to canonical marketplace UX
The d3346441 family-chip refactor bumped the wizard component-card
height from 108px to 130px and added an always-visible "Add / Selected"
pill button at bottom-right with a 1.85rem padding-bottom carved into
the card body. That broke the documented "pixel-match SME marketplace"
contract — the corporate Catalyst wizard cards no longer matched
https://marketplace.openova.io/apps/.

Restore the canonical SME marketplace card surface:
  - card height: 130px → 108px (read-only path stays 108px, no special-case)
  - body padding: 0.4rem right + 1.85rem bottom → 4.5rem right (no bottom)
  - replace the bottom-right "Select / Selected" pill button with the
    canonical 32×32 round icon button at top-right (Plus → Check),
    opacity 0 by default, opacity 1 on card hover, always visible
    when in-cart (mirrors AppsStep.svelte .app-add-btn 1:1)
  - re-introduce the bottom-right SELECTED status pill (only when
    in-cart) — mirrors AppsStep.svelte .status-corner / .s-selected
  - render dependencies as one chip per dep ("+ DepName"), matching
    AppsStep's chip-dep pattern (replaces the single deps-count chip
    + extra paragraph that forced the height bloat)
  - keep the test/a11y `includes-<id>` paragraph but absolute-position
    it off-screen (sr-only) so layout stays at 108px

Affordance reconciliation (no card-height growth):
  - the entire card is now an anchor to /marketplace/product/<id>,
    matching SME's `<a href="/app?slug=X" class="app-card">` wrapper
  - the family chip nested inside is a `<Link>` to
    /marketplace/family/<id> with stopPropagation
  - the round +/✓ button stops propagation and toggles selection via
    the wizard store (data-testid=`toggle-<id>` preserved for tests)
  - all three navigation surfaces preserved: family chip → family
    portfolio, card body → product detail, +/✓ button → wizard store

Read-only Tab 2 ("Always Included") path unchanged behaviourally —
renders as a plain `<div>` (not a `<Link>`) so it stays inert.

All 145 vitest cases pass (including the 89 in StepComponents.test.tsx).
TypeScript clean. Production vite build clean.

Refs: docs/INVIOLABLE-PRINCIPLES.md #2 (never compromise quality —
the SME marketplace IS the proven shape; do not diverge from it).
2026-04-29 11:29:13 +02:00
github-actions[bot]
c0f3e63ffc deploy: update catalyst images to b0ec0c4 2026-04-29 08:51:14 +00:00
hatiyildiz
b0ec0c4300 merge: family chips, product detail, family portfolio routes 2026-04-29 10:49:22 +02:00
hatiyildiz
6a7d2dd89b ci(catalyst-build): align UI smoke-test asset list with canonical extensions
Agent 1 (#176 logos) sourced each component's official upstream brand
mark in whatever format the project itself publishes — most projects
ship SVG, but Grafana docs (loki/mimir/tempo), Aqua (trivy), Anchore
(syft-grype), the LangFuse repo, vLLM, Ntfy, FerretDB, OpenMeter,
Coraza, External-DNS, NetBird, and StrongSwan only publish PNG. The
old smoke test hard-asserted every spot-checked id resolved as
.svg, so the langfuse PNG broke the build.

Replaced the hardcoded extension loop with an explicit list of full
paths matching componentGroups.ts. Every entry mirrors the actual
logoUrl the wizard renders, so a missing or mis-named asset still
fails the build — but in lockstep with the data file, not against
a stale extension assumption.
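
A minimal sketch of the check described above, assuming each entry exposes an optional `logoUrl` and that the smoke test can enumerate the files in the build output. The `LogoEntry` shape, the function name, and the sample paths are illustrative and not copied from the workflow or from componentGroups.ts.

```typescript
// Hypothetical entry shape; the real entries live in componentGroups.ts.
interface LogoEntry {
  id: string;
  logoUrl: string | null; // null means the card uses the letter-mark fallback
}

// Returns every logo path the data file references but the build output lacks.
function missingLogoAssets(
  entries: LogoEntry[],
  shippedFiles: ReadonlySet<string>,
): string[] {
  return entries
    .map((entry) => entry.logoUrl)
    .filter((url): url is string => url !== null)
    .filter((url) => !shippedFiles.has(url));
}

// Illustrative data only; the paths are assumed.
const sampleEntries: LogoEntry[] = [
  { id: "cilium", logoUrl: "/component-logos/cilium.svg" },
  { id: "langfuse", logoUrl: "/component-logos/langfuse.png" },
  { id: "powerdns", logoUrl: null },
];
const sampleFiles = new Set<string>([
  "/component-logos/cilium.svg",
  "/component-logos/langfuse.png",
]);
const missingAssets = missingLogoAssets(sampleEntries, sampleFiles);
```

A non-empty result would fail the build; because the list is derived from full paths, a PNG-only logo no longer trips an `.svg` assumption.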
2026-04-29 10:49:09 +02:00
hatiyildiz
d3346441d6 feat(wizard): family chips, product detail, family portfolio routes 2026-04-29 10:48:22 +02:00
hatiyildiz
c78041c518 merge: reorder wizard steps (domain after components), revamp review
# Conflicts:
#	products/catalyst/bootstrap/ui/src/pages/wizard/steps/StepReview.tsx
2026-04-29 10:42:24 +02:00
hatiyildiz
8aec6244c5 merge: worker SKU + count selector in topology step 2026-04-29 10:40:46 +02:00
hatiyildiz
60e403ae6b merge: dynamic DAG + SSE wiring on provision page 2026-04-29 10:40:43 +02:00
hatiyildiz
56519aef5f merge: dependency mapping audit (fixes Specter→FABRIC and other bogus edges)
# Conflicts:
#	products/catalyst/bootstrap/ui/src/pages/wizard/steps/componentGroups.ts
2026-04-29 10:40:41 +02:00
hatiyildiz
169b1d1c70 merge: original product logos (replaces stylized placeholders with canonical upstream marks) 2026-04-29 10:38:56 +02:00
hatiyildiz
30ff318d0d fix(wizard): use canonical upstream logos for component cards
Every platform-component card now renders the OFFICIAL upstream brand
mark instead of a stylized OpenOva placeholder. Logos are sourced from
the CNCF artwork repo and each project's own repository:

  Source                           Components
  ────────────────────────────────────────────────────────────────────
  cncf/artwork                     cert-manager, cilium, cnpg
                                   (cloudnativepg), crossplane, envoy,
                                   external-secrets (eso), falco,
                                   flux, harbor, keda, keycloak,
                                   knative, kserve, kyverno, litmus,
                                   opentelemetry, opentofu, sigstore,
                                   strimzi, vpa (kubernetes)
  Project repo                     alloy, clickhouse, debezium,
                                   ferretdb, frpc, gitea, grafana,
                                   iceberg, kserve, langfuse,
                                   librechat, livekit, loki, matrix,
                                   milvus, mimir, neo4j, netbird,
                                   ntfy, openbao, openmeter,
                                   opensearch, reloader, seaweedfs,
                                   stalwart, strongswan, stunner,
                                   superset, syft-grype, temporal,
                                   tempo, trivy, valkey, vcluster,
                                   velero, vllm, flink, coraza

44 components ship as SVG; 14 components whose upstream publishes only
PNG marks (Loki, Mimir, Tempo, Trivy, NetBird, ntfy, OpenMeter, vLLM,
Coraza, FerretDB, Syft+Grype, External-DNS, strongSwan, LangFuse) ship
as `<id>.png` with an explicit `logoUrl` override.

Five components retain `logoUrl: null` (letter-mark fallback): the
existing PowerDNS plus BGE (a model-family identifier rather than a
branded product) and the OpenOva-internal Axon, Continuum, Specter
components whose brand marks are not yet finalized.
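
The three cases above (default SVG, explicit PNG override, `null` letter-mark fallback) can be sketched as a single resolver. This assumes the default path is derived from the component id; the `ComponentLogoSpec` shape, the `resolveLogo` name, and the base directory are hypothetical.

```typescript
// Hypothetical spec shape; only `logoUrl` and its null fallback come from the commit.
interface ComponentLogoSpec {
  id: string;
  name: string;
  logoUrl?: string | null;
}

type ResolvedLogo =
  | { kind: "image"; src: string }
  | { kind: "letter"; letter: string };

function resolveLogo(
  spec: ComponentLogoSpec,
  baseDir: string = "/component-logos", // assumed public path
): ResolvedLogo {
  // Explicit null: no brand mark available, render the letter-mark fallback.
  if (spec.logoUrl === null) {
    return { kind: "letter", letter: spec.name.charAt(0).toUpperCase() };
  }
  // An explicit override (for example a PNG) wins; otherwise assume <id>.svg.
  const src = spec.logoUrl ?? `${baseDir}/${spec.id}.svg`;
  return { kind: "image", src };
}

const svgLogo = resolveLogo({ id: "cilium", name: "Cilium" });
const pngLogo = resolveLogo({
  id: "loki",
  name: "Loki",
  logoUrl: "/component-logos/loki.png",
});
const letterLogo = resolveLogo({ id: "bge", name: "BGE", logoUrl: null });
```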

Card markup, `depends:`, and family flags are intentionally not
touched in this commit (handled by parallel agents).

Quality gates:
  - npx tsc --noEmit            green
  - npm run build               green
  - vitest StepComponents.test  90/90 passed
2026-04-29 10:34:29 +02:00
hatiyildiz
a02f33cec0 feat(wizard): dynamic DAG + SSE wiring on provision page
Drop the 1100-line static-mock provision.html in favour of a runtime-
generated DAG keyed off the wizard's persisted localStorage state and the
build-time blueprint catalog. Bubbles, edges, sub-progress, log routing
and final CTA are all computed from real backend data.

What is now dynamic:
- Hardcoded NODES/TOPO/EDGES/LOGS arrays gone — DAG is built from
  window.CATALYST_CATALOG (components + bootstrap-kit) and the wizard
  selection at page load.
- One Hetzner-infra supernode and one Flux-bootstrap supernode anchor the
  graph; bootstrap-kit Blueprints render in numeric install order; user
  selection from selectedComponents (with transitive HARD deps expanded
  via blueprint.depends) makes up the rest.
- EventSource wired to <BASE>api/v1/deployments/<id>/logs. Phase events
  drive bubble state transitions (tofu-init|tofu-plan run Hetzner-infra
  through 0→.30 progress; raw `tofu` lines parse hcloud_network/
  hcloud_firewall/hcloud_server/hcloud_load_balancer markers to advance
  the supernode's sub-progress; tofu-output finishes it; flux-bootstrap
  opens the second supernode).
- Raw stdout/stderr lines stream into the live-log panel and the active
  bubble's expandable detail (per-node accumulation, click any bubble to
  read its slice).
- On `event: done` with status=ready, surface "Open Console →" CTA
  pointing at result.consoleURL from the snapshot.
- Empty-state path renders a clean "no active deployment" view when the
  page is hit without a wizard session or deploymentId in localStorage.

Build-catalog change:
- scripts/build-catalog.mjs now also emits public/catalog.js setting
  window.CATALYST_CATALOG = { components, bootstrapKit }. bootstrapKit is
  read from clusters/_template/bootstrap-kit/ (numbered prefix → install
  order). Same scan as the typed catalog.generated.ts so both surfaces
  stay in lock-step.

Per-component states beyond flux-bootstrap are not yet emitted by
catalyst-api; nodeForPhase() already routes phase=bp-<slug> events onto
the matching bubble so wiring the Flux Kustomization watcher on the
backend lights up the rest with no further page work.
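
As a rough illustration of the phase routing described above, the sketch below maps a phase name to the bubble it drives. It is not the page's actual `nodeForPhase()`; the function name `routePhase`, the node ids, and the `null` return for unknown phases are assumptions.

```typescript
// Hypothetical node ids for the two supernodes.
const HETZNER_INFRA = "hetzner-infra";
const FLUX_BOOTSTRAP = "flux-bootstrap";

// Phases that, per the commit message, drive the Hetzner-infra supernode.
const TOFU_PHASES: readonly string[] = [
  "tofu-init",
  "tofu-plan",
  "tofu",
  "tofu-output",
];

// Returns the id of the bubble a phase event should update, or null if unknown.
function routePhase(phase: string): string | null {
  if (TOFU_PHASES.indexOf(phase) !== -1) return HETZNER_INFRA;
  if (phase === "flux-bootstrap") return FLUX_BOOTSTRAP;
  // phase=bp-<slug> targets the Blueprint bubble named <slug>.
  if (phase.startsWith("bp-") && phase.length > 3) return phase.slice(3);
  return null;
}

const infraNode = routePhase("tofu-plan");
const blueprintNode = routePhase("bp-cilium");
const unknownNode = routePhase("something-else");
```

Under these assumptions, once the backend starts emitting `bp-<slug>` phases, the existing routing already resolves them to a bubble id.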

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 10:30:14 +02:00
hatiyildiz
5ab70abad9 feat(wizard): reorder steps (domain after components), revamp review
Order changes:
- StepOrg is now scoped to name, industry, size, HQ, and compliance only
  (email + domain inputs removed)
- New step sequence: Org -> Provider -> Credentials -> Topology ->
  Components -> Domain -> Review
- StepDomain captures the admin contact email alongside the Sovereign
  FQDN; the email pairs naturally with the deployment's external
  surface (Let's Encrypt registration, completion notifications)
- WIZARD_STEPS labels updated to match the new flow

StepReview revamp — single source of truth for the POST body:
- Sections in order: Organisation, Cloud Provider, SSH Access,
  Topology (incl. workerSize / workerCount with defensive null-guard),
  Components, Domain (admin email lives here)
- Hetzner token + registrar token rendered as fixed-length mask plus
  character count, never plaintext (INVIOLABLE-PRINCIPLES.md #10)
- SSH source row distinguishes auto-generated vs. pasted; fingerprint
  truncated for readability
- Domain section explicitly shows the resolved FQDN and the chosen mode
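
The token rendering in the list above (fixed-length mask plus character count) might look like the following sketch. The mask width, the bullet glyph, and the `maskSecret` name are assumptions; only the behaviour of never showing plaintext comes from the commit.

```typescript
// Assumed mask width; fixed so the mask itself does not vary with the secret.
const SECRET_MASK = "•".repeat(12);

// Renders a secret as a fixed-length mask followed by its character count.
function maskSecret(secret: string): string {
  return `${SECRET_MASK} (${secret.length} chars)`;
}

const maskedToken = maskSecret("example-token-value");
```

The review page would then display `maskedToken` in place of the stored credential.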

Gates: tsc --noEmit clean, vite build green, vitest 146/146 pass,
dev server boots cleanly on /sovereign/.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 10:27:34 +02:00
hatiyildiz
0b6bb3eaea fix(wizard): audit and correct component depends + family cascade
Operator-reported defect: "if I select spectoer it is brining the entire
fabric family as well, I dont think there is such depenency in reality".
The user is right — the cascade was over-broad.

Root cause: PRODUCTS['cortex'].familyDependencies = ['fabric']. With
CORTEX.cascadeOnMemberSelection = true, selecting any CORTEX member (or
Specter, whose component-level deps include CORTEX members) walked the
family graph and pulled every FABRIC à-la-carte member — Strimzi/Kafka,
Debezium/CDC, Flink, Temporal, ClickHouse, Iceberg, Superset — onto the
selection. Specter and the rest of CORTEX have no runtime dependency on
any of those workloads. The only real cross-family need is cnpg (for
LangFuse) and a Mongo-compatible store (for LibreChat). cnpg is already
mandatory via transitive promotion; the Mongo backend is satisfied by
FerretDB, which is now reached via the corrected component-level dep
(librechat → ferretdb → cnpg).

Changes (one line per change):
- componentGroups.ts: PRODUCTS.cortex.familyDependencies: ['fabric'] → []
- componentGroups.ts: librechat.dependencies: ['cnpg'] → ['ferretdb']
  (LibreChat speaks MongoDB, not PG; FerretDB cascades cnpg transitively)
- componentGroups.ts: grafana.dependencies: ['seaweedfs'] → []
  (Grafana the dashboard server uses SQLite/PG; only its companion stores
  Loki/Mimir/Tempo need object storage)
- StepComponents.test.tsx: regression test "selecting Specter does NOT
  auto-select the FABRIC family" + companion tests asserting CORTEX
  familyDependencies is empty, librechat → ferretdb, grafana has no deps,
  and addProduct(cortex) does not drag à-la-carte FABRIC members.
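
The corrected component-level edges can be illustrated with a small transitive-closure sketch. The `DependencyMap` type and the `expandDependencies` name are hypothetical; the edges in the sample map are the ones stated in this message.

```typescript
type DependencyMap = Record<string, readonly string[]>;

// Expands a selection to include every transitive dependency.
function expandDependencies(
  selected: readonly string[],
  deps: DependencyMap,
): Set<string> {
  const resolved = new Set<string>();
  const pending: string[] = selected.slice();
  while (pending.length > 0) {
    const id = pending.pop() as string;
    if (resolved.has(id)) continue; // already visited; also guards against cycles
    resolved.add(id);
    for (const dep of deps[id] ?? []) pending.push(dep);
  }
  return resolved;
}

// Edges after this commit: librechat → ferretdb → cnpg; grafana has none.
const sampleDeps: DependencyMap = {
  librechat: ["ferretdb"],
  ferretdb: ["cnpg"],
  grafana: [],
  cnpg: [],
};
const librechatClosure = expandDependencies(["librechat"], sampleDeps);
const grafanaClosure = expandDependencies(["grafana"], sampleDeps);
```

Selecting `librechat` resolves to `librechat`, `ferretdb`, and `cnpg`, while `grafana` resolves to itself only.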

Verification:
- npm run test (vitest run): 150/150 pass on the worktree
- npx tsc --noEmit: clean
- npm run build: clean
- Live wizard probe (vite dev): addComponent('specter') yields 34 ids,
  zero of {strimzi, debezium, flink, temporal, clickhouse, iceberg,
  superset} present, full CORTEX family present, librechat → ferretdb
  cascade fires.

The CORTEX cascadeOnMemberSelection flag remains true (per issue #175
operator intent: "BGE alone doesn't have much meaning unless we have
Cortex"). FABRIC stays cascadeOnMemberSelection: false (à-la-carte). The
wizard now mirrors real-world component coupling: Specter brings only
the CORTEX runtime members it actually needs.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 10:27:11 +02:00
hatiyildiz
b96a03a585 feat(wizard): worker SKU + count selector in topology step
Closes the wizard polish gap "Selecting the shapes of the worker
nodes should be there." StepInfrastructure had a worker SKU + count
selector but was never wired into WizardPage.STEPS — the user walks
through StepTopology, where no sizing controls existed.

Adds a NodeSizingPanel inside StepTopology that:
  • Renders control-plane and worker SKU cards from
    HETZNER_NODE_SIZES (single source of truth — no SKU duplication).
  • Exposes a worker-count stepper and editable spinbutton, clamped
    to the topology-aware floor (0 for solo, 3 for multi-region) and
    a ceiling of 6 to stay inside Hetzner's default project quota.
  • Shows the worker SKU grid only when count > 0.
  • Surfaces a hard validation error when count > 0 but workerSize
    is unset; gates the Topology step's Continue button on the same.

Updates the store's setTopology to seed the worker-count default at
topology-pick time (solo → 0, multi-region → max(current, 3)) so
users land on a sensible default and the existing partialize() rules
keep persisting controlPlaneSize / workerSize / workerCount across
sessions unchanged.
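
A minimal sketch of the clamping and seeding rules described above. The `Topology` union and the function names are assumptions; the floors (0 and 3), the ceiling (6), and the seeding rule come from this message.

```typescript
// Hypothetical topology identifiers.
type Topology = "solo" | "multi-region";

const WORKER_CEILING = 6; // stays inside the default project quota

function workerFloor(topology: Topology): number {
  return topology === "solo" ? 0 : 3;
}

// Clamps a requested worker count to [floor, ceiling] for the topology.
function clampWorkerCount(topology: Topology, requested: number): number {
  const floor = workerFloor(topology);
  const whole = isFinite(requested) ? Math.trunc(requested) : floor;
  return Math.min(WORKER_CEILING, Math.max(floor, whole));
}

// Default applied when the topology is picked: solo → 0, multi-region → max(current, 3).
function seedWorkerCount(topology: Topology, current: number): number {
  return topology === "solo" ? 0 : Math.max(current, 3);
}

const clampedLow = clampWorkerCount("multi-region", 1);
const clampedHigh = clampWorkerCount("solo", 9);
const seededMulti = seedWorkerCount("multi-region", 5);
```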

StepReview now renders three chips inside the Infrastructure section
(control plane, workers, compute-total cost rollup) so the SKU + count
choice is visible at launch time, alongside the per-region cards
that were already there.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 10:25:48 +02:00
hatiyildiz
27527e4ca5 fix(catalyst-api): pin TOFU_WORKDIR to writable /tmp + raise cpu/mem caps
Launch failed instantly with "create workdir: mkdir /var/lib/catalyst:
permission denied". The catalyst-api Pod runs as UID 65534 with emptyDir
mounts only at /tmp and /home/nonroot — /var/lib was never writable, so
the provisioner.New() default for CATALYST_TOFU_WORKDIR
(/var/lib/catalyst/tofu) failed on the very first MkdirAll call.

Three coupled fixes:

- Set CATALYST_TOFU_WORKDIR=/tmp/catalyst/tofu so the per-deployment
  workdir tree lands in the existing /tmp emptyDir.
- Bump cpu limit 100m → 1000m, memory limit 64Mi → 1Gi. tofu init pulls
  ~80MB hcloud + ~30MB dynadot provider plugins; tofu plan/apply hold
  the state file in memory; 64Mi was always going to OOM on first init.
- Grow /tmp emptyDir sizeLimit 256Mi → 2Gi to fit the per-Sovereign
  subdirectory tree (provider binaries + state + plan output).

Manifest-only change — Flux reconciles, kubectl rollout swaps the Pod,
no image rebuild required.
2026-04-29 10:12:44 +02:00