* feat(catalyst-ui): unified CrudModals scaffolding — FormFields per kind, shared modal frame
ADR-0001 §9.2 row B3 mandates a single seam pattern for every Cloud
resource Update — Crossplane XRC for cloud kinds, dynamic-client CR
write for K8s-native kinds. Issue #349 (Phase A.2 of #347) requires
full Add/Edit/Delete on twelve resource types.
This commit lands the scaffolding layer:
- CrudFormModal — generic Add/Edit shell that wraps ModalShell with
submit/error plumbing so per-kind modals stay thin.
- DeleteConfirmShell — generic delete confirm for the standalone-
resource path (PVC, Volume, Bucket, WorkerNode, Network, LB).
Cascade-aware deletes (Region/Cluster/vCluster) keep the existing
DeleteCascadeConfirm.
- SelectInput atom — shared select control matching TextInput style.
- formFields/ — typed FormFields component per kind (Region, Cluster,
vCluster, NodePool, WorkerNode, LoadBalancer, Network, PVC, Bucket,
Volume) so Add and Edit cannot drift.
- infrastructure-crud.ts — typed update*/add* wrappers for every kind
the catalyst-api will support: updateRegion, updateCluster,
updateVCluster, updateNodePool, addWorkerNode, updateWorkerNode,
updateLB, addNetwork, updateNetwork, addPVC, updatePVC, addBucket,
updateBucket, addVolume, updateVolume. DeletableResource union
picks up 'networks'.
No behaviour change yet — wired into modals + UI in subsequent
commits.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* feat(catalyst-ui): cloud-compute CRUD modals — Cluster/vCluster/NodePool/WorkerNode (Add+Edit+Delete)
Per issue #349 every Compute resource gets full CRUD breadth.
New modals:
- EditRegionModal — patch SKU + worker count on existing region
- EditClusterModal — rename + version upgrade + CP resize
- EditVClusterModal — rename + change isolation mode (DMZ/RTZ/MGMT)
- EditNodePoolModal — combined SKU + replicas patch (consolidates
legacy ScalePoolModal + ChangeSKUModal pair)
- AddWorkerNodeModal — single-node provision into a cluster
- EditWorkerNodeModal — resize machine type + edit taints/labels
- SimpleDeleteConfirm — non-cascade delete used by every resource
whose removal doesn't propagate to children
ADR-0001 §9.2 row B3 compliance: every cloud-resource Update writes
through Crossplane XRC; vCluster Update writes the K8s-native CR via
dynamic client (Crossplane stays out of K8s-to-K8s).
Existing AddRegionModal / AddClusterModal / AddVClusterModal /
AddNodePoolModal stay; ScalePoolModal + ChangeSKUModal stay (still
referenced by some CRUD demos) but are superseded by EditNodePool for
operator-facing flows.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* feat(catalyst-ui): cloud-network CRUD modals — LoadBalancer/Network (Add+Edit+Delete)
Per issue #349 every Network resource gets full CRUD breadth.
New modals:
- EditLBModal — rename + listener-set rewrite
- AddNetworkModal — VPC/DRG provision with region selector
- EditNetworkModal — rename only (CIDR is immutable post-create)
AddLBModal now accepts an optional regionIdChoices prop so the
list-page entry point can render a region selector while the
context-menu entry point keeps the pre-selected region from the
clicked node.
Backend seam (ADR-0001 §9.2 row B3): every Update writes a Crossplane
XRC; catalyst-api never calls cloud APIs directly.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* feat(catalyst-ui): cloud-storage CRUD modals — PVC/Bucket/Volume (Add+Edit+Delete)
Per issue #349 every Storage resource gets full CRUD breadth.
New modals:
- AddPVCModal — name + namespace + capacity + storage class
- EditPVCModal — expand-only (Kubernetes PVCs forbid shrink/rename)
- AddBucketModal — name + capacity quota + retention
- EditBucketModal — patch capacity + retention (name immutable)
- AddVolumeModal — region + name + capacity + initial attach target
- EditVolumeModal — resize + attach/detach
Backend seam (ADR-0001 §9.2 row B3):
- PVC writes go through dynamic-client patch on
core/v1/persistentvolumeclaims (K8s-native CR, NOT Crossplane).
- Bucket + Volume writes go through Crossplane XRC (cloud objects).
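The expand-only rule that EditPVCModal enforces can be sketched as a small client-side guard. This is an illustration only, not the modal's code: `parseQuantity` and `validatePvcResize` are hypothetical names, and handling only the binary suffixes Ki/Mi/Gi/Ti is an assumption about the capacity strings the form accepts.

```typescript
// Hypothetical lookup table: binary-suffix multipliers for capacity strings.
const UNITS: Array<[string, number]> = [
  ["Ki", 1024],
  ["Mi", 1024 * 1024],
  ["Gi", 1024 * 1024 * 1024],
  ["Ti", 1024 * 1024 * 1024 * 1024],
];

// Parse strings such as "10Gi" into bytes; returns null when unparseable.
function parseQuantity(quantity: string): number | null {
  const m = /^(\d+(?:\.\d+)?)(Ki|Mi|Gi|Ti)$/.exec(quantity.trim());
  if (m === null) return null;
  for (const [suffix, factor] of UNITS) {
    if (suffix === m[2]) return Number(m[1]) * factor;
  }
  return null;
}

// Returns an error message, or null when the requested size is acceptable.
// Shrinking is rejected; an unchanged size is allowed as a no-op.
function validatePvcResize(current: string, requested: string): string | null {
  const cur = parseQuantity(current);
  const req = parseQuantity(requested);
  if (cur === null || req === null) return "capacity must look like 10Gi";
  if (req < cur) return "PVC capacity can only grow";
  return null;
}
```

A guard like this only improves form feedback; the API server remains the authority on whether a resize is accepted.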
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* feat(catalyst-ui): graph context-menu wiring — kind-aware add/edit/delete
Per issue #349 every node on the Architecture force-graph carries its
own kind-aware add/edit/delete affordances both via right-click context
menu and the slide-in DetailPanel.
Context menu now surfaces:
- Cloud: + Add region
- Region: + Add cluster / + Add load balancer / + Add network /
+ Add volume
- Cluster: + Add vCluster / + Add node pool / + Add worker node /
+ Add PVC
- vCluster: Edit / Delete
- NodePool / WorkerNode / LoadBalancer / Network: Edit / Delete
- Empty canvas: + Add region / PVC / bucket / volume
DetailPanel now exposes Edit + Delete for every kind with a backing
spec. Region/Cluster/vCluster keep the cascade-aware delete path;
NodePool/WorkerNode/LoadBalancer/Network use the new SimpleDeleteConfirm.
The new lookupSpecForGraphNode() helper resolves the typed Spec for a
given GraphNode id so the Edit modal pre-fills from the live topology.
ADR-0001 §9.2 row B3 compliance — every Update writes through the
existing infrastructure-crud wrappers; no direct cloud-API call.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* feat(catalyst-ui): list-page row action menu + drawer Edit/Delete buttons
Per issue #349 every per-resource list page surfaces full CRUD:
- Header: + New CTA → opens kind's Add modal (Cluster, vCluster,
NodePool, WorkerNode, LoadBalancer, PVC, Bucket, Volume).
- Each row: ⋯ kebab in rightmost cell → Edit / Delete. Click-row still
opens the existing detail drawer.
- Detail drawer: Edit + Delete buttons at the top — same modals.
Cluster + vCluster Delete go through the cascade-aware confirm.
NodePool / WorkerNode / LoadBalancer / PVC / Bucket / Volume use the
SimpleDeleteConfirm from the previous commits.
The shared cloudListShared module gains:
- RowActionsMenu — kebab menu with click-outside / Esc dismiss
- DetailDrawerActions — Edit + Delete bar at top of drawer
- CloudListHeader.onNew + newLabel — per-page + New button
Plus matching CSS in cloudListCss.ts.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* feat(catalyst-api): PATCH endpoints — XRC patch for cloud kinds, dynamic client for K8s kinds
Per ADR-0001 §9.2 row B3 every Cloud-resource Update must route through
a Crossplane XRC patch (cloud kinds) or a dynamic-client CR write
(K8s-native kinds). Issue #349 brings the catalyst-api up to full
breadth on every resource type listed there.
New endpoints:
PATCH /infrastructure/regions/{id}
PATCH /infrastructure/clusters/{id}
PATCH /infrastructure/vclusters/{id}
PATCH /infrastructure/loadbalancers/{id}
POST /infrastructure/networks
PATCH /infrastructure/networks/{id}
POST /infrastructure/clusters/{id}/nodes (WorkerNode add)
PATCH /infrastructure/nodes/{id} (WorkerNode patch)
POST /infrastructure/pvcs
PATCH /infrastructure/pvcs/{id} (Kubernetes expand-only)
POST /infrastructure/buckets
PATCH /infrastructure/buckets/{id}
POST /infrastructure/volumes
PATCH /infrastructure/volumes/{id}
DELETE handler's xrcKindForResourceKind switch picks up the new URL
segments (networks/buckets/volumes/pvcs) so cascade-delete works for
every kind.
New XRC kind constants in internal/infrastructure/xrc.go:
KindWorkerNodeClaim, KindNetworkClaim, KindBucketClaim,
KindVolumeClaim. PVCClaim stays as a string literal pending its
own constant once the third-sibling chart authors the XRD.
Test coverage: infrastructure_crud_breadth_test.go covers happy-path
+ NoFields validation on every new endpoint, plus DELETE on each new
kind. All handler tests pass (24s wall time).
ADR-0001 compliance:
- Cloud-resource Updates → Crossplane XRC patch via submitMutation
with Patch:true (existing pattern from PatchInfrastructurePool).
- vCluster + PVC Updates → same pipe, but the Composition owned by
the third-sibling chart is responsible for the direct CR write on
the Sovereign cluster (Crossplane stays out of K8s-to-K8s
composition; the claim is an audit/intent record).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* test(catalyst): Playwright CRUD coverage + screenshots
New e2e/cloud-crud.spec.ts covers the full breadth of #349:
- Every list page surfaces a + New CTA in the header
- Every row has a kebab ⋯ menu with Edit + Delete
- Click-row → drawer; drawer header carries Edit + Delete
- Architecture force-graph context menu has Edit + Delete on every
kind, and add-network/add-volume/add-worker-node/add-pvc on the
appropriate parent kinds
- PVC Edit modal renders name/namespace/storageClass as read-only
and lets only capacity be modified (Kubernetes expand-only)
- 1440×900 screenshots: Cluster Edit modal, PVC Add modal,
row-actions menu, Volume Delete confirm
Existing cloud-list-pages.spec.ts and cloud-architecture.spec.ts gain
focused additions for the same surfaces (CTA + row kebab + Edit
context-menu item).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: hatiyildiz <hati@openova.io>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Three live-verification bugs from console.openova.io:
1. **LogPane X / Esc never actually dismissed the pane.** `onClose`
was wired to `setSelectedJobId(jobId)` (restore host) but the pane
itself stayed mounted because `<CanvasLogBridge>` rendered
unconditionally. Add `paneOpen` state to JobDetail; X / Esc set
it false and the canvas reclaims the reserved 30vw of right-edge
padding (smooth 220ms transition). A small floating "Logs"
re-open chip appears top-right of the canvas while the pane is
closed — clicking any bubble also re-opens it (keeps the
discoverability story honest).
2. **Host job indistinguishable when also currently selected.** The
page's home job is amber-ringed AND host-ringed simultaneously
on first paint, but the ring-priority logic drew only the amber
ring, so the operator couldn't tell which bubble was the page
anchor until they clicked something else. Fix: render the teal
host marker as a separate OUTER halo (radius+6, stroke 3.5,
opacity 0.95) that survives the inner amber selection ring.
Glow underlay also re-prioritised so host > selection. Result:
the home job always reads as "home" regardless of what's
currently clicked. Tooltip also adds " · home" when isHost.
3. **No full-screen toggle for the canvas itself.** Item 8 of the
#351 spec called for "independent full-screen toggles for the
canvas and the log pane" — only the log-pane half was wired.
Add a fullscreen button (icon-button mirroring the log pane's,
top-right of the canvas surface) that overlays the canvas at
100vw/100vh / z-index 90 (above the docked LogPane so the
operator gets a true full-viewport canvas without the pane
covering 30%). Esc exits — the FlowPage attaches its own
keydown listener while in canvas-fullscreen mode.
Refs #351
Co-authored-by: hatiyildiz <hatice@openova.io>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* docs(adr): 0001 — Catalyst control-plane architecture
Captures the unified Catalyst architecture agreed in the architecture-review
session (#347 thread).
Eleven foundational rules including:
- GitOps + Flux as the only reconciler
- Crossplane = cloud APIs ONLY (no K8s-to-K8s composition)
- K8s itself is the database; in-process informer cache; no shadow store
- Event-driven via watch streams; SSE to UI; no polling
- Tenant = namespace + vCluster + Keycloak group (no SQL tenant table)
- Catalyst messaging = NATS JetStream (not Redpanda, not Kafka)
- Five backing stores: CNPG / FerretDB / Valkey / NATS / SeaweedFS
- Multi-region = N independent Sovereigns + data-layer replication
- Browser access via Guacamole
Records what stays unchanged, what's being reworked (UserAccess/CRUD/Bastion
briefs), and what new tickets need to be filed (SME consolidation epic,
Redpanda→NATS, multi-region tier scaffolding).
Status: Proposed — pending founder approval.
Related: #309, #320, #321, #322, #324, #325, #326, #347, #68
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* docs(adr): 0001 — add §9.4 demo-protection clause
Adds a hard rule preceding the cutover sequencing: the entire sme/
namespace runs untouched until founder explicitly authorises cutover.
Records the URL-to-backend split:
- console.openova.io/sovereign/* → catalyst-ui (NEW Catalyst-Zero)
- console.openova.io/nova/* → sme/console (LEGACY, demo)
- marketplace.openova.io → sme/marketplace (LEGACY, demo)
- admin.openova.io → sme/admin (LEGACY, demo)
The B6–B11 retirements are target-state, not immediate-action. C2 epic
sequences cutover with feature flags. Founder confirmed: "let the old
one keep working independently until we reach to perfect state, we'll
revamp it as well next week."
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: hatiyildiz <hati@openova.io>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The FlowPage owned `openJobId` as internal state and never emitted
changes upward, so JobDetail's `selectedJobId` stayed pinned to the
URL's `jobId` and the LogPane title never updated when the operator
single-clicked another bubble. Verified live on console.openova.io
(the canvas data attributes flipped correctly — `host=true` on the
URL job, `open=true` on the clicked job — but the LogPane header
still rendered the host's title).
Fix: add an `onOpenJobChange` callback prop to FlowPage; wrap the
internal state setter so every mutation fires the callback, and
have the host-sync effect call it on first paint. JobDetail wires it
into `setSelectedJobId`. Empty / null restores the host as the
selection so the LogPane never goes contextless after a background
click.
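The callback flow described above can be sketched without React. This is a rough model under stated assumptions: `createOpenJobState` and `resolveSelection` are hypothetical names standing in for the wrapped setter inside FlowPage and the JobDetail wiring, and the real components use React state rather than a closure.

```typescript
// Hypothetical stand-in for FlowPage's wrapped state setter.
function createOpenJobState(
  hostJobId: string,
  onOpenJobChange: (id: string | null) => void,
): { get: () => string | null; set: (next: string | null) => void } {
  let openJobId: string | null = null;
  const set = (next: string | null): void => {
    openJobId = next;
    onOpenJobChange(next); // every mutation is reported upward
  };
  set(hostJobId); // mirrors the host-sync effect on first paint
  return { get: (): string | null => openJobId, set };
}

// Parent-side wiring: empty / null falls back to the host job so the
// log pane always has a job to show.
function resolveSelection(next: string | null, hostJobId: string): string {
  return next === null || next === "" ? hostJobId : next;
}
```

In this model the parent passes a callback that applies `resolveSelection` before storing the result, so a background click never leaves the selection empty.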
Refs #351
Co-authored-by: hatiyildiz <hatice@openova.io>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* refactor(catalyst-api): recursive Job model — replace BatchID with ParentID (#351)
Collapse the parallel "batch" concept into a recursive Job tree:
- Job.BatchID → Job.ParentID
- Add Job.Type ("install" | "group"), Job.DisplayName, Job.ChildIDs
- Add lazy parent-group synthesis (bootstrap-kit + day-2-mutations are
now real on-disk Job rows materialised on first child write via
Bridge.ensureGroupJob; idempotent through UpsertJob's merge)
- Add Store.deriveTreeView: at read time, populate ChildIDs and roll up
Status / StartedAt / FinishedAt / DurationMs on group Jobs from their
descendants (failed > running > pending > succeeded)
- Drop BatchSummary type, Store.SummarizeBatches, Handler.ListBatches,
the GET /api/v1/deployments/{id}/jobs/batches route, and the
BatchBootstrapKit / BatchDay2Mutations consts (replaced by
GroupBootstrapKit + GroupDay2Mutations slugs)
Tests rewritten:
- store_test.go: new TestStore_DeriveTreeView_RollsUpGroupStatus and
TestStore_DeriveTreeView_AllSucceededRollsUp covering the rollup
- helmwatch_bridge_test.go: leafJobs / leafByName helpers; counts
updated for the synthesised parent-group row
- jobs_test.go: TestHandler_ListJobs_Populated asserts on parentId +
rolled-up group status
- TestHandler_ListBatches removed
Wire shape change: every Job now carries `parentId` (string),
`type` ("install" | "group"), `childIds` (string[]), and group jobs
optionally carry `displayName` ("Bootstrap" / "Day-2 Mutations"). The
UI is updated in a follow-up commit.
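For readers consuming the new wire shape, the read-time rollup can be sketched in TypeScript. This is a simplified model, not a port of Store.deriveTreeView: it rolls up Status only (not StartedAt / FinishedAt / DurationMs), assumes the tree is acyclic, and treats an empty `parentId` as "no parent" and an empty group as "pending", both of which are assumptions.

```typescript
type JobStatus = "failed" | "running" | "pending" | "succeeded";

interface Job {
  id: string;
  parentId: string; // assumed: "" for jobs without a parent
  type: "install" | "group";
  childIds: string[];
  displayName?: string;
  status: JobStatus;
}

// Precedence from the commit message: failed > running > pending > succeeded.
const PRECEDENCE: JobStatus[] = ["failed", "running", "pending", "succeeded"];

function rollUpStatus(statuses: JobStatus[]): JobStatus {
  for (const s of PRECEDENCE) {
    if (statuses.indexOf(s) !== -1) return s;
  }
  return "pending"; // assumption: a group with no children reads as pending
}

function deriveTreeView(jobs: Job[]): Job[] {
  // Rebuild childIds from parentId so the stored value is never trusted.
  const view: Job[] = jobs.map((j) => ({ ...j, childIds: [] as string[] }));
  const find = (id: string): Job | undefined => view.filter((j) => j.id === id)[0];
  for (const j of view) {
    const parentJob = find(j.parentId);
    if (parentJob !== undefined) parentJob.childIds.push(j.id);
  }
  // Group status is derived recursively from descendants; leaves keep their own.
  const resolveStatus = (j: Job): JobStatus => {
    if (j.type !== "group") return j.status;
    const kids: JobStatus[] = [];
    for (const id of j.childIds) {
      const child = find(id);
      if (child !== undefined) kids.push(resolveStatus(child));
    }
    return rollUpStatus(kids);
  };
  return view.map((j) => ({ ...j, status: resolveStatus(j) }));
}
```

With one succeeded and one running child, the parent group reads as running; it reads as succeeded only once every descendant has succeeded.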
Refs #351
Supersedes #222
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* refactor(catalyst-ui): JobDetail + canvas redesign on the recursive Job model (#351)
Full-bleed canvas, no tabs, floating LogPane, host vs selection rings,
fold-aware recursive layout. Replaces the legacy "batch" UI concept
end-to-end — UI is now isomorphic to the recursive Job tree the
backend emits.
Behavioural changes (10 spec items):
1. 2-line compact header with persistent top-right status chip.
2. Tabs removed; canvas occupies the full viewport beneath the
header.
3. Floating ~30vw exec-log pane (LogPane) with slide-in animation
and full-screen toggle.
4. JobDetail opens with the host job auto-selected, neighbours lit,
log pane already showing the host's logs.
5. Host job ring is teal #14B8A6, distinct from the amber
selection ring (#FBBF24).
6. Single-clicking another job swaps the LogPane content;
the host's teal ring stays.
7. Double-click on a leaf navigates to its own home; double-click
on a parent group toggles its fold state inline.
8. Independent full-screen toggles for the canvas (existing
scroll-zoom) and the log pane (new icon button + Esc).
9. Built-in LogSearch — query input, regex toggle, level filter
chips (INFO/WARN/ERROR/DEBUG), match count, n/N navigation.
10. Recursive Job model end-to-end:
- jobs.types: Job.batchId removed; Job.parentId, Job.type,
Job.displayName, Job.childIds added; Batch interface dropped.
- jobsAdapter: emits parent group jobs (phase-0-infra,
cluster-bootstrap, applications) with rolled-up status/timing.
- flowLayoutOrganic: rewritten as a fold-aware recursive layout;
folded groups render as a single node with a child-count badge.
- FoldControls: Collapse all · Expand all · Depth: 1|2|3|all
toolbar replaces the legacy jobs/batches mode toggle.
- URL state: ?folded=id1,id2 · ?depth=1|2|3|all (default 2).
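The fold/depth URL state from item 10 can be sketched as a parse/serialize pair. The function names are hypothetical, the inputs are the already-decoded values of the `folded` and `depth` query parameters (null when absent), and omitting `depth` from the URL when it equals the default is an assumption rather than documented behaviour.

```typescript
type Depth = 1 | 2 | 3 | "all";

interface FoldState {
  folded: string[];
  depth: Depth;
}

// Invalid or missing depth values fall back to the default of 2.
function parseFoldState(folded: string | null, depth: string | null): FoldState {
  const ids = (folded === null ? "" : folded)
    .split(",")
    .map((s) => s.trim())
    .filter((s) => s.length > 0);
  let parsedDepth: Depth = 2;
  if (depth === "all") parsedDepth = "all";
  else if (depth === "1") parsedDepth = 1;
  else if (depth === "3") parsedDepth = 3;
  return { folded: ids, depth: parsedDepth };
}

// Builds the query string, e.g. "?folded=id1,id2&depth=3".
function serializeFoldState(state: FoldState): string {
  const parts: string[] = [];
  if (state.folded.length > 0) {
    parts.push("folded=" + state.folded.map((id) => encodeURIComponent(id)).join(","));
  }
  if (state.depth !== 2) parts.push("depth=" + String(state.depth));
  return parts.length > 0 ? "?" + parts.join("&") : "";
}
```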
Deleted modules (zero legacy paths remain):
- BatchProgress.tsx + .test.tsx
- BatchDetail.tsx + .test.tsx
- BatchSummaryPane.tsx
- FloatingLogPane.tsx + .test.tsx (replaced by LogPane.tsx)
- flowLayoutV4.ts + .test.ts (FlowFamily + DEFAULT_FAMILIES
relocated to flowFamilyPalette.ts; layout function dead)
- pipelineLayout.ts + .test.ts (dead — only its own test imported it)
- FlowCanvasV4.tsx, FlowDeploymentTree.tsx,
flowDeploymentTreeData.ts (dead canvas/tree)
- /provision/$deploymentId/batches/$batchId route from router.tsx
New modules:
- components/LogPane.tsx — floating slide-in pane, full-screen, Esc
- components/LogSearch.tsx — query / regex / level pills / n-of-m
- lib/flowFamilyPalette.ts — relocated palette
- pages/sovereign/FoldControls.tsx — fold/depth toolbar
Modified modules:
- components/ExecutionLogs.tsx — accepts filter / matchIndex /
onMatchCountChange so LogPane can drive search-match navigation
without re-rendering line lists.
- components/StatusStrip.tsx — drops the modeToggle prop; trailing
slot now hosts FoldControls.
- pages/sovereign/FlowCanvasOrganic.tsx — host (teal) and selection
(amber) ring priorities, dashed parent-child edges, child-count
badge on folded groups.
- pages/sovereign/FlowPage.tsx — fold/depth state in URL, drops
?view=batches and ?scope=batch:, accepts hostJobId, group double-
click toggles fold in place.
- pages/sovereign/JobDetail.tsx — full-bleed shell, no tabs, hosts
LogPane.
- pages/sovereign/JobsTable.tsx — Parent column replaces Batch
column; parent chip links to the parent group's home.
- pages/sovereign/JobsPage.tsx — copy + scope rewording.
- pages/sovereign/jobsAdapter.ts — emits group jobs.
- lib/infrastructure-crud.ts — JobRef.batchId → JobRef.parentId.
- test/fixtures/jobs.fixture.ts — recursive shape; FIXTURE_BATCHES /
deriveBatches dropped.
Tests: every batch-shaped fixture replaced with parentId/type/childIds;
FlowPage tests rewritten for fold/depth helpers + canvas rendering;
JobsPage parent-chip link assertion updated.
`tsc --noEmit` clean. `rg -i 'batch'` over touched paths returns only
intentional migration comments (5 lines, all explanatory).
Refs #351
Supersedes #222
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: hatiyildiz <hatice@openova.io>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
PR #346 wired the WipeDeploymentModal as the Cloud-type onDelete branch
in ArchitectureGraphPage but the InfrastructureDetailPanel's `deletable`
gate only allowed ['Region', 'Cluster', 'vCluster'] — so the action
button never rendered on the Cloud root. Verified live at
console.openova.io/sovereign/provision/ce476aaf80731a46/cloud/architecture
post-deploy: Cloud-node panel showed only "+ Add region" with no
destructive affordance.
Fix:
- Add 'Cloud' to the deletable kinds.
- Render label "Cancel & Wipe deployment" for Cloud (vs "Delete <type>"
for Region/Cluster/vCluster) — different semantics, different copy.
- Distinct testid `infrastructure-detail-panel-action-wipe-deployment`
for Cloud so Playwright tests can target the wipe path explicitly.
The onDelete branch in the parent (ArchitectureGraphPage) was already
correct from #346 — Cloud → wipe-deployment, others → delete (Crossplane
XRC). This commit just makes the button visible.
Co-authored-by: Hatice Yildiz <hatice.yildiz@openova.io>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* feat(wipe): deployment-level Cancel & Wipe - backend endpoint + Cloud-Architecture + wizard banner entry-points (closes #318)
Adds a first-class Phase-0 recovery surface so an operator can purge a
failed pre-handover deployment from the wizard UI without dropping to
hcloud CLI runbooks. Two entry-points, one canonical implementation.
## Backend
NEW: products/catalyst/bootstrap/api/internal/handler/wipe.go
POST /api/v1/deployments/{id}/wipe — single-flight destructive op:
1. tofu destroy against the per-deployment workdir (idempotent).
2. Hetzner orphan force-purge by label-selector
`catalyst-deployment-id=<id>` (servers, load balancers,
networks, firewalls, ssh-keys). Belt-and-braces — catches
resources tofu didn't track (half-failed cloud-init, manual
experiments). Per docs/INVIOLABLE-PRINCIPLES.md #3 this direct
API path is fallback ONLY for orphan cleanup, never new
resource creation.
3. PDM /v1/release for pool-subdomain Sovereigns (best-effort).
4. Local cleanup: kubeconfig file (mode 0600), tofu workdir,
on-disk deployment record JSON.
5. SSE events stream throughout on the same channel as the
original provisioning + Phase-1 watch.
6. Marks Status="wiped"; sync.Map entry reaped after a 60s TTL.
NEW: products/catalyst/bootstrap/api/internal/hetzner/purge.go
Hetzner Cloud API enumeration + force-delete by label selector.
Uses a 60s timeout (vs the 10s ValidateToken default) because async
server-delete jobs can queue. 404s treated as success (already gone).
NEW: products/catalyst/bootstrap/api/internal/provisioner/provisioner.go
Provisioner.Destroy() — runs `tofu destroy -auto-approve` against
the per-deployment workdir, then removes the workdir on success so
re-provisioning starts fresh. Re-stages module + tfvars first so a
partially-cleaned workdir still has what tofu needs.
TOUCHED: products/catalyst/bootstrap/api/cmd/api/main.go
Registers POST /api/v1/deployments/{id}/wipe.
## Frontend (aligned with existing CrudModals conventions per founder
## directive — no ad-hoc surface)
NEW: products/catalyst/bootstrap/ui/src/components/CrudModals/WipeDeploymentModal.tsx
Two-stage modal built on the canonical ModalShell. Pre-wipe confirm
view requires the operator to:
- Type the sovereign FQDN to confirm scope.
- Re-paste their Hetzner Cloud API token (catalyst-api intentionally
GCs the original after writeTfvars per credential hygiene).
Post-wipe success view shows the PurgeReport (servers, lbs, networks,
firewalls, ssh-keys removed; tofu/PDM/local-state ✓/✗) and a
"Start fresh deployment" CTA that navigates to /sovereign.
TOUCHED: products/catalyst/bootstrap/ui/src/components/CrudModals/index.ts
Re-exports WipeDeploymentModal + WipeReport.
TOUCHED: products/catalyst/bootstrap/ui/src/pages/sovereign/AppsPage.tsx
FailureCard now exposes a "Cancel & Wipe" red button next to
"Retry stream" / "Back to wizard" — opens WipeDeploymentModal.
TOUCHED: products/catalyst/bootstrap/ui/src/pages/sovereign/InfrastructureTopology.tsx
Cloud → Architecture canvas: the `cloud` (root) node action menu
gains "Cancel & Wipe deployment" as a `danger:true` action,
alongside the existing "+ Add region". Distinct from the
per-resource DeleteCascadeConfirm on region/cluster/vCluster — this
is deployment-scope (Phase-0 orphan purge), the others are
Crossplane-XRC scope (day-2). The two paths coexist; operators
choose by what state the deployment is in.
## Why two entry-points
Wizard banner (failed state on AppsPage) — recovery from a known
failure. Already a red-banner page; the button is right there.
Cloud → Architecture cloud-node action — proactive cancel from the
canvas, mirrors how the existing per-resource deletes are reachable.
Same modal, same backend.
## Constraints honoured
- Per docs/INVIOLABLE-PRINCIPLES.md #3 (Crossplane is the ONLY day-2
IaC): the per-resource DELETE handler at infrastructure.go is
unchanged and continues to flip XRC deletionPolicy. Wipe operates
ONLY in Phase-0 scope where Crossplane never adopted resources.
- Per #4 (never hardcode): every endpoint lives behind API_BASE; the
Hetzner purge enumerates by deterministic label selector built from
var.sovereign_fqdn (the OpenTofu module's existing tagging convention).
- Per credential hygiene: the Hetzner token is re-prompted at wipe time
rather than persisted; the modal uses an <input type="password">.
## Refs
#318 — pre-handover wipe spec (this PR closes it)
#317 — handover finalisation (sibling; this PR is the failure-path
complement)
feedback_idempotent_iac_purge.md — operator runbook this implements
PR #313 — sealed-secrets cleanup (independent; safe to land in any order)
PR #334 — bp-external-secrets split (independent)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(ci): catalyst-build event-driven only — drop cron, push-on-main with path filter
Per docs/INVIOLABLE-PRINCIPLES.md (event-driven end to end — Flux
dependsOn, NATS JetStream, SSE, Helm hooks), GitHub Actions must follow
the same model. The previous `schedule: cron 0 3 * * *` daily build was
the only canonical deploy path, which created a 24h roll latency on
every change to the catalyst surface and incentivised "wait for cron"
stalls in operator workflows.
Replaces with:
on:
push:
branches: [main]
paths:
- 'core/console/**'
- 'core/admin/**'
- 'core/marketplace/**'
- 'core/marketplace-api/**'
- 'products/catalyst/bootstrap/**'
- 'products/catalyst/chart/**'
- '.github/workflows/catalyst-build.yaml'
workflow_dispatch:
`workflow_dispatch` retained for ad-hoc re-runs (config-only changes
that bypass the path filter, e.g. a secret rotation that doesn't touch
code). Path filter mirrors the actual surface this workflow rebuilds.
After this lands, every merge to main that touches the catalyst surface
auto-deploys. No cron lag.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Hatice Yildiz <hatice.yildiz@openova.io>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two non-obvious platform behaviours that produced silent failures during the
JobDetail / Exec Log debugging chain:
- Flux v2.4 helm-controller emits HelmRelease as a nested JSON object
("HelmRelease":{"name":"bp-X","namespace":"flux-system"}), not the
flat-string format older docs assume. A regex written for the legacy
shape matches zero lines and silently drops every helm-controller
stdout entry.
- go-chi router does not decode %3A in path segments before route matching.
encodeURIComponent on a path parameter containing ':' yields a URL that
silently 404s, even though the literal-colon form works.
Both lessons include verified production samples + working regex/URL
patterns from internal/helmwatch/logtailer.go and useJobDetail.ts.
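Both behaviours can be illustrated with two small helpers. These are sketches, not the code in logtailer.go or useJobDetail.ts: the function names are hypothetical, the log line is assumed to be a single JSON object carrying the nested `HelmRelease` field quoted above, and the colon-bearing id in the usage note is an invented example.

```typescript
interface HelmReleaseRef {
  name: string;
  namespace: string;
}

// Returns the nested HelmRelease reference from one helm-controller log
// line, or null when the line is not JSON or lacks the nested object.
function extractHelmRelease(line: string): HelmReleaseRef | null {
  let parsed: unknown;
  try {
    parsed = JSON.parse(line);
  } catch {
    return null;
  }
  if (typeof parsed !== "object" || parsed === null) return null;
  const field = (parsed as Record<string, unknown>)["HelmRelease"];
  if (typeof field !== "object" || field === null) return null;
  const ref = field as Record<string, unknown>;
  const relName = ref["name"];
  const relNamespace = ref["namespace"];
  if (typeof relName !== "string" || typeof relNamespace !== "string") return null;
  return { name: relName, namespace: relNamespace };
}

// Encodes a path parameter but restores ':' so the router sees the
// literal-colon form; every other reserved character stays encoded.
function encodePathParam(value: string): string {
  return encodeURIComponent(value).replace(/%3A/g, ":");
}
```

For example, `encodePathParam("install:bp-x")` leaves the colon in place, while a slash in the same value would still be emitted as `%2F`.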
Ref: #305
The screenshot helper previously captured the brief "Loading…"
placeholder because it only waited for the page container. Wait
for either the seeded first row (data-backed pages) or the empty
state (placeholder pages) so the screenshots capture the populated
list view + sidebar nesting in lockstep.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
E2E spec covers all 12 P3 list pages: navigates the sidebar's
second-level accordion → expands each category → asserts every
sub-sub item is reachable, the page renders, the seeded first row
opens the detail drawer (data-backed pages) or surfaces the canonical
empty state (placeholder pages). 1440×900 screenshots saved to
e2e/screenshots/p3-cloud-*.png.
Router fix: each category (compute / network / storage) now uses an
<Outlet /> parent with an explicit index route hosting the landing
page. Without the index split, navigating to /cloud/compute/clusters
rendered the parent landing page instead of the child list page —
TanStack Router doesn't auto-collapse a parent component into an
outlet. Verified by all 15 Playwright tests now passing.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Replaces the three flat-dump sub-pages (CloudCompute / CloudNetwork
/ CloudStorage) with twelve per-resource list pages stacked behind
three category landing pages, all wired into the router under the
new /cloud/<category>/<resource> URL shape.
Pattern parallels JobsPage/JobsTable: header + count badge + back
link, search + filter pills, sortable columns, click-row → slide-in
detail drawer, empty-state and pagination. Status colour palette
matches JobsTable exactly. Source data is the existing
getHierarchicalInfrastructure() tree exposed via the useCloud()
context P1 set up; per-page flatten lambdas pluck the relevant rows.
Resource types shipped (12):
Compute Clusters, vClusters, Node Pools, Worker Nodes (real data)
Network Load Balancers (real data) + Services / Ingresses /
DNS Zones (placeholder pages awaiting #321 informers)
Storage PVCs, Buckets, Volumes (real data) + Storage Classes
(placeholder)
Category landing pages (CloudComputePage / CloudNetworkPage /
CloudStoragePage) replace the deleted CloudCompute.tsx /
CloudNetwork.tsx / CloudStorage.tsx; each shows a tile grid with
counts derived from the same shared tree.
Shared scaffolding lives under cloud-list/: typed sort state,
useCloudListState hook (search + sort + filter + pagination, no
setState-in-effect), CSS string, and presentational primitives
(CloudListHeader, CloudListToolbar, FilterPills, SortableTH,
CloudListDetailDrawer, DetailRow, EmptyState, Pagination,
StatusPill). The hook + CSS + sort types live in dedicated files
so the components file stays react-refresh clean.
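The "no setState-in-effect" point can be illustrated with a pure derivation: the visible rows are computed from the raw rows plus the list state on every render, so nothing needs to be synchronised in an effect. The row and state shapes below are hypothetical simplifications, not the types useCloudListState actually exposes.

```typescript
interface CloudRow {
  name: string;
  status: string;
}

interface CloudListState {
  search: string;
  statusFilter: string | null; // null means "all statuses"
  sortKey: "name" | "status";
  sortDir: "asc" | "desc";
  page: number; // zero-based
  pageSize: number;
}

// Search, filter, sort, then paginate; the input array is never mutated.
function deriveVisibleRows(
  rows: CloudRow[],
  s: CloudListState,
): { visible: CloudRow[]; total: number } {
  const needle = s.search.trim().toLowerCase();
  const matched = rows.filter(
    (r) =>
      (needle === "" || r.name.toLowerCase().indexOf(needle) !== -1) &&
      (s.statusFilter === null || r.status === s.statusFilter),
  );
  const sign = s.sortDir === "asc" ? 1 : -1;
  const sorted = matched.slice().sort((a, b) => {
    const x = a[s.sortKey];
    const y = b[s.sortKey];
    if (x === y) return 0;
    return x < y ? -sign : sign;
  });
  const start = s.page * s.pageSize;
  return { visible: sorted.slice(start, start + s.pageSize), total: sorted.length };
}
```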
CloudPage's Sovereign-switcher path-preserving regex was extended
to capture the deepest sub-route (e.g. /cloud/compute/clusters
follows the operator across deployments). Router gains 12 child
routes under the existing /cloud/{compute,network,storage} parents.
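The path-preserving switch can be sketched as a single regex replace. This is an illustration rather than the regex in CloudPage: `swapDeployment` is a hypothetical name, and the `/sovereign/provision/<id>/...` prefix is assumed from the URLs quoted elsewhere in this log.

```typescript
// Swaps the deployment id while keeping everything after it, so
// /cloud/compute/clusters follows the operator across deployments.
function swapDeployment(pathname: string, nextDeploymentId: string): string {
  const m = /^(\/sovereign\/provision\/)[^/]+(\/.*)?$/.exec(pathname);
  if (m === null) return pathname; // not a provision route: leave untouched
  const prefix = m[1] === undefined ? "" : m[1];
  const suffix = m[2] === undefined ? "" : m[2];
  return prefix + encodeURIComponent(nextDeploymentId) + suffix;
}
```

Capturing the whole remainder, rather than a fixed number of segments, is what lets the deepest sub-route survive the switch.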
Lint goes from 34 baseline errors to 32. All 534 unit tests pass.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
P3 of #309 — extends the Cloud accordion with second-level expansion.
Each category (Compute / Network / Storage) becomes a split row: a
<Link> on the left navigates to the category landing page and a
<button> chevron toggles the resource-list children without leaving
the current page. Architecture stays a leaf.
Persists each second-level toggle state in localStorage under
sov-nav-cloud-{compute,network,storage}-expanded so reloads remember
which sub-trees the operator wants open. Auto-expands the matching
category when the operator is currently inside one of its
resource-list pages (e.g. /cloud/compute/clusters → Compute opens).
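The persistence and auto-expand rules can be sketched as follows. The key format comes from the description above; the helper names, the injected `KeyValueStore` interface (a stand-in for `window.localStorage`) and the "true"/"false" value encoding are assumptions.

```typescript
type CloudCategory = "compute" | "network" | "storage";

// Minimal subset of the Storage API, injected so the logic is testable.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

function storageKey(category: CloudCategory): string {
  return "sov-nav-cloud-" + category + "-expanded";
}

// A category is open when the operator is inside one of its resource-list
// pages, or when the persisted toggle says so.
function isExpanded(store: KeyValueStore, category: CloudCategory, pathname: string): boolean {
  if (pathname.indexOf("/cloud/" + category + "/") !== -1) return true;
  return store.getItem(storageKey(category)) === "true";
}

function setExpanded(store: KeyValueStore, category: CloudCategory, expanded: boolean): void {
  store.setItem(storageKey(category), expanded ? "true" : "false");
}
```

Note that the category landing page itself (for example a path ending in /cloud/compute) does not trigger auto-expansion in this sketch; only deeper resource-list paths do.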
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The force-graph simulation is intentionally continuous (cooldownTicks: Infinity-equivalent
rAF loop), so nodes never strictly settle. Playwright's stability check timed out after 30s on
right-click and double-click in the local headless run; left-click was only passing by luck.
This commit adds `force: true` to all three graph-node interactions (click for detail panel,
right-click for context menu, dblclick for focus mode), which skips the stability wait that a
continuously animating target can never satisfy. Click events still reach the React handler
unchanged.
Verified locally: 7/7 pass in 45s (was 5/7 with 2.5 min of retry timeouts).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
P2 of openova-io/openova#309. New cloud-architecture.spec.ts asserts
the operator-facing UX end-to-end and captures evidence
screenshots.
Coverage:
- Navigating to /sovereign/provision/{id}/cloud/architecture
mounts the force-graph canvas + svg + live stats overlay.
- Edge legend exposes contains / runs-on / routes-to /
attached-to relations.
- All 8 type badges render (Cloud, Region, Cluster, vCluster,
NodePool, WorkerNode, LoadBalancer, Network).
- Global density slider defaults to 50, responds to input,
updates the percent label.
- Search box (debounced) shows the "X matches + Y neighbors"
counter.
- Click on a node opens the right-side detail panel with the
type label and a populated neighbor list (tested against
the cluster's parent region).
- Right-click on a node opens the context menu with kind-aware
items (Cluster: add-vcluster + add-nodepool + delete).
- Saves three 1440x900 screenshots: default, search-isolated,
focus-mode (per the parallel-agents-e2e memory rule).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
P2 of openova-io/openova#309. Rewrites Architecture.test.tsx to
match the new force-directed canvas — the legacy SVG-layered
assertions (depth labels, zoom-on-click, data-dim toggles) were
retired with the layout itself.
15 cases covering:
- Empty state when the tree has no nodes
- Force-graph mounts; node groups for every type render with
composite ids (arch-graph-node-{type}-{compositeId})
- Edge legend lists every relation type
- Live nodes/edges stats overlay
- Search box debounces, then shows the "X matches" counter
- Node click opens detail panel with type label
- Detail panel lists neighbors with drill-in
- Detail panel close button works
- Right-click on node opens context menu with kind-aware items
(Cluster context exposes add-vcluster + add-nodepool + delete)
- Right-click on canvas exposes "Add region"
- Global density slider exists at default 50%
- Per-type badges render for all 8 types
- CRUD modals (AddCluster, AddVCluster, AddRegion) still mount
via the new wiring
All 15 pass. Full suite: 512/512 green.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
P2 of openova-io/openova#309. The Architecture sub-page body now
delegates entirely to widgets/architecture-graph.
Architecture.tsx is reduced to a thin adapter over useCloud() — the
legacy topologyLayout SVG renderer, the inline zoom-on-click
state, the depth-row labels, and the click-to-zoom CRUD modal
sidebar are all gone. Founder reversed the layered tree decision in
issue #228 → #309: "forget about the containment, just show it as
another type of relation."
InfrastructureDetailPanel.tsx is deleted — its responsibilities
(properties, status, actions) are now inline in
ArchitectureGraphPage's DetailPanel, which additionally surfaces
the neighbor list (founder spec) and the focus-mode toggle.
The lib/topologyLayout.ts module + tests stay as-is (no callers
remain in the sovereign portal, but the module is referenced by
src/lib/infrastructure.types.test.ts and may be reused for other
surfaces). Removing it is out of P2 scope.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
P2 of openova-io/openova#309. The page-level orchestrator wraps
GraphCanvas with the operator-facing UX the founder spec calls for.
adapter.ts (hierarchyToGraph):
- Turns HierarchicalInfrastructure into neutral GraphNode/GraphEdge
- Composite ids: ${type}:${elementId}
- Edges emitted: contains, runs-on, routes-to, attached-to,
peers-with — containment is treated as ONE edge type (founder
verbatim: "forget about the containment, just show it as another
type of relation")
- Node types: Cloud, Region, Cluster, vCluster, NodePool,
WorkerNode, LoadBalancer, Network — every leaf surfaces so the
operator sees the full architecture in one canvas
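
A rough sketch of the adapter idea described above, assuming a simplified tree input. The real HierarchicalInfrastructure shape is not shown in this message, so `TreeNode`, `treeToGraph`, and the `relation` field name are hypothetical; only the `${type}:${elementId}` id format and the "containment is just another edge" rule come from the text. Other relation types are omitted.

```typescript
type ArchNodeType =
  | "Cloud" | "Region" | "Cluster" | "vCluster"
  | "NodePool" | "WorkerNode" | "LoadBalancer" | "Network";

// Hypothetical, simplified input shape (the real type is richer).
interface TreeNode {
  type: ArchNodeType;
  id: string;
  children?: TreeNode[];
}

interface GraphNode { id: string; type: ArchNodeType }
interface GraphEdge { source: string; target: string; relation: "contains" }

// Composite id format from the commit message: ${type}:${elementId}.
const compositeId = (type: ArchNodeType, elementId: string): string =>
  `${type}:${elementId}`;

// Flattens the tree; parent/child nesting becomes a plain "contains" edge.
function treeToGraph(root: TreeNode): { nodes: GraphNode[]; edges: GraphEdge[] } {
  const nodes: GraphNode[] = [];
  const edges: GraphEdge[] = [];
  const visit = (node: TreeNode, parentId?: string): void => {
    const id = compositeId(node.type, node.id);
    nodes.push({ id, type: node.type });
    if (parentId !== undefined) {
      edges.push({ source: parentId, target: id, relation: "contains" });
    }
    for (const child of node.children || []) visit(child, id);
  };
  visit(root);
  return { nodes, edges };
}
```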
ArchitectureGraphPage.tsx — bound to useCloud() data:
- Toolbar: search (debounced 250ms, isolation pattern with
"X matches + Y neighbors" counter) + global density slider
(0..100%, default 50%, applies proportional cap to all tunable
types) + clear-focus button
- Per-type badges with mini Popover: slider 0..total, presets
None / 25% / 50% / All / Hide; small types (<50) toggle hidden
on click; debounced 400ms
- Right-side detail panel on node click: properties, neighbor
list with type-color dots, focus-neighbors toggle, kind-aware
add-child button, delete (Region/Cluster/vCluster)
- Double-click → focus mode (filter to focus + direct neighbors)
- Right-click on node → context menu: kind-aware add (Cluster
has add-vcluster + add-nodepool, Region has add-cluster +
add-lb, Cloud has add-region) + delete
- Right-click on canvas → context menu with "Add region"
- Shift-drag from one node to another → emits onEdgeCreate
(logs intent; relation API lands with #321)
- Edge legend at the bottom — colour swatch + count per relation
type, dashed swatch matches edge rendering
- Reuses existing CrudModals (AddRegion / AddCluster / AddVCluster
/ AddNodePool / AddLB / DeleteCascadeConfirm) — no new modal
components, only fresh wiring
Per docs/INVIOLABLE-PRINCIPLES.md:
#1 (waterfall, target shape) — every UI affordance ships in the
first cut; no "for now" shortcuts.
#4 (never hardcode) — the type list, density presets, debounce
interval, edge palette and small-type threshold are all
constants at the top of the file.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
P2 of openova-io/openova#309. Introduces the reusable, low-level
force-directed canvas component and its type contract.
GraphCanvas:
- forwardRef wrapping an SVG root (consistent with the existing
JobDependenciesGraph SVG idiom — no canvas-based libs)
- d3-force engine (already a dep) for charge / link / collide /
center forces; 5-tier adaptive physics by node count
- degree-based radius: 6 + sqrt(degree) * 2.8, clamped 6..20
- stroke states: highlighted (yellow), focusNodeId (pink), pinned
(dark dashed), default (white) — priority order locked
- pin-on-drag (left button) + shift-drag-to-create-edge with
in-flight guide line and edge-create event
- double-click via lastClickRef + ev.timeStamp (event.detail
unreliable across browsers per founder spec)
- imperative handle: addElements / removeElements / unpinNode /
relax / fit
- focusNodeId prop filters down to the focus node + direct
neighbors (not dimming)
- hiddenTypes + typeLimits drive the per-type density slider
- bottom-left stats overlay (live node + edge count)
- ResizeObserver-driven responsive sizing
- cooldownTicks behaviour: simulation never stops; rAF re-renders
on every tick
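
The degree-based radius rule in the list above works out as in this small sketch. The constant and function names are illustrative; the numbers (6, 2.8, clamp 6..20) are the ones stated in this message.

```typescript
const MIN_RADIUS = 6;
const MAX_RADIUS = 20;
const DEGREE_SCALE = 2.8;

// radius = 6 + sqrt(degree) * 2.8, clamped to the range 6..20
function nodeRadius(degree: number): number {
  const raw = MIN_RADIUS + Math.sqrt(Math.max(0, degree)) * DEGREE_SCALE;
  return Math.min(MAX_RADIUS, Math.max(MIN_RADIUS, raw));
}

// Counts how many edges touch each node id (illustrative helper).
function countDegrees(
  edges: { source: string; target: string }[],
): Record<string, number> {
  const degrees: Record<string, number> = {};
  for (const edge of edges) {
    degrees[edge.source] = (degrees[edge.source] || 0) + 1;
    degrees[edge.target] = (degrees[edge.target] || 0) + 1;
  }
  return degrees;
}
```

With these constants an isolated node gets the minimum radius of 6, and the clamp takes over at roughly 25 connections, so hub nodes stop growing at 20.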
types.ts:
- ArchNodeType / ArchEdgeType / ArchStatus
- GraphNode / GraphEdge (caller-facing) + LiveNode / LiveEdge
(canvas-internal, x/y/fx/fy mutable)
- edgeNodeId() helper — d3-force mutates link.source/target from
string ids to node refs after the first tick; ALL edge filtering
must go through this helper
- NODE_FILL / EDGE_STROKE / EDGE_DASHED palettes
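
Why the edgeNodeId() helper matters can be shown with a minimal sketch. The signature and the `NodeRef` / `EdgeEndpoint` types below are assumptions, since the real types.ts is not reproduced here; the behaviour it guards against (d3-force's link force swapping string ids for node objects) is as described above.

```typescript
// After d3-force resolves links, source/target may be node objects, not ids.
interface NodeRef { id: string }
type EdgeEndpoint = string | NodeRef;
interface LiveEdge { source: EdgeEndpoint; target: EdgeEndpoint }

// Normalises either form back to the node id (assumed signature).
function edgeNodeId(endpoint: EdgeEndpoint): string {
  return typeof endpoint === "string" ? endpoint : endpoint.id;
}

// Example filter that stays correct before and after the mutation.
function edgesTouching(edges: LiveEdge[], nodeId: string): LiveEdge[] {
  return edges.filter(
    (edge) =>
      edgeNodeId(edge.source) === nodeId || edgeNodeId(edge.target) === nodeId,
  );
}
```

Comparing `edge.source === nodeId` directly would silently stop matching once the simulation has replaced the string with an object, which is why every filter should route through the helper.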
Implementation note: the founder spec referenced react-force-graph-2d
(canvas-based + Mantine), but this codebase is uniformly SVG +
Tailwind + Radix UI (see widgets/job-deps-graph/JobDependenciesGraph
for the established pattern). We use d3-force directly and render to
SVG to preserve testability via data-testid, dark-theme tokens, and
the existing visual-style consistency. Every behavioural requirement
in the spec (degree-based radius, pin-on-drag, focus mode, search
isolation, double-click, drag-to-create-edge, density slider) is
honored identically; the swap is engine-only.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds e2e/cloud-nav.spec.ts — 7 Playwright assertions that lock in
the Sovereign-portal Cloud accordion contract from issue #309:
1. Sidebar exposes Cloud (not Infrastructure) accordion.
2. Clicking the Cloud header toggles expanded state and reveals 4
sub-items (Architecture / Compute / Network / Storage).
3. Each sub-item routes to /provision/$id/cloud/{suffix} and
declares aria-current=page when active.
4. Legacy /infrastructure/* paths redirect to /cloud/* equivalents.
5. Expanded state persists across page reloads via the
`sov-nav-cloud-expanded` localStorage key.
6. Accordion auto-expands when the operator deep-links onto a
/cloud/* route.
7. Captures three 1440x900 screenshots (collapsed, expanded with
Architecture active, expanded with Compute active) under
e2e/screenshots/p1-cloud-nav-*.png for visual evidence.
Also fixes a Sidebar bug surfaced by the e2e run: the active-section
detector was using `pathname.includes('/cloud')`, which would falsely
flag any deploymentId containing the substring "cloud" as being on a
/cloud/* route. Replaced with a path-segment regex.
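
To illustrate the difference, here is a sketch of a substring check next to a path-segment check. The exact regex used in the Sidebar is not quoted in this message, so `CLOUD_SEGMENT_RE` is an assumed stand-in that matches `cloud` only as the segment directly after the deployment id.

```typescript
// Buggy variant: a deploymentId such as "cloud-eu-1" also contains "/cloud".
function isOnCloudRouteNaive(pathname: string): boolean {
  return pathname.includes("/cloud");
}

// Assumed replacement: "cloud" must be a whole segment after /provision/{id}.
const CLOUD_SEGMENT_RE = /\/provision\/[^/]+\/cloud(\/|$)/;

function isOnCloudRoute(pathname: string): boolean {
  return CLOUD_SEGMENT_RE.test(pathname);
}
```

For `/provision/cloud-eu-1/jobs` the naive check reports a Cloud route while the segment check does not; both agree on `/provision/cloud-eu-1/cloud/compute`.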
Adds e2e/screenshots/ to .gitignore (regenerated each run, never
committed).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Converts every legacy /provision/$deploymentId/infrastructure/* path
into a beforeLoad redirect that targets the equivalent /cloud/* route,
preserving the $deploymentId param so deep links and bookmarks land
on the renamed surface without an extra hop:
/infrastructure → /cloud/architecture
/infrastructure/topology → /cloud/architecture
/infrastructure/compute → /cloud/compute
/infrastructure/network → /cloud/network
/infrastructure/storage → /cloud/storage
The redirect routes still register tanstack-router components (a
no-op stub), because the route node must exist for the path to match
before `beforeLoad` fires.
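
The mapping above can be expressed as a small pure function, sketched here for illustration. It is not the tanstack-router `beforeLoad` wiring; `legacyInfraToCloud` and `LEGACY_TO_CLOUD` are hypothetical names, and only the path table comes from this message.

```typescript
// Legacy sub-path ("" means bare /infrastructure) to the /cloud/* suffix.
const LEGACY_TO_CLOUD: Record<string, string> = {
  "": "architecture",
  topology: "architecture",
  compute: "compute",
  network: "network",
  storage: "storage",
};

// Returns the redirect target, or null when the path is not a legacy one.
function legacyInfraToCloud(pathname: string): string | null {
  const match =
    /^(.*\/provision\/[^/]+)\/infrastructure(?:\/([^/]+))?\/?$/.exec(pathname);
  if (!match) return null;
  const target = LEGACY_TO_CLOUD[match[2] || ""];
  // The /provision/{deploymentId} prefix is carried over unchanged.
  return target ? `${match[1]}/cloud/${target}` : null;
}
```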
Updates the cosmetic-guard suite to assert the new redirect
behaviour + the new sidebar shape (sov-nav-cloud accordion replacing
the flat sov-nav-infrastructure entry). The original `infrastructure
page` describe block is replaced by a tighter `cloud section` one
that focuses on structural surface contract; deeper accordion
behaviour is owned by the new cloud-nav.spec.ts (added in a
subsequent commit).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Replaces the flat Infrastructure entry in the Sovereign sidebar with a
Cloud accordion (issue #309). The four sub-pages — Architecture,
Compute, Network, Storage — render as indented entries under the Cloud
header instead of as an in-page tab strip.
Behavior:
- Cloud header is a <button> (not a Link) that toggles the
accordion. Active when on any /cloud/* (or legacy /infrastructure/*)
route.
- Sub-items are tanstack-router <Link>s targeting
/provision/$deploymentId/cloud/{architecture,compute,network,storage}.
Active sub-item carries aria-current="page".
- Auto-expanded by default when the operator is on a /cloud/* route.
- Persists expand state in localStorage under
`sov-nav-cloud-expanded` so it survives page reloads.
- ARIA: aria-expanded + aria-controls on the header; the sub-list
is role="group" with the matching id (sov-nav-cloud-group).
- Keyboard accessible: Enter / Space toggle the accordion.
Test IDs:
sov-nav-cloud (header), sov-nav-cloud-toggle (chevron),
sov-nav-cloud-architecture, sov-nav-cloud-compute,
sov-nav-cloud-network, sov-nav-cloud-storage (sub-items),
sov-nav-cloud-group (group container).
Issue #309 founder verbatim:
"have accordion menu under cloud left pane"
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Renames the Sovereign Infrastructure shell to Cloud and removes the in-page
Topology / Compute / Storage / Network tab strip; the sidebar accordion that
replaces it lands in the next commit.
The sub-page contents are unchanged in this commit (they keep their
file names + testids; the next commits rename those).
Changes:
- InfrastructurePage.tsx → CloudPage.tsx (file + class + context).
- InfrastructureContext / useInfrastructure() → CloudContext /
useCloud() — sub-pages updated to pull from the renamed hook.
- Page header "Infrastructure" → "Cloud"; tagline rewritten so it no
longer enumerates the legacy tab labels.
- Drop INFRA_TABS, resolveActiveTab, the <nav role=tablist> block,
and the .tabs / .tab CSS rules. The sidebar accordion (next
commit) replaces the in-page navigation.
- data-testid renames: infrastructure-page → cloud-page,
infrastructure-title → cloud-title,
infrastructure-content → cloud-content,
infrastructure-sovereign-switcher → cloud-sovereign-switcher.
- Compute table cluster-link target updated from /topology →
/cloud/architecture so it lands on the renamed canvas route.
- InfrastructurePage.test.tsx renamed; tab-strip assertions
converted into "tab strip is absent" assertions.
- Sub-page test fixtures updated to mount under /cloud/* paths.
Issue #309 founder verbatim:
"we call it as cloud maybe"
"have accordion menu under cloud left pane"
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds the new Sovereign-portal Cloud surface routing tree (issue #309)
without removing the legacy /infrastructure/* paths yet:
/provision/$deploymentId/cloud → CloudPage shell
↳ / → redirect to /architecture
↳ /architecture → Architecture canvas
↳ /compute → CloudCompute
↳ /network → CloudNetwork
↳ /storage → CloudStorage
Both /infrastructure/* and /cloud/* now resolve to the same components.
Subsequent commits will rename the components, drop the in-page tab
strip, switch the sidebar to an accordion, and convert /infrastructure/*
into redirects to /cloud/*.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Upstream seaweedfs/seaweedfs templates/shared/security-configmap.yaml
uses Helm template fromToml; helm-controller v1.1.0's bundled helm SDK
(v3.x older than 3.13) doesn't define fromToml so the install fails:
parse error at security-configmap.yaml:21: function fromToml not defined
Setting global.seaweedfs.enableSecurity: false skips the entire template.
Internal SeaweedFS API is cluster-IP only on Sovereign-1; chart-level
security is acceptable to defer until helm-controller is bumped.
Bumped 1.0.0 → 1.0.1.
Unblocks the chain: bp-loki, bp-mimir, bp-tempo, bp-velero, bp-harbor,
bp-grafana all dependsOn bp-seaweedfs.
Co-authored-by: hatiyildiz <hatice.yildiz@openova.io>
The chart's post-install hook was failing on otech.omani.works:
failed post-install: unable to build kubernetes object for deleting hook
bp-external-secrets/templates/clustersecretstore-vault-region1.yaml:
resource mapping not found for kind ClusterSecretStore in version
external-secrets.io/v1beta1
Two corrections:
1. Capabilities-gate the entire template — don't render unless the
ClusterSecretStore CRD is registered (it ships via the upstream
ESO subchart but isn't live on first install)
2. Remove 'before-hook-creation' delete-policy (was the actual trigger
for the 'deleting hook' failure path)
Bumped 1.0.0 → 1.0.1.
Co-authored-by: hatiyildiz <hatice.yildiz@openova.io>
'function fromToml not defined' error on bp-seaweedfs publish.
Upstream seaweedfs/seaweedfs 4.22.0 (templates/shared/security-configmap.yaml:21)
uses fromToml which exists in 3.13+ but the rendered context in the smoke
step needs newer Sprig functions present in 3.18+. Bump unblocks the
chain of HRs (bp-loki, bp-mimir, bp-tempo, bp-velero, bp-harbor, bp-grafana)
all blocked on bp-seaweedfs publish.
Co-authored-by: hatiyildiz <hatice.yildiz@openova.io>
Pre-existing bug exposed by #305: ExecutionLogs fetched
`/api/v1/actions/executions/{id}/logs` directly instead of going
through API_BASE (`${BASE}api`). Under Vite's `/sovereign/` base path,
the Traefik ingress only routes `/sovereign/api/...` — bare `/api/...`
returns 404.
Live evidence after #328 (jobId raw colon fix):
GET /sovereign/api/v1/deployments/.../jobs/{id} → 200 (FE rewire OK)
GET /api/v1/actions/executions/{realExecId}/logs → 404 (this bug)
Note that the executionId in the failing URL is a real 32-char hex
(5f59cb0bc9df2c720b4cf07989e4dc4f), not the synthetic `:latest` —
proving the rewire in #307 + the colon fix in #328 both worked. Only
the logs URL prefix remained wrong.
Fix: import API_BASE; use `${API_BASE}/v1/actions/executions/...`.
Per docs/INVIOLABLE-PRINCIPLES.md #4 (never hardcode URLs in app
source) — the original direct `/api/...` was a violation that this
PR settles permanently.
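
A minimal sketch of the URL construction after the fix, assuming the base path is passed in explicitly. In the app `API_BASE` is derived from Vite's base (`${BASE}api`); the function names below are hypothetical.

```typescript
// Assumed shape: base is the Vite base path with a trailing slash,
// e.g. "/sovereign/", so API_BASE becomes "/sovereign/api".
function apiBase(base: string): string {
  return `${base}api`;
}

// Builds the logs URL through API_BASE instead of a bare "/api/..." prefix.
function executionLogsUrl(base: string, executionId: string): string {
  return `${apiBase(base)}/v1/actions/executions/${encodeURIComponent(
    executionId,
  )}/logs`;
}
```

Under the `/sovereign/` base this yields a `/sovereign/api/v1/actions/executions/{id}/logs` path, which is the prefix the ingress routes.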
Co-authored-by: hatice yildiz <hatice.yildiz@openova.io>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Five operator-spec corrections:
1. More structured (pipeline-like)
forceX strength 0.32 → 0.55. Same-depth siblings now cluster around
their depth column; pipeline-y horizontal feel preserved.
2. Min spacing between bubbles + smaller bubbles
NODE_RADIUS 30 → 22 (more breathing room).
COLLIDE_PADDING 6 → 14 (forces wider gap regardless of zoom).
3. Hard MAX bubble size — no more elephant in batch view
Auto-fit viewBox now enforces a MIN viewBox size (1200×700). Single-
bubble or few-bubble cases (batch detail, etc.) keep the canvas at
that minimum so the bubble can't scale up to fill the whole screen.
bbox is centered within the (possibly larger) viewBox.
4. Click highlight — selected node + neighbors + connecting edges
• openJobId node: amber outer ring (4px) + amber glow halo
• Direct neighbors: lighter-amber ring (3px) + softer halo
• Edges connecting selected node: amber stroke 2.6px + amber arrow
• Non-selected non-neighbor nodes: dimmed to opacity 0.35
• Status fill kept (so we still see succeeded/failed/running/pending)
The amber palette is distinct from any status colour so selection
reads clearly even on running (cyan) or failed (red) bubbles.
5. Remove standalone /flow route + 'Show as Flow' button
Operator: 'we cannot hard code a specific flow, we'll have multiple
flows, therefore we should show the flows only under the respective
jobs.' Removed:
• provisionFlowRoute from router.tsx
• 'Show as Flow' button from JobsPage.tsx
• JobsTable batch chip retargeted from /flow?scope=batch:<id> to the
canonical /batches/ page (which embeds the flow internally)
FlowPage component preserved — it's still embedded inside JobDetail
and BatchDetail as the in-context Flow tab.
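
The minimum-viewBox rule from correction 3 can be sketched as below. The 1200×700 floor is from this message; the `ViewBox` type and `enforceMinViewBox` name are illustrative assumptions.

```typescript
interface ViewBox { x: number; y: number; width: number; height: number }

const MIN_VIEWBOX_WIDTH = 1200;
const MIN_VIEWBOX_HEIGHT = 700;

// Grows a too-small bbox to the minimum size and keeps it centered, so a
// single bubble cannot be scaled up to fill the whole canvas.
function enforceMinViewBox(bbox: ViewBox): ViewBox {
  const width = Math.max(bbox.width, MIN_VIEWBOX_WIDTH);
  const height = Math.max(bbox.height, MIN_VIEWBOX_HEIGHT);
  return {
    x: bbox.x - (width - bbox.width) / 2,
    y: bbox.y - (height - bbox.height) / 2,
    width,
    height,
  };
}
```

A 100×100 bbox at the origin becomes a 1200×700 viewBox starting at (-550, -300); a bbox already larger than the floor passes through unchanged.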
Co-authored-by: hatiyildiz <hatice.yildiz@openova.io>
`encodeURIComponent` percent-encodes `:` as `%3A` when it is applied to
a path segment. Chi's router does NOT decode %3A before matching the
route, so every JobDetail fetch returned 404 against the catalyst-api.
Live evidence (Playwright network log on otech wizard, 2026-04-30):
GET https://console.openova.io/sovereign/api/v1/deployments/
ce476aaf80731a46/jobs/ce476aaf80731a46%3Ainstall-seaweedfs
→ 404
Internal probe with the raw colon:
wget http://localhost:8080/api/v1/deployments/.../jobs/
ce476aaf80731a46:install-seaweedfs
→ 200
Result on the live deployment: every JobDetail page rendered the
"Execution metadata pending" placeholder even though the catalyst-api
DID have a valid execution to surface. Bug is in the FE encoder, not
the backend or the route.
Fix:
- useJobDetail inserts jobId raw into the URL template. The colon
is RFC 3986 path-safe so this is correct per spec.
- deploymentId stays encodeURIComponent'd defensively (it's a hex
string, no-op in practice, but the encode is cheap insurance).
- Test now asserts the URL contains the raw `:` and rejects %3A.
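
The encoding difference can be reproduced with a short sketch. `jobDetailUrl` is a hypothetical stand-in for the URL template inside useJobDetail, and it assumes jobId never contains `/`, `?` or `#`.

```typescript
// deploymentId is still encoded defensively; jobId is inserted raw so the
// colon reaches the router as ":" rather than "%3A".
function jobDetailUrl(
  apiBase: string,
  deploymentId: string,
  jobId: string,
): string {
  return `${apiBase}/v1/deployments/${encodeURIComponent(
    deploymentId,
  )}/jobs/${jobId}`;
}

// What the old code effectively sent for the job segment.
const encodedJobSegment = encodeURIComponent(
  "ce476aaf80731a46:install-seaweedfs",
);
```

Here `encodedJobSegment` contains `%3A`, while the URL produced by `jobDetailUrl` keeps the raw colon.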
Co-authored-by: hatice yildiz <hatice.yildiz@openova.io>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
helm-controller in flux v2.4 (the version Catalyst-Zero pins) emits
structured JSON log lines with HelmRelease as a NESTED OBJECT:
"HelmRelease":{"name":"bp-mimir","namespace":"flux-system"}
The old regex only matched the legacy flat-string format
(`helmrelease="flux-system/bp-X"` or `"helmrelease":"flux-system/bp-X"`).
Result on otech.omani.works: every helm-controller stdout line was
parsed but did not match → silently dropped → zero PhaseComponentLog
events emitted → exec log viewer rendered only synthetic [seeded] /
[<state>] anchor lines.
Verified by tailing helm-controller-86c6b84dcd-t58td on the live otech
cluster (10h reconcile activity, format consistent across hundreds of
lines).
Fix:
- logtailer.helmControllerNameRe now alternates across all three
observed formats: flat-string colon, flat-string equals, and
nested-object name+namespace.
- pumpLines picks whichever capture group fired (regex alternation
leaves the other group empty).
- logtailer_test.go fixtures extended with two real flux v2.4
nested-object samples copied verbatim from the live otech
cluster's helm-controller stdout.
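
The alternation idea, shown in TypeScript purely for illustration. The real pattern lives in the Go logtailer (helmControllerNameRe) and is not quoted here, so the regex below is an assumption that covers the three formats named above; `extractHelmReleaseName` is a hypothetical helper.

```typescript
// Three alternatives, one capture group each:
//   1. flat-string equals:  helmrelease="flux-system/bp-X"
//   2. flat-string colon:   "helmrelease":"flux-system/bp-X"
//   3. nested object:       "HelmRelease":{"name":"bp-X","namespace":"..."}
const HELM_RELEASE_NAME_RE =
  /helmrelease="[^"\/]+\/([^"]+)"|"helmrelease":"[^"\/]+\/([^"]+)"|"HelmRelease":\{"name":"([^"]+)","namespace":"[^"]+"\}/;

// Picks whichever capture group fired; the other groups stay unset.
function extractHelmReleaseName(line: string): string | null {
  const match = HELM_RELEASE_NAME_RE.exec(line);
  if (!match) return null;
  return match[1] || match[2] || match[3] || null;
}
```

Lines that match none of the three formats return null, which corresponds to the silently dropped lines described above when only the flat formats were recognised.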
Co-authored-by: hatice yildiz <hatice.yildiz@openova.io>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Three operator-spec corrections to the organic Flow canvas:
1. Straight edges, not bezier curves
FlowEdge now renders <line x1 y1 x2 y2> rim-to-rim instead of a
cubic bezier with perpendicular control points.
2. Drag pins permanently — no spring-back
d3-drag 'end' handler no longer clears d.fx/d.fy. The bubble stays
exactly where the operator dropped it. Operator can re-drag any time.
forceX/forceY anchors only act on non-pinned (fx/fy === null) nodes.
3. Auto-fit viewBox — smart canvas filling regardless of node count
Replaced fixed viewBox="0 0 2000 1100" with bbox computed each
render: vbX/vbY = min(x|y) - padding, vbW/vbH = (max - min) +
2*padding. preserveAspectRatio="xMidYMid meet" then auto-scales.
Result:
• 2 bubbles at depth 0/1 → small bbox → tight zoom (no
irrelevant left-right corner flight)
• 35 bubbles at depth 0..6 → wide bbox → full canvas use (~85-95%)
Bubble radius stays 30px; per-depth x step stays 150px; per-region
band height 240px — all bounded so links can't stretch arbitrarily.
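
The bbox formula from correction 3 can be sketched as below. The padding value is not stated in this message, so it is a parameter here; `fitViewBox` and `viewBoxAttr` are illustrative names.

```typescript
interface Point { x: number; y: number }
interface ViewBox { x: number; y: number; width: number; height: number }

// vbX/vbY = min - padding, vbW/vbH = (max - min) + 2 * padding
function fitViewBox(points: Point[], padding: number): ViewBox | null {
  if (points.length === 0) return null;
  const xs = points.map((p) => p.x);
  const ys = points.map((p) => p.y);
  const minX = Math.min(...xs);
  const minY = Math.min(...ys);
  return {
    x: minX - padding,
    y: minY - padding,
    width: Math.max(...xs) - minX + 2 * padding,
    height: Math.max(...ys) - minY + 2 * padding,
  };
}

// Serialises to the SVG viewBox attribute value.
function viewBoxAttr(vb: ViewBox): string {
  return `${vb.x} ${vb.y} ${vb.width} ${vb.height}`;
}
```

Because the viewBox is recomputed from live positions, a dragged and pinned bubble simply widens the bbox on the next render.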
Co-authored-by: hatiyildiz <hatice.yildiz@openova.io>
PR #308 shipped the organic layout. Live verification at 1440px showed:
- bubbles cluster at depth=0 (left ~12% of canvas)
- only 1 edge rendered
Root cause: live Job objects from the backend bridge don't carry their
upstream dependsOn arrays — the bridge surfaces flat status only. The
useJobHints hook was relying on Job.dependsOn + ApplicationDescriptor
deps; both are empty for bootstrap-kit jobs (cilium, cert-manager,
spire, etc.) because they're not user-selected components.
Fix: encode the canonical bootstrap-kit dep graph from
docs/BOOTSTRAP-KIT-EXPANSION-PLAN.md §2 directly in useJobHints, with
a bareName→liveJobId resolver that handles the various id formats
the backend may use ('bp-cnpg' / 'install-cnpg' / 'install-cnpg::r1').
Result: depth populates 0..6 (longest chain cilium → cert-manager →
spire → openbao → keycloak → gitea → catalyst-platform), bubbles
spread across full canvas width via depthToX(depth/maxDepth), edges
render between every parent→child pair.
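
One way the bareName→liveJobId resolver could look, as a sketch under assumptions. The prefix and suffix rules below are inferred only from the three id formats quoted above ('bp-cnpg', 'install-cnpg', 'install-cnpg::r1'); the function names are hypothetical and the actual useJobHints logic may differ.

```typescript
// Strips an optional "::region" suffix and a "bp-" or "install-" prefix.
function bareName(jobId: string): string {
  const regionSep = jobId.indexOf("::");
  const base = regionSep === -1 ? jobId : jobId.slice(0, regionSep);
  return base.replace(/^(bp-|install-)/, "");
}

// Builds a lookup from bare component name to the id the backend reported.
function buildJobResolver(
  liveJobIds: string[],
): (name: string) => string | undefined {
  const byBareName: Record<string, string> = {};
  for (const id of liveJobIds) {
    const key = bareName(id);
    // First id wins if several live ids share one bare name.
    if (!(key in byBareName)) byBareName[key] = id;
  }
  return (name) => byBareName[bareName(name)];
}
```

The canonical dependency graph can then name components by bare name ("cnpg", "cilium") and be resolved against whichever id format the live jobs carry.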
Co-authored-by: hatiyildiz <hatice.yildiz@openova.io>
In production, handler.New() never assigns h.coreFactory, so phase1_watch
left cfg.CoreFactory == nil. helmwatch.NewWatcher had no default for
CoreFactory (DynamicFactory had one) → the helm-controller log tailer was
never launched → every PhaseComponentLog event was silently dropped.
Result on the live otech cluster: the bridge fix in #307 worked
correctly for state transitions, but the GitLab-style log viewer only
ever saw the synthetic [seeded] / [<state>] anchor lines because the
upstream emission path of raw helm-controller stdout was disconnected.
Fix:
- helmwatch.NewWatcher defaults CoreFactory to
NewKubernetesClientFromKubeconfig (mirroring the existing
DynamicFactory default).
- New regression test TestNewWatcher_DefaultsBothFactories asserts
both factories are non-nil after construction.
Co-authored-by: hatice yildiz <hatice.yildiz@openova.io>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
End-to-end fix for the JobDetail log viewer. Three stacked bugs surfaced
by https://console.openova.io/sovereign/provision/ce476aaf80731a46/jobs/install-seaweedfs:
A. Frontend constructed `${jobId}:latest` and sent it to
/api/v1/actions/executions/{id}/logs. The catalyst-api resolves
execId by exact match against 16-byte hex IDs — there is no
`:latest` route, so every log fetch returned 404 and the viewer
rendered "Failed to load log page" / "No logs captured for this log".
B. SeedJobsFromInformerList wrote a Job row with status=running for
non-terminal HR states (installing/degraded) but skipped
StartExecution AND set b.lastState[comp]=state. Subsequent
OnHelmReleaseEvent calls with the same state took the prev==state
early-return and never allocated an Execution. 7 jobs on the live
otech cluster were stuck this way.
C. OnProvisionerEvent filtered ev.Phase != "component" and dropped
every PhaseComponentLog event the helmwatch logtailer emits. Raw
helm-controller stdout (one line per reconcile/error/event) never
reached the persisted Execution log file — the GitLab-style viewer
only ever rendered synthetic [seeded] / [<state>] summary lines.
Fixes:
- helmwatch_bridge.go::SeedJobsFromInformerList now allocates an
Execution + writes a [seeded] anchor line for installing/degraded
states. The Execution is left OPEN so OnHelmReleaseEvent and
OnRawComponentLog can keep appending until the HR transitions to a
terminal state.
- helmwatch_bridge.go::OnProvisionerEvent dispatches on Phase:
"component" → OnHelmReleaseEvent (state transitions);
"component-log" → new OnRawComponentLog (raw helm-controller line
appended verbatim to the active Execution). Resolution policy on a
missing in-memory cursor: re-attach to the persisted
LatestExecutionID for non-terminal Jobs; allocate fresh for unknown
Jobs; drop for terminal Jobs (post-install drift-check chatter).
- ui/src/pages/sovereign/useJobDetail.ts (new) — React Query hook
fetches /api/v1/deployments/{id}/jobs/{jobId} and exposes
executions[0].id as the latestExecutionId. 5s poll while the
deployment is in flight.
- ui/src/pages/sovereign/JobDetail.tsx — replaces the synthetic
`${jobId}:latest` with detail.latestExecutionId. When executions[]
is empty, renders ExecutionLogsPlaceholder with status-aware copy
(pending / loading / empty / error) instead of an empty log viewer.
Tests:
- 4 new Go tests on the bridge: raw-log appendsToActiveExecution,
allocatesExecutionWhenJobMissing, dropsAfterTerminal, and
dropsUnknownPhases. Existing seed-idempotency tests updated for
the new "non-terminal seed allocates Execution" contract.
- 2 new vitest cases on JobDetail: uses real executions[0].id (NOT
`${jobId}:latest`) when fetching log lines; renders placeholder
(not viewer) when executions[] is empty.
- All 502 vitest cases pass; all api Go tests pass; production UI
build clean.
Closes via UAT on https://console.openova.io/sovereign/provision/ce476aaf80731a46/jobs/install-seaweedfs
Refs #204, supersedes the cosmetic #232 surface.
Co-authored-by: hatice yildiz <hatice.yildiz@openova.io>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Replaces the stage-column / Sugiyama grid that all prior Flow PRs
inherited (#245, #282, #299, #303, #304). The grid was the actual
cause of the "8x5 squashed in middle 1/3" bug operators kept rejecting
— bubbles spawned in column-grid positions and physics could only
nudge them slightly off the grid.
Per operator spec (2026-04-30):
• Bubbles spread organically across full canvas width.
• X-axis = dependency depth (longest-path-from-root); depth 0 left,
deepest right; 6%-94% of viewport.
• Y-axis = region midpoint + per-node deterministic vertical jitter,
so same-depth siblings scatter naturally — NOT a strict column.
• Edges are bezier curves with status-colored arrowheads, drawn
each tick from live simulation positions.
• NO "STAGE 1/2/..." labels. NO column dividers. NO grid.
• Bubbles draggable (d3-drag); collision avoidance via d3-force.
• Batch view: single-click → BatchSummaryPane (start, finish OR ETA,
duration, succeeded/running/pending/failed counts).
• Batch view: double-click drills via ?scope=batch:<id>&view=jobs
(siblings stay rendered at parent level via the URL scope).
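
The X-axis rule in the spec above (depth = longest path from a root, mapped to 6%-94% of the viewport) can be sketched like this. The input shape, the cycle guard, and the `depthToX` signature are assumptions for the example and are not taken from flowLayoutOrganic.ts.

```typescript
// deps maps a job id to the ids it depends on (its parents).
function computeDepths(deps: Record<string, string[]>): Record<string, number> {
  const depth: Record<string, number> = {};
  const visiting: Record<string, boolean> = {};
  const visit = (id: string): number => {
    if (id in depth) return depth[id];
    if (visiting[id]) return 0; // cycle guard: treat the back-edge as a root
    visiting[id] = true;
    let longest = 0;
    for (const parent of deps[id] || []) {
      longest = Math.max(longest, visit(parent) + 1);
    }
    visiting[id] = false;
    depth[id] = longest;
    return longest;
  };
  Object.keys(deps).forEach((id) => visit(id));
  return depth;
}

// Depth 0 sits at 6% of the width, the deepest node at 94%.
function depthToX(depth: number, maxDepth: number, width: number): number {
  const t = maxDepth === 0 ? 0 : depth / maxDepth;
  return width * (0.06 + t * 0.88);
}
```

Y positions are left to the region midpoint plus jitter and the force simulation, so only X is derived from the dependency structure here.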
New files:
• src/lib/flowLayoutOrganic.ts — pure data prep (depth, region,
family, edges); NO precomputed positions.
• src/pages/sovereign/FlowCanvasOrganic.tsx — full SVG renderer
with d3-force seed + drag.
• src/pages/sovereign/BatchSummaryPane.tsx — right floating pane
for batch-mode single-click.
Updated:
• FlowPage.tsx — switches imports + renderer; routes batch dbl-click
via ?scope= URL; routes single-click pane by mode.
Old flowLayoutV4.ts + FlowCanvasV4.tsx are kept on disk for now (only
DEFAULT_FAMILIES is still imported); a follow-up PR will delete them.
Per docs/INVIOLABLE-PRINCIPLES.md:
§1 (waterfall) — full target-state organic layout in this PR.
§2 (no compromise) — replace the wrong layout, not patch it.
§8 (disclose divergence) — flowLayoutV4.ts intentionally retained
for the DEFAULT_FAMILIES export only; cleanup follow-up.
Co-authored-by: hatiyildiz <hatice.yildiz@openova.io>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>