Two operator-reported bugs:
1. Cloud sub-pages still escaped chroot. PR #998 closed Sidebar/JobsTable/
FlowPage but missed CloudPage (4 navigate sites), CloudListView (2),
UserAccessEditPage (2). Apply the same DETECTED_MODE-aware target
construction so /provision/<id>/cloud paths stay scoped under the
chroot on the mother monitoring view.
2. WizardPage auto-redirected signed-in operators with an inflight
deployment to /provision/<id>/dashboard, blocking the legitimate
case of starting a SECOND provision while the first is still in
flight (founder: 'maybe I'll provision one more').
Replace the auto-redirect with an inline banner at the top of the
wizard pointing at the inflight monitor. The wizard stays
interactive — operator can step through and Launch a second
deployment if they want, OR click 'Open monitor →' to resume the
first one.
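A minimal sketch of the banner decision, assuming a pure helper that the wizard could call on render. The names `InflightBanner` and `inflightBannerFor` and the message text are hypothetical; only the 'Open monitor →' label and the /provision/<id>/dashboard target come from this commit.

```typescript
// Hypothetical banner model; the real WizardPage component is not shown here.
interface InflightBanner {
  message: string;
  actionLabel: string;
  actionHref: string;
}

// Returns a banner when a deployment is in flight, otherwise null.
// It never navigates, so the wizard stays interactive either way.
function inflightBannerFor(inflightDeploymentId: string | null): InflightBanner | null {
  if (!inflightDeploymentId) return null;
  return {
    message: "A deployment is still in flight.", // illustrative wording
    actionLabel: "Open monitor →",
    actionHref: `/provision/${inflightDeploymentId}/dashboard`,
  };
}
```

The point of the design is that the helper only describes what to render; the operator decides whether to follow the link or continue to Launch.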
Co-authored-by: hatiyildiz <hatice.yildiz@openova.io>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
While the operator monitors an in-flight Sovereign from the mothership
wizard surface (`console.openova.io/sovereign/provision/$deploymentId/...`),
every internal link MUST stay scoped under that prefix. Today, three
places escape the chroot to clean root paths intended for the
Sovereign's adult hostname:
1. Sidebar.tsx (mother-monitor sidebar): FLAT_NAV[*].to and SETTINGS_ITEM.to
were hardcoded to clean roots like '/jobs', '/cloud' — clicking a nav
item bounced the operator out of /provision/<id>/* to /sovereign/jobs
(which is either a Sovereign-Console route on contabo's mothership view
= 404, or the Sovereign-on-clean-root route on the adult view = wrong context).
Restore the canonical /provision/$deploymentId/<page> TanStack template;
the params={{ deploymentId }} prop already feeds the substitution.
2. JobsTable.tsx (job row + parent-chip Links): `to=`/jobs/$jobId`` is
valid on the Sovereign adult surface but escapes the chroot on the
mother monitor view. Add a useJobLinkBuilder hook that returns
/provision/<id>/jobs/<jobId> on Catalyst-Zero hostnames and
/jobs/<jobId> on Sovereign hostnames.
3. FlowPage.tsx (canvas leaf-job click navigate): same chroot escape.
Same mode-aware target construction.
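The mode-aware target construction can be sketched as a pure function. This is an illustration under assumptions, not the shipped hook: `buildJobLink` is a hypothetical name, and the real useJobLinkBuilder presumably reads the mode from DETECTED_MODE rather than taking it as an argument.

```typescript
type Mode = "catalyst-zero" | "sovereign";

// Builds the job detail path for the current surface.
// catalyst-zero (mother monitor view): stay under /provision/<id>/.
// sovereign (adult hostname): clean root.
function buildJobLink(mode: Mode, deploymentId: string, jobId: string): string {
  return mode === "catalyst-zero"
    ? `/provision/${deploymentId}/jobs/${jobId}`
    : `/jobs/${jobId}`;
}
```

Usage: `buildJobLink("catalyst-zero", id, jobId)` keeps a row click inside the chroot, while the same call with `"sovereign"` yields the clean /jobs/<jobId> path.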
The chroot rule (founder framing): the operator CANNOT distinguish
'I'm monitoring my child being born under /provision/<id>/' from
'I'm at home on the adult Sovereign console' visually — every page,
sidebar, link, and chip must look identical (#983 pixel-byte-byte
contract). This commit closes the navigation half of that contract
on the mother side; PR #983 already covered the data-fetch half.
Closes the bug surfaced live on otech118 mid-provision: clicking Jobs
in the sidebar from /sovereign/provision/571a382deb47e50a/dashboard
sent the operator to /sovereign/jobs (404 / wrong scope), and a row
click sent them to /sovereign/jobs/571a382...:install-valkey instead
of /sovereign/provision/<id>/jobs/<id>:install-valkey.
Co-authored-by: hatiyildiz <hatice.yildiz@openova.io>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The reverts of #984/#987/#989 brought back three legacy /console/dashboard
redirects that PR #983 had originally cleaned up (items 1-3), plus the
matching test fixture (item 4):
1. auth_handover.go:253 — default redirectTarget on the Sovereign-side
/auth/handover handler.
2. router.tsx:109 — index route's Sovereign-mode redirect.
3. router.tsx:163 — /auth/handover client-side safety-net redirect.
4. auth_handover_test.go fixture — keeps the test in sync.
Closes the loop on PR #983's URL contract.
Co-authored-by: hatiyildiz <hatice.yildiz@openova.io>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two places where the wizard navigates after detecting a deployment id:
- WizardPage.tsx:96 — operator opens /sovereign/wizard but already has an
inflight deployment → redirect to that deployment's monitor view.
- StepReview.tsx:792 — operator clicks Launch on the final review step →
POST /api/v1/deployments returns the new id, then redirect to its
monitor view.
Both targets MUST be the per-deployment mothership monitor URL
`/provision/$deploymentId/dashboard`, not the clean Sovereign root
`/dashboard`. PR #983's mass-replace of `/console/$deploymentId/X` →
`/X` accidentally caught these lines too — but Catalyst-Zero (the
mothership wizard) doesn't have a clean `/dashboard` root; it has the
mode-aware /provision/<id>/dashboard surface. The bug surfaces as:
/sovereign/wizard → /sovereign/dashboard (TanStack basepath)
→ SovereignConsoleLayout (mounted on /dashboard)
→ no sovereignFQDN (we're on console.openova.io, not console.<sov-fqdn>)
→ infinite "Authenticating…" spinner
Confirmed live on contabo:8a1fe04 and :019309f. Fixes the wizard ↔
authenticating-loop the founder hit when going to provision otech118.
Co-authored-by: hatiyildiz <hatice.yildiz@openova.io>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Lands the clean post-revert image on Sovereigns:
- :019309f is the catalyst-build output for commit 019309f9 (the revert
merge of #984/#987/#989), which carries PR #983's URL contract fix
WITHOUT the broken / → /nova/ redirect chain.
- Chart version bumped 1.4.29 → 1.4.30 to invalidate Flux source-controller's
OCI tag cache (otherwise Sovereigns stay on the first 1.4.29 digest they
pulled — verified live on otech117).
- Chart template literal bumped because PR #980 stops CI from auto-bumping
it; this commit IS the operator-approved manual bump.
Contabo stays on :8a1fe04 (manifest at clusters/contabo-mkt unaffected by
the chart literal change since contabo's Kustomize path reads its own copy
of the deployment manifests). When the operator validates :019309f on
Sovereigns, contabo can be re-pinned in a follow-up.
Co-authored-by: hatiyildiz <hatice.yildiz@openova.io>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
console.openova.io is currently 307'ing / → /nova/ instead of rendering
the wizard. Founder identified :8a1fe04 as the last stable image before
today's auth-loop / mothership-redirect chain (#984/#987/#989).
Revert chain summary:
- :8a83416 (#984): mothership / redirect landed on /nova marketplace
- :e221b48 (#987): tried to fix #984; exposed the wizard redirect loop
- :0daaac5 (#989): tried to break #987's loop — / still 307s to /nova
on live contabo
This pin restores the operator-facing wizard flow on console.openova.io.
Sovereigns are unaffected (otech117 is on :8a83416 via Helm, gated by
chart 1.4.29 OCI cache and not re-pulling per the source-controller
version-key cache behavior).
Forward path: investigate the / → /nova/ redirect introduced in the
#984/#987/#989 chain (likely an index-route or beforeLoad redirect in
router.tsx that fires on Catalyst-Zero mode), fix at root, ship as a
new image SHA, then re-pin contabo deliberately.
Co-authored-by: hatiyildiz <hatice.yildiz@openova.io>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
WizardPage and StepReview both call navigate({to:'/dashboard',
params:{deploymentId}}) when an inflight deployment is detected. On
the mothership the bare /dashboard matches the Sovereign-Console
clean-root route which renders SovereignConsoleLayout — that layout's
mothership-fall-through guard (added in #987) redirects back to
/sovereign/, indexRoute redirects to /wizard, and WizardPage sees
inflight again and re-fires the navigate, looping forever between
/sovereign/, /sovereign/wizard, /sovereign/dashboard.
Fix: distinguish DETECTED_MODE.mode in both call sites:
- 'sovereign' (per-Sovereign self-mode SPA): /dashboard (clean root)
- 'catalyst-zero' (mothership): /provision/$deploymentId/dashboard
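A toy model of the redirect chain, to show why the old target loops and the mode-aware one terminates. Paths are relative to the /sovereign base path; `follow` and `wizardTarget` are hypothetical helpers, and the `fixed` flag only exists to compare the two behaviours.

```typescript
type Mode = "catalyst-zero" | "sovereign";

// Where the wizard sends an operator who has an inflight deployment.
function wizardTarget(mode: Mode, deploymentId: string, fixed: boolean): string {
  if (fixed && mode === "catalyst-zero") return `/provision/${deploymentId}/dashboard`;
  return "/dashboard";
}

// Follows the mothership redirects described above and returns every
// path visited, stopping at maxHops so the looping case still returns.
function follow(start: string, deploymentId: string, fixed: boolean, maxHops = 10): string[] {
  const visited = [start];
  let path = start;
  for (let hop = 0; hop < maxHops; hop++) {
    let next: string | null = null;
    if (path === "/") next = "/wizard"; // indexRoute redirect
    else if (path === "/wizard") next = wizardTarget("catalyst-zero", deploymentId, fixed);
    else if (path === "/dashboard") next = "/"; // mothership fall-through guard
    if (next === null) break; // a real page renders here
    visited.push(next);
    path = next;
  }
  return visited;
}
```

With `fixed = false` the walk only ends when it hits the hop limit; with `fixed = true` it stops after one hop at the per-deployment monitor.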
This is the third lap of #976's clean-URL cleanup catching mothership
flows that weren't migrated to the parameterised routes.
Co-authored-by: hatiyildiz <hatiyildiz@users.noreply.github.com>
The previous fix redirected SovereignConsoleLayout's mothership-fall-
through to bare '/', which the contabo nginx 302s to '/nova/' (the SME
marketplace). That yanked the operator out of the
sovereign-provisioning flow entirely — observed live: clicking any
clean-root Sovereign-Console route on console.openova.io ended up on
marketplace.openova.io/checkout.
The right landing on the mothership is '/sovereign/' — the Vite base
path the catalyst-ui SPA is mounted at, which serves the wizard /
provisioning surface.
Co-authored-by: hatiyildiz <hatiyildiz@users.noreply.github.com>
Bumps the chart version + the per-Sovereign HelmRelease pin in
clusters/_template/bootstrap-kit/13-bp-catalyst-platform.yaml so all
Sovereigns reconciling against the template (otech117 et al.) pick up
PR #983's fixes:
- /dashboard /apps /jobs /cloud … render at clean roots; no /console/
prefix and no /provision/<id>/ prefix on Sovereign mode.
- sovereign_self.go store fallback — data flows on clean URLs the
moment fireHandover POSTs the deployment record to /api/v1/internal/
deployments/import; no waiting for a chart-values overlay roundtrip.
- Sidebar links land on clean roots — no more /provision//cloud.
- Auth handover redirect target → /dashboard (was /console/dashboard).
Co-authored-by: hatiyildiz <hatice.yildiz@openova.io>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(sovereign-console): land URL contract on Sovereign — clean roots, real data, working sidebar
Three operator-visible bugs on console.<sov-fqdn> after the PR #976/#977
clean-URL split landed:
1. **Login redirected to /provision/<id> instead of /dashboard.**
auth_handover.go's redirect default still pointed at the legacy
/console/dashboard path. The router's /auth/handover safety-net
redirect, the index-route mode-aware redirect, and AuthCallbackPage
all still navigated to /console/dashboard too. None of those routes
exist on the Sovereign router any more (PR #972 deleted ConsolePage*),
so the browser fell back to the closest matching prefix
/provision/$deploymentId/...
2. **Sidebar Cloud → /provision//cloud (empty deploymentId).**
SovereignSidebar.tsx's FLAT_NAV / SETTINGS_ITEM / SETTINGS_SUB_NAV
all still pointed at /console/X paths that don't resolve. The
browser fell through to the wizard sidebar's /provision/$id/cloud
route, but with deploymentId resolved to '' (we're on Sovereign
mode, no URL param), producing /provision//cloud.
3. **Clean roots showed no data; data only at /provision/<id>/...**
The /api/v1/sovereign/self endpoint returned 503
deployment-id-not-yet-stamped because CATALYST_SELF_DEPLOYMENT_ID
env was empty (orchestrator hasn't yet shipped the values-overlay
write that stamps it via the chart). useResolvedDeploymentId
resolved null, every page that depends on it (Dashboard, Jobs,
Cloud, etc.) had no id to fetch with.
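Symptom 2 is what plain template substitution produces when the id is empty. The sketch below is a stand-in for illustration only; `fillTemplate` is hypothetical and is not TanStack Router's actual interpolation.

```typescript
// Replaces each $name segment with the matching param, or '' when missing.
function fillTemplate(template: string, params: Record<string, string>): string {
  return template.replace(/\$([A-Za-z]+)/g, (_match: string, name: string) => params[name] ?? "");
}

// On Sovereign mode there is no URL param, so deploymentId resolves to ''.
const broken = fillTemplate("/provision/$deploymentId/cloud", { deploymentId: "" });
```

Here `broken` carries the doubled slash the operator saw, which is why the fix moves Sovereign-mode links to parameterless clean roots instead of patching the id.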
Fixes:
- auth_handover.go + handler.go + auth_handover_test.go: redirect
default /dashboard.
- router.tsx + AuthCallbackPage.tsx: index + handover safety-net +
callback all redirect to /dashboard.
- SovereignSidebar.tsx: FLAT_NAV / SETTINGS / SETTINGS_SUB_NAV use
clean roots; deriveActiveSection regexes match clean roots.
- SovereignConsoleLayout.tsx: Settings dropdown nav target /settings.
- cloudListShared.tsx + CloudNetworkPage.tsx + CloudStoragePage.tsx:
Links use mode-aware path (sovereignPath helper for the back-link;
inline DETECTED_MODE branch for the deeper sub-route tile links).
- sovereign_self.go: store-fallback resolution — when env is empty
but the local store holds a deployment record whose SovereignFQDN
matches CATALYST_OTECH_FQDN, return that record's id. The cutover
import endpoint enforces FQDN match before persisting, so a single
matching record is unambiguously this Sovereign's. This makes data
flow on clean URLs the moment fireHandover's POST /import lands,
without waiting for a chart-values overlay write + Flux reconcile.
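The store-fallback rule can be modelled as below. The real logic lives in sovereign_self.go (Go); this TypeScript sketch only mirrors the described order, and the record shape plus the "ambiguous means unresolved" branch are assumptions.

```typescript
// Assumed minimal record shape; the real store holds much more.
interface DeploymentRecord {
  id: string;
  sovereignFQDN: string;
}

// envId models CATALYST_SELF_DEPLOYMENT_ID, ownFqdn models CATALYST_OTECH_FQDN.
function resolveSelfDeploymentId(
  envId: string,
  ownFqdn: string,
  store: DeploymentRecord[],
): string | null {
  if (envId !== "") return envId; // stamped env wins
  const matches = store.filter((r) => r.sovereignFQDN === ownFqdn);
  // Assumption: anything other than exactly one match stays unresolved.
  const match = matches.length === 1 ? matches[0] : undefined;
  return match ? match.id : null;
}
```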
Closes the user-reported "actual data is still staying in the cilder
of the mother concept under provisioning urls" + "clicking on cloud
goes to /provision//cloud" symptoms on otech117.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(catalyst-ui): SovereignConsoleLayout redirects to / on mothership instead of looping on "Authenticating…" (#975)
When the operator hits a clean-root Sovereign-Console route (/dashboard,
/apps, etc.) on the mothership (console.openova.io), DETECTED_MODE
returns sovereignFQDN=null — those routes exist for the per-Sovereign
self-mode SPA mounted at console.<sov-fqdn>, not for catalyst-zero.
Without an FQDN there is no Keycloak realm to OIDC against, so initAuth
would set authState='unauthenticated' and the layout's loading branch
rendered the spinner with "Authenticating…" caption forever — the
hang the founder hit immediately after #976 + #975 deploys when
clicking any dashboard/apps/cloud link on the mothership.
Redirect to / instead so the operator lands on the wizard /
deployments list, which is the right surface for catalyst-zero.
---------
Co-authored-by: hatiyildiz <hatice.yildiz@openova.io>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-authored-by: hatiyildiz <hatiyildiz@users.noreply.github.com>
Restores forward roll of the catalyst-{api,ui} Kustomize-path image
refs after the hotfix landed:
- 3b88dfa hotfix(catalyst-api): drop k8scache discovery probe
- b4fb6cf fix(catalyst-ui): drop stale params={{ deploymentId }}
Per #980, contabo Kustomize-path image refs are managed manually
(catalyst-build only auto-bumps values.yaml). This commit is the
manual forward-roll.
Co-authored-by: hatiyildiz <hatiyildiz@users.noreply.github.com>
Bug: contabo mothership stuck during catalyst-api boot, "iterating dead
clusters". Root cause is a regression introduced by the k8scache PR:
AddCluster gained a synchronous `core.Discovery().ServerResourcesForGroupVersion(gv)`
call to gate Optional kinds (metrics.k8s.io/PodMetrics) — that call
issues a REST GET against the cluster's apiserver with NO context
timeout. On a kubeconfig pointing at a dead machine (a decommissioned
otech whose <id>.yaml was never removed) the call hangs until the
underlying TCP connect times out (often minutes). With many dead
kubeconfigs in /var/lib/catalyst/kubeconfigs the boot path serially
blocks for tens of minutes.
Fix:
- Drop the discovery probe block entirely. AddCluster is again
synchronous-network-free; informers spawn unconditionally and
reflectors handle missing GVRs (404 from the apiserver) with their
own backoff retry loop in goroutines that don't block startup.
- Drop PodMetrics from DefaultKinds. With the probe gone, an
always-registered PodMetrics informer would log retry warnings
forever on every Sovereign without metrics-server. Until a non-
blocking activation path lands, the dashboard's color_by=utilization
returns null when no PodMetrics indexer exists; health/age/size
paths still ride the Pod + PVC indexers untouched.
- Drop Kind.Optional field, the two probe-specific tests, and the
fakediscovery import. Update TestDefaultKinds_GraphAndDashboardSurface
to assert PodMetrics is *absent* from the defaults.
- Update dashboard_test.go's local Optional kind registration accordingly.
Co-authored-by: hatiyildiz <hatiyildiz@users.noreply.github.com>
* fix(catalyst-ui): drop stale params={{ deploymentId }} from clean-root Links (#975)
#976 collapsed `to="/provision/$deploymentId/<page>"` to clean root
paths (`to="/<page>"`) but left the `params={{ deploymentId }}` prop
on every callsite, breaking the Vite tsc build with TS2353. Fixes:
- Drop `params={{ deploymentId }}` from Links whose target is now a
parameterless clean root path (StatusStrip, AppDetail, AppsPage,
DecommissionPage, FlowPage, JobDetail, JobsPage, JobsTimeline,
SettingsPage, DeploymentsList).
- For Links whose `to` still uses `$componentId`/`$jobId`, cast
`params` with `as never` to match the existing pattern in
cloud-compute/cloud-network/cloud-storage/Sidebar/UserAccess
(the dual-mount under provisionRoute + consoleLayoutRoute defeats
TS's strict params inference; the runtime path is correct).
- Drop `deploymentId` prop + interface field from JobCard / JobRow /
JobsTable / AppCard now that the Links don't need it; update test
fixtures + the JobsTable row-link assertion to match the new
clean `/jobs/$jobId` href.
- Drop the unused ArchEdgeType import in k8sAdapter (TS6196).
- Dashboard navigateToApp uses `as never` casts to align with the
same pattern.
* fix(catalyst-build): stop auto-bumping contabo Kustomize-path image refs
Two paths consume the catalyst-api / catalyst-ui images:
1. bp-catalyst-platform OCI chart (Sovereigns) — values.yaml driven, tag
in values.yaml is rendered at helm install time by Sovereign Flux.
2. contabo Kustomize-path — literal image refs in templates/api-deployment.yaml
and templates/ui-deployment.yaml. Flux kustomize-controller on contabo
reconciles those files directly.
The CI deploy step was bumping BOTH on every PR, which auto-rolled
contabo every time anyone merged a catalyst-api code change. On
2026-05-05 PR #975's k8scache feature broke contabo startup on the
auto-roll because contabo has 27 dead-Sovereign kubeconfigs that the
new code iterates synchronously at startup, blocking readiness.
Fix: keep the values.yaml bump (Sovereigns auto-pick-up via OCI chart
which is the right behaviour for fresh provisions). Drop the
templates/*-deployment.yaml bump so contabo only rolls when an
operator manually commits a validated SHA into those files.
Closes the auto-deploy-to-contabo blast radius on every PR.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: hatiyildiz <hatiyildiz@users.noreply.github.com>
Co-authored-by: hatiyildiz <hatice.yildiz@openova.io>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* feat(catalyst-ui): cloud-graph K8s projection + dashboard squarer tiles (#975)
Architecture graph (cloud?view=graph) — surface live K8s workloads:
- New widgets/architecture-graph/k8sAdapter.ts emits Pod / Deployment /
StatefulSet / DaemonSet / Service / Ingress / Namespace / ConfigMap /
PVC / Node graph nodes from a normalized K8s snapshot.
- Edge inference: Pod→WorkerNode runs-on (.spec.nodeName), Pod→
Namespace member-of, Pod→Workload via ownerRef chain (collapsing the
ReplicaSet hop to attribute Pods directly to their parent Deployment),
Service→Pod routes-to (EndpointSlice when present, label-selector
fallback otherwise), Ingress→Service flows-to, Pod→PVC attached-to,
PVC→Volume.hcloud realizes via PV csi.volumeAttributes.
- mergeGraphs unions cloud-side and K8s-side adapter outputs and
collapses the WorkerNode↔Node bridge by id; K8s status wins for
liveness, cloud-side metadata for SKU.
- New widgets/architecture-graph/useK8sCacheStream.ts subscribes to
/api/v1/sovereigns/{id}/k8s/stream?initialState=1 via EventSource,
applies ADDED/MODIFIED/DELETED deltas to an in-memory Map snapshot,
bumps a revision counter so the adapter recomputes only when
events arrive. jsdom guard so component tests render without SSE.
- ArchitectureGraphPage wires both adapters; Pod/ConfigMap chips are
default-off (DEFAULT_INACTIVE_TYPES) so the canvas isn't crowded
before the operator opts in. New TUNABLE_TYPES include the K8s
high-cardinality kinds.
- 13 new unit tests cover ownerRef chain, EndpointSlice+selector
fallback, Ingress backend resolution, Pod→PVC, PVC→Volume.hcloud
bridge, WorkerNode↔Node merge, edge dangling-endpoint filtering.
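A sketch of the delta-application step in useK8sCacheStream, under assumptions: the event shape and the `Snapshot` class are hypothetical, and the EventSource wiring is left out so the logic can run without a browser.

```typescript
type WatchEventType = "ADDED" | "MODIFIED" | "DELETED";

// Assumed wire shape of one stream event.
interface WatchEvent {
  type: WatchEventType;
  kind: string;
  namespace: string;
  name: string;
  object?: unknown;
}

class Snapshot {
  readonly objects = new Map<string, unknown>();
  revision = 0;

  // Applies one delta and bumps the revision so consumers can
  // recompute only when something actually arrived.
  apply(ev: WatchEvent): void {
    const key = `${ev.kind}/${ev.namespace}/${ev.name}`;
    if (ev.type === "DELETED") {
      this.objects.delete(key);
    } else {
      this.objects.set(key, ev.object);
    }
    this.revision += 1;
  }
}
```

A React hook could hold one `Snapshot` in a ref and put `revision` in state, so the graph adapter's memo depends on a single number instead of the whole Map.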
Dashboard (/dashboard) — square tiles + null-utilization rendering:
- Recharts <Treemap aspectRatio={1}/> so cells render close to square
whenever the value distribution allows (founder feedback 2026-05-05).
- Cell renderers handle percentage===null: NULL_PERCENTAGE_FILL grey
fill, '— %' label, tooltip "metrics-server not installed" when
colorBy=utilization without metrics, "no data" otherwise.
- TreemapItem.percentage type is now number | null end-to-end.
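The null-utilization rules above, sketched as two pure helpers. The function names, the `hasMetrics` flag, and the rounding of non-null values are assumptions; the '— %' label and both tooltip strings are taken from this commit.

```typescript
type ColorBy = "utilization" | "health" | "age" | "size";

// Label shown inside a treemap cell.
function percentageLabel(percentage: number | null): string {
  // Assumption: non-null values render as a rounded integer.
  return percentage === null ? "— %" : `${Math.round(percentage)} %`;
}

// Tooltip for a cell whose percentage is null.
function nullTooltip(colorBy: ColorBy, hasMetrics: boolean): string {
  return colorBy === "utilization" && !hasMetrics
    ? "metrics-server not installed"
    : "no data";
}
```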
Companion to #976 backend (k8scache prep + dashboard.go rewrite).
* fix(catalyst-ui): rip out hardcoded /provision/$deploymentId from internal Link components
Sidebar + JobsTable + AppsPage + JobsPage + JobsTimeline + JobDetail +
Dashboard + AppDetail + DecommissionPage + DeploymentsList +
SettingsPage + StatusStrip + FlowPage all had hardcoded
`to="/provision/$deploymentId/<page>"` references that bound the
operator to the mother view URL forever — clicking any link from a
Sovereign self-mode page would jump to the (non-existent on Sovereign)
mother provision URL.
Mass-replaced with clean root paths `to="/<page>"` so internal
navigation on a Sovereign child stays on clean URLs (/dashboard,
/apps, /jobs, /cloud, /users, /settings).
Also deleted the now-unused SovereignConsoleRedirect.tsx
(superseded by direct route mounting in router.tsx).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: hatiyildiz <hatiyildiz@users.noreply.github.com>
Co-authored-by: hatiyildiz <hatice.yildiz@openova.io>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The data half of the mother→child contract that PR #976 set up the
URL routing for. At handover the mother POSTs the full deployment
record (events, jobs history, HRs, cloud topology, kubeconfig meta)
to the child's POST /api/v1/internal/deployments/import — the child
persists it locally so its /api/v1/deployments/{id}/* endpoints
answer with data byte-byte-identical to what the operator sees on the
mother view at /sovereign/provision/<id>/<page>.
Result: on the child cluster, clean URLs (/dashboard, /apps, /jobs,
/cloud) render with REAL data (events, exec logs, job statuses,
treemap utilisation) instead of empty arrays.
- New endpoint: POST /api/v1/internal/deployments/import (child)
Validates by FQDN match against CATALYST_OTECH_FQDN. Idempotent.
- Mother fireHandover() now posts the record to the child after the
JWT mint as a fire-and-forget goroutine. Failure logs loudly per
INVIOLABLE-PRINCIPLES #3 but does not block SSE emit.
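The two properties of the import endpoint (FQDN validation, idempotency) can be modelled with an in-memory store. The real handler is Go in catalyst-api; `ChildStore`, the record shape, and the boolean return are hypothetical stand-ins.

```typescript
// Assumed minimal record shape for the sketch.
interface DeploymentRecord {
  id: string;
  sovereignFQDN: string;
  events: string[];
}

class ChildStore {
  private readonly ownFqdn: string; // models CATALYST_OTECH_FQDN
  private readonly records = new Map<string, DeploymentRecord>();

  constructor(ownFqdn: string) {
    this.ownFqdn = ownFqdn;
  }

  // Rejects records for another Sovereign; keyed by id, so a repeated
  // import overwrites instead of duplicating.
  importRecord(record: DeploymentRecord): boolean {
    if (record.sovereignFQDN !== this.ownFqdn) return false;
    this.records.set(record.id, record);
    return true;
  }

  count(): number {
    return this.records.size;
  }
}
```

Keying on the deployment id is what lets the mother retry the POST safely after a failed fire-and-forget attempt.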
Bumped: bp-catalyst-platform 1.4.27 → 1.4.28.
Co-authored-by: hatiyildiz <hatice.yildiz@openova.io>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Build #25388329130 failed on PR #972's merge SHA `6ec7851` with two
TS6133 unused-symbol errors:
src/app/router.tsx(86,1): error TS6133: 'SovereignConsoleRedirect' is declared but its value is never read.
src/pages/sovereign/Dashboard.tsx(133,46): error TS6133: 'idLoading' is declared but its value is never read.
The SovereignConsoleRedirect helper became unused once the /console/*
routes were wired directly to the canonical components (Dashboard,
AppsPage, JobsPage, CloudPage, UserAccessListPage, SettingsPage) in
the same PR. The Dashboard's idLoading binding was a leftover from an
earlier draft that surfaced a loading pill.
Co-authored-by: hatiyildiz <hatice.yildiz@openova.io>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* feat(sovereign-console): kill duplicate /console/* pages, redirect to canonical /provision/$id/* (Iteration 1)
Founder-reported on otech116/117: the /console/dashboard, /console/apps,
/console/jobs, /console/cloud, /console/users, /console/settings pages
are STUBS that look completely different from the canonical Sovereign
Console operators see at console.openova.io/sovereign/provision/$id/*.
Investigation: 6 duplicate Console*Page React components were shipped in
PR #937 — separate stub implementations of pages that already exist as
the canonical Dashboard / AppsPage / JobsPage / CloudPage /
UserAccessListPage / SettingsPage components used by the
/provision/$deploymentId/* route tree (the same the wizard renders).
Fix (Iteration 1):
- DELETE the 6 duplicate Console*Page components.
- Replace the /console/* router routes with SovereignConsoleRedirect:
a tiny component that fetches /api/v1/sovereign/self for the
Sovereign's own deployment id, then router-navigates to the
canonical /provision/<self-id>/<page>. Same components, same data,
pixel-byte-byte-identical UI to the mothership view.
- Add catalyst-api endpoint GET /api/v1/sovereign/self that returns
the deployment id from CATALYST_SELF_DEPLOYMENT_ID env. Mothership
(env unset) → 404. Sovereign with stamped id → 200. Sovereign
pre-handover → 503 deployment-id-not-yet-stamped.
- Wire env via the existing sovereign-fqdn ConfigMap (B1 PR #912):
new key `selfDeploymentId`, sourced from
.Values.global.sovereignSelfDeploymentId. Empty until the
orchestrator's per-Sovereign overlay writer stamps it.
- Add useResolvedDeploymentId React hook (URL params first, then
/sovereign/self fallback) — wires Iteration 2 (clean URLs) below.
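The three responses of GET /api/v1/sovereign/self can be written as a small decision function. This is a model, not the Go handler: the commit does not say how the endpoint tells a mothership from a pre-handover Sovereign, so the `isSovereign` flag is an assumption, and the body field names are hypothetical.

```typescript
type SelfResponse =
  | { status: 200; deploymentId: string }
  | { status: 404 }
  | { status: 503; error: "deployment-id-not-yet-stamped" };

// selfDeploymentId models the CATALYST_SELF_DEPLOYMENT_ID env value.
function selfEndpoint(isSovereign: boolean, selfDeploymentId: string): SelfResponse {
  if (!isSovereign) return { status: 404 }; // mothership
  if (selfDeploymentId === "") {
    return { status: 503, error: "deployment-id-not-yet-stamped" }; // pre-handover
  }
  return { status: 200, deploymentId: selfDeploymentId };
}
```

Separating 404 from 503 lets the UI treat "not a Sovereign" as final and "not yet stamped" as retryable.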
Iteration 2 (next PR — out of scope here):
- Drop the /sovereign/provision/<id>/ URL prefix on Sovereign by
refactoring 6 canonical components to use useResolvedDeploymentId
instead of strict useParams. Then /console/dashboard renders the
canonical Dashboard at the clean URL with deployment id resolved
from /sovereign/self.
Iteration 3 (next PR after — also out of scope):
- Handover history transfer: contabo's catalyst-api at handover POSTs
the full deployment record (events, jobs, HRs, cloud topology) to
the Sovereign's catalyst-api so /provision/<id>/* on the Sovereign
answers with byte-byte-identical data.
Bumped: bp-catalyst-platform 1.4.26 → 1.4.27.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* feat(sovereign-console): clean URLs — /console/* mounts canonical components directly
Removes the SovereignConsoleRedirect indirection. The canonical
operator components (Dashboard, AppsPage, JobsPage, JobDetail,
CloudPage, AppDetail, UserAccessListPage, UserAccessEditPage,
SettingsPage) now render at clean /console/<page> URLs on Sovereign,
NOT under /sovereign/provision/<id>/<page>.
Pages that previously hard-coupled to the URL via
useParams({ from: '/provision/$deploymentId/...' })
now use useResolvedDeploymentId() which:
1. reads URL params (when on the legacy /provision/$id/* tree on
contabo's mothership wizard)
2. falls back to GET /api/v1/sovereign/self (Sovereign self-discovery)
Refactored: Dashboard, AppsPage, JobsPage, SettingsPage, UserAccessListPage.
CloudPage already used strict:false — no change needed.
Wires the /console/* router subtree to the canonical components +
adds the missing children routes (/jobs/$jobId, /users/new,
/users/$name, /app/$componentId) so the canonical UI's deep-links
work on the clean URL surface too.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: hatiyildiz <hatice.yildiz@openova.io>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
## Root cause (live on otech115 2026-05-05 14:15)
After PR #959 (0.1.18) unblocked the auto-trigger to actually call
/internal/cutover/trigger, the cutover engine fired Step-01 within ~8s
of bp-self-sovereign-cutover Helm-install completing. The gitea Pod
had only just reached Ready state — cluster-DNS endpoint publication
for the headless service `gitea-http` was still in flight. One wget
returned `bad address gitea-http.gitea.svc.cluster.local` and exited
non-zero. Catalyst-api's cutover engine stamped Jobs with backoffLimit=0
(cutover.go:584), so a single DNS miss was terminal and aborted all 8
cutover steps. otech115 finished provisioning with cutoverComplete=false
and stayed tethered to upstream github.com/ghcr.io.
## Fix (dual-layer)
**Layer A — catalyst-api (cutover.go)**: backoffLimit lifted from 0 to 3.
A single transient miss is recoverable (4 attempts over each step's
activeDeadlineSeconds) without burning operator-attention. Hard failures
still surface within budget.
**Layer B — chart Step-01 (01-gitea-mirror-job.yaml)**: explicit
nslookup readiness probe at the top of the bash script, before any
wget call. 30 attempts × 5s = 150s budget; alpine/git ships nslookup
in /usr/bin (verified live on otech115). Layer B is faster than Layer A
(in-script DNS retry vs Pod recreate); Layer A is the safety net for
any other transient pre-cluster-stable race we haven't yet enumerated.
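The two retry budgets can be checked with a small model. The real Layer B is a bash loop around nslookup; this sketch replaces the probe with an injected callback and counts seconds instead of sleeping, and both function names are hypothetical.

```typescript
interface ProbeResult {
  ready: boolean;
  attempts: number;
  waitedSeconds: number;
}

// Layer B model: probe up to maxAttempts times, waiting intervalSeconds
// after each miss (the real script would sleep at that point).
function waitForDns(
  probe: (attempt: number) => boolean,
  maxAttempts = 30,
  intervalSeconds = 5,
): ProbeResult {
  let waited = 0;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    if (probe(attempt)) return { ready: true, attempts: attempt, waitedSeconds: waited };
    waited += intervalSeconds;
  }
  return { ready: false, attempts: maxAttempts, waitedSeconds: waited };
}

// Layer A model: a Job's first run plus backoffLimit retries.
function totalJobAttempts(backoffLimit: number): number {
  return backoffLimit + 1;
}
```

A probe that never succeeds exhausts the full 30 × 5s budget, while backoffLimit=3 gives the 4 attempts quoted above.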
## Acceptance gate
Test case 15 added to platform/self-sovereign-cutover/chart/tests/
cutover-contract.sh — guards against future regressions that drop
either the gitea_host extraction or the nslookup loop.
## Live verification
Will fire on the next provision (otech116). Expected:
- Step-01 logs `[gitea-mirror] DNS ready for gitea-http.gitea.svc.cluster.local (attempt N)`
- All 8 cutover Jobs reach Complete
- self-sovereign-cutover-status ConfigMap reaches cutoverComplete=true
Co-authored-by: e3mrah <ebaysal@gmail.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Closes #921: bp-cluster-autoscaler-hcloud chart shipped without
HCLOUD_CLUSTER_CONFIG / HCLOUD_CLOUD_INIT, so cluster-autoscaler 1.32.x
FATALs at startup with "HCLOUD_CLUSTER_CONFIG or HCLOUD_CLOUD_INIT is
not specified" on every Sovereign (otech112 evidence). HelmRelease
reports Ready=True (Helm install succeeded) but the Pod
CrashLoopBackOffs invisibly behind the false-positive Ready condition.
Closes #916: wizard let operators dispatch unbuildable topologies
(otech109: cpx32 worker in `ash`) because PROVIDER_NODE_SIZES did not
encode regional orderability. Hetzner rejected the worker creation 41s
into `tofu apply` after Phase-0 had already created the CP + network +
LB + firewall.
Chart fix (issue #921):
- Add `clusterAutoscalerHcloud.{clusterConfig,cloudInit}` values to the
umbrella chart (base64-encoded per upstream contract).
- Render `hetzner-node-config` Secret unconditionally with both keys so
the upstream Deployment's secretKeyRef references resolve cleanly
during `helm template` AND in the live cluster regardless of overlay
state.
- Wire HCLOUD_CLUSTER_CONFIG + HCLOUD_CLOUD_INIT extraEnvSecrets onto
the upstream chart's deployment.
- Tofu Phase 0 base64-encodes the Phase-0 worker cloud-init and stamps
it under `flux-system/cloud-credentials.hcloud-cloud-init`; the
bootstrap-kit overlay lifts that key via Flux `valuesFrom` into
`clusterAutoscalerHcloud.cloudInit`. Autoscaler-spawned workers thus
receive the IDENTICAL bootstrap as the Phase-0 worker fleet.
- Bump bp-cluster-autoscaler-hcloud chart 1.0.0 → 1.1.0.
- Chart-test smoke gate (chart/tests/hetzner-node-config.sh) verifies
Secret + env var wiring + no-regression of HCLOUD_TOKEN — runs in CI's
blueprint-release "Run chart integration tests" step.
Wizard fix (issue #916):
- Add `availableRegions?: string[]` to NodeSize interface; encode
cpx32 = ['fsn1','nbg1','hel1'], cpx21/cpx31 = [] (orderable nowhere
new) per Hetzner /v1/server_types vs POST /v1/servers gap.
- Add `isSkuAvailableInRegion()` + `suggestAlternativeSkus()` helpers.
- StepProvider filters SKU dropdowns by selected region; auto-swaps
current SKU to recommended default when region change drops it out
of orderability.
- Mirror the matrix Go-side in sku_availability.go; gate
`provisioner.Request.Validate()` with same predicate so a stale
wizard build OR direct API caller bypassing the UI cannot dispatch
otech109's failure mode.
- Two-sided enforcement covers both r.Regions[] (multi-region) and the
legacy singular path.
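A sketch of the regional-orderability helpers. The cpx32/cpx21/cpx31 entries come from this commit; the signatures, the `EXAMPLE_SIZES` list, and the rule that an omitted `availableRegions` means "no restriction" are assumptions, since the real PROVIDER_NODE_SIZES shape is not shown.

```typescript
interface NodeSize {
  id: string;
  availableRegions?: string[]; // assumption: omitted means unrestricted
}

const EXAMPLE_SIZES: NodeSize[] = [
  { id: "cpx32", availableRegions: ["fsn1", "nbg1", "hel1"] },
  { id: "cpx21", availableRegions: [] }, // orderable nowhere new
  { id: "cpx31", availableRegions: [] },
];

function isSkuAvailableInRegion(size: NodeSize, region: string): boolean {
  return size.availableRegions === undefined || size.availableRegions.includes(region);
}

// Candidate replacements when a region change drops the current SKU.
function suggestAlternativeSkus(sizes: NodeSize[], region: string): NodeSize[] {
  return sizes.filter((size) => isSkuAvailableInRegion(size, region));
}
```

The same predicate is what the Go side would mirror, so the wizard and `provisioner.Request.Validate()` reject the otech109 combination (cpx32 in `ash`) identically.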
Tests: 13 vitest cases on the wizard side + 38 Go subtests on the API
side. Chart smoke renders + helm template gates the env wiring at
publish time.
Co-authored-by: hatiyildiz <hati.yildiz@openova.io>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>