Commit Graph

789 Commits

Author SHA1 Message Date
e3mrah
1b85ab9227
chore(bp-catalyst-platform): bump 1.4.33 → 1.4.34 + literal :11dd19e → :b45a49f (#1000 cloud chroot + wizard banner) (#1003)
Co-authored-by: hatiyildiz <hatice.yildiz@openova.io>
2026-05-05 23:44:03 +04:00
e3mrah
b45a49ff96
fix: cloud chroot escapes + wizard-inflight banner instead of auto-redirect (#1002)
Two operator-reported bugs:

1. Cloud sub-pages still escaped chroot. PR #998 closed Sidebar/JobsTable/
   FlowPage but missed CloudPage (4 navigate sites), CloudListView (2),
   UserAccessEditPage (2). Apply the same DETECTED_MODE-aware target
   construction so /provision/<id>/cloud paths stay scoped under the
   chroot on the mother monitoring view.

2. WizardPage auto-redirected signed-in operators with an inflight
   deployment to /provision/<id>/dashboard, blocking the legitimate
   case of starting a SECOND provision while the first is still in
   flight (founder: 'maybe I'll provision one more').

   Replace the auto-redirect with an inline banner at the top of the
   wizard pointing at the inflight monitor. The wizard stays
   interactive — operator can step through and Launch a second
   deployment if they want, OR click 'Open monitor →' to resume the
   first one.

Co-authored-by: hatiyildiz <hatice.yildiz@openova.io>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 23:43:52 +04:00
github-actions[bot]
7f4b886094 deploy: update catalyst images to 9964cee 2026-05-05 19:39:07 +00:00
github-actions[bot]
aaa0cb0207 deploy: update catalyst images to b15f08b 2026-05-05 19:29:26 +00:00
e3mrah
b15f08bc1e
chore(bp-catalyst-platform): bump 1.4.32 → 1.4.33 + literal :1af1c0d → :11dd19e (#998 chroot fix) (#999)
Co-authored-by: hatiyildiz <hatice.yildiz@openova.io>
2026-05-05 23:27:12 +04:00
e3mrah
11dd19e519
fix(provision-monitor): chroot-correct paths in Sidebar / JobsTable / FlowPage (#983 follow-up) (#998)
While the operator monitors an in-flight Sovereign from the mothership
wizard surface (`console.openova.io/sovereign/provision/$deploymentId/...`),
every internal link MUST stay scoped under that prefix. Today, three
places escape the chroot to clean root paths intended for the
Sovereign's adult hostname:

1. Sidebar.tsx (mother-monitor sidebar): FLAT_NAV[*].to and SETTINGS_ITEM.to
   were hardcoded to clean roots like '/jobs', '/cloud' — clicking a nav
   item bounced the operator out of /provision/<id>/* to /sovereign/jobs
   (which is either a Sovereign-Console route on contabo's mothership view
   = 404, or the Sovereign-on-clean-root on adult view = wrong context).
   Restore the canonical /provision/$deploymentId/<page> TanStack template;
   the params={{ deploymentId }} prop already feeds the substitution.

2. JobsTable.tsx (job row + parent-chip Links): `to="/jobs/$jobId"` is
   valid on the Sovereign adult surface but escapes the chroot on the
   mother monitor view. Add a useJobLinkBuilder hook that returns
   /provision/<id>/jobs/<jobId> on Catalyst-Zero hostnames and
   /jobs/<jobId> on Sovereign hostnames.

3. FlowPage.tsx (canvas leaf-job click navigate): same chroot escape.
   Same mode-aware target construction.
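
The mode-aware target construction in items 2 and 3 can be sketched roughly as below. This is a minimal illustration, not the actual useJobLinkBuilder hook: the `Mode` type, the `makeJobLinkBuilder` name, and the sample job id are assumptions introduced here, and the real hook presumably reads the mode and deployment id from router context rather than taking them as arguments.

```typescript
// Hypothetical sketch; the real useJobLinkBuilder hook is not shown in this log.
type Mode = "catalyst-zero" | "sovereign";

// Builds a jobId -> path function for the detected hostname mode.
// deploymentId is only used on the mothership (catalyst-zero) surface.
function makeJobLinkBuilder(mode: Mode, deploymentId: string): (jobId: string) => string {
  return (jobId: string): string =>
    mode === "catalyst-zero"
      ? `/provision/${deploymentId}/jobs/${jobId}` // stay inside the chroot
      : `/jobs/${jobId}`;                          // clean root on the Sovereign
}

// "example-job" is a made-up job id used only for illustration.
const motherLink = makeJobLinkBuilder("catalyst-zero", "571a382deb47e50a");
const childLink = makeJobLinkBuilder("sovereign", "571a382deb47e50a");
const motherPath = motherLink("example-job");
const childPath = childLink("example-job");
```

Keeping the branch in one builder means each call site only supplies a job id and cannot accidentally hardcode either prefix.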

The chroot rule (founder framing): the operator CANNOT distinguish
'I'm monitoring my child being born under /provision/<id>/' from
'I'm at home on the adult Sovereign console' visually — every page,
sidebar, link, and chip must look identical (#983 pixel-byte-byte
contract). This commit closes the navigation half of that contract
on the mother side; PR #983 already covered the data-fetch half.

Closes the bug surfaced live on otech118 mid-provision: clicking Jobs
in the sidebar from /sovereign/provision/571a382deb47e50a/dashboard
sent the operator to /sovereign/jobs (404 / wrong scope), and a row
click sent them to /sovereign/jobs/571a382...:install-valkey instead
of /sovereign/provision/<id>/jobs/<id>:install-valkey.

Co-authored-by: hatiyildiz <hatice.yildiz@openova.io>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 23:25:02 +04:00
github-actions[bot]
643f9df9dd deploy: update catalyst images to 2e493fc 2026-05-05 19:09:03 +00:00
e3mrah
2e493fc4f7
chore(bp-catalyst-platform): bump 1.4.31 → 1.4.32 + literal :ffe3607 → :1af1c0d (#996 redirect fixes) (#997)
Co-authored-by: hatiyildiz <hatice.yildiz@openova.io>
2026-05-05 23:07:04 +04:00
e3mrah
1af1c0d221
fix(redirects): /console/dashboard → /dashboard in 3 remaining sites (#983 follow-up) (#996)
The reverts of #984/#987/#989 brought back three legacy /console/dashboard
redirects that PR #983 had originally cleaned up (items 1-3); item 4 updates
the matching test fixture:

1. auth_handover.go:253 — default redirectTarget on the Sovereign-side
   /auth/handover handler.
2. router.tsx:109 — index route's Sovereign-mode redirect.
3. router.tsx:163 — /auth/handover client-side safety-net redirect.
4. auth_handover_test.go fixture — keeps the test in sync.

Closes the loop on PR #983's URL contract.

Co-authored-by: hatiyildiz <hatice.yildiz@openova.io>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 23:06:20 +04:00
github-actions[bot]
5aee0a3a91 deploy: update catalyst images to 498a025 2026-05-05 19:02:32 +00:00
e3mrah
498a02549a
chore(bp-catalyst-platform): bump 1.4.30 → 1.4.31 + literal :019309f → :ffe3607 (#995)
Lands #994's wizard redirect fix on contabo + Sovereigns.

Co-authored-by: hatiyildiz <hatice.yildiz@openova.io>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 23:00:33 +04:00
e3mrah
ffe3607f6c
fix(wizard): redirect inflight + post-submit to /provision/$deploymentId/dashboard not /dashboard (#994)
Two places where the wizard navigates after detecting a deployment id:
- WizardPage.tsx:96 — operator opens /sovereign/wizard but already has an
  inflight deployment → redirect to that deployment's monitor view.
- StepReview.tsx:792 — operator clicks Launch on the final review step →
  POST /api/v1/deployments returns the new id, then redirect to its
  monitor view.

Both targets MUST be the per-deployment mothership monitor URL
`/provision/$deploymentId/dashboard`, not the clean Sovereign root
`/dashboard`. PR #983's mass-replace of `/console/$deploymentId/X` →
`/X` accidentally caught these lines too — but Catalyst-Zero (the
mothership wizard) doesn't have a clean `/dashboard` root; it has the
mode-aware /provision/<id>/dashboard surface. The bug surfaces as:

  /sovereign/wizard → /sovereign/dashboard (TanStack basepath)
  → SovereignConsoleLayout (mounted on /dashboard)
  → no sovereignFQDN (we're on console.openova.io, not console.<sov-fqdn>)
  → infinite "Authenticating…" spinner

Confirmed live on contabo:8a1fe04 and :019309f. Fixes the wizard ↔
authenticating-loop the founder hit when going to provision otech118.

Co-authored-by: hatiyildiz <hatice.yildiz@openova.io>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 22:59:58 +04:00
github-actions[bot]
51dac92fa1 deploy: update catalyst images to 92f1eb8 2026-05-05 18:44:21 +00:00
e3mrah
92f1eb8468
chore(bp-catalyst-platform): bump 1.4.29 → 1.4.30 + chart literal :8a1fe04 → :019309f (#993)
Lands the clean post-revert image on Sovereigns:

- :019309f is the catalyst-build output for commit 019309f9 (the revert
  merge of #984/#987/#989), which carries PR #983's URL contract fix
  WITHOUT the broken / → /nova/ redirect chain.
- Chart version bumped 1.4.29 → 1.4.30 to invalidate Flux source-controller's
  OCI tag cache (otherwise Sovereigns stay on the first 1.4.29 digest they
  pulled — verified live on otech117).
- Chart template literal bumped because PR #980 stops CI from auto-bumping
  it; this commit IS the operator-approved manual bump.

Contabo stays on :8a1fe04 (manifest at clusters/contabo-mkt unaffected by
the chart literal change since contabo's Kustomize path reads its own copy
of the deployment manifests). When the operator validates :019309f on
Sovereigns, contabo can be re-pinned in a follow-up.

Co-authored-by: hatiyildiz <hatice.yildiz@openova.io>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 22:41:42 +04:00
e3mrah
019309f9b7
revert: drop the #984/#987/#989 broken redirect chain (#992)
* Revert "fix(wizard): mode-aware redirect target — break /sovereign/wizard ↔ /sovereign/dashboard loop (#975) (#989)"

This reverts commit 0daaac5bd5.

* Revert "fix(catalyst-ui): mothership redirect goes to /sovereign/ not / (#975) (#987)"

This reverts commit e221b4825f.

* Revert "fix(catalyst-ui): redirect mothership off clean-root Sovereign-Console routes (#975) (#984)"

This reverts commit 8a83416f0b.

---------

Co-authored-by: e3mrah <1234567+e3mrah@users.noreply.github.com>
2026-05-05 22:34:36 +04:00
github-actions[bot]
792978525d deploy: update catalyst images to bd97424 2026-05-05 18:34:21 +00:00
e3mrah
bd9742413f
rollback(contabo): pin catalyst-{api,ui} :0daaac5 → :8a1fe04 — last user-confirmed stable (#991)
console.openova.io is currently 307'ing / → /nova/ instead of rendering
the wizard. Founder identified :8a1fe04 as the last stable image before
today's auth-loop / mothership-redirect chain (#984 #987 #989).

Revert chain summary:
- :8a83416 (#984): mothership / redirect landed on /nova marketplace
- :e221b48 (#987): tried to fix #984 — exposed wizard redirect loop
- :0daaac5 (#989): tried to break #987's loop — / still 307s to /nova
  on live contabo

This pin restores the operator-facing wizard flow on console.openova.io.
Sovereigns are unaffected (otech117 is on :8a83416 via Helm, gated by
chart 1.4.29 OCI cache and not re-pulling per the source-controller
version-key cache behavior).

Forward path: investigate the / → /nova/ redirect introduced in the
#984/#987/#989 chain (likely an index-route or beforeLoad redirect in
router.tsx that fires on Catalyst-Zero mode), fix at root, ship as a
new image SHA, then re-pin contabo deliberately.

Co-authored-by: hatiyildiz <hatice.yildiz@openova.io>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 22:32:05 +04:00
github-actions[bot]
84bda66332 deploy: update catalyst images to 5c7d5dd 2026-05-05 18:27:06 +00:00
e3mrah
5c7d5ddb8b
deploy(contabo): pin :e221b48 → :0daaac5 — break wizard redirect loop (#990)
Co-authored-by: hatiyildiz <hatiyildiz@users.noreply.github.com>
2026-05-05 22:24:36 +04:00
github-actions[bot]
3a10eee0cc deploy: update catalyst images to 0daaac5 2026-05-05 18:23:54 +00:00
e3mrah
0daaac5bd5
fix(wizard): mode-aware redirect target — break /sovereign/wizard ↔ /sovereign/dashboard loop (#975) (#989)
WizardPage and StepReview both call navigate({to:'/dashboard',
params:{deploymentId}}) when an inflight deployment is detected. On
the mothership the bare /dashboard matches the Sovereign-Console
clean-root route which renders SovereignConsoleLayout — that layout's
mothership-fall-through guard (added in #987) redirects back to
/sovereign/, indexRoute redirects to /wizard, and WizardPage sees
inflight again and re-fires the navigate, looping forever between
/sovereign/, /sovereign/wizard, /sovereign/dashboard.

Fix: distinguish DETECTED_MODE.mode in both call sites:
- 'sovereign' (per-Sovereign self-mode SPA): /dashboard (clean root)
- 'catalyst-zero' (mothership): /provision/$deploymentId/dashboard

This is the third lap of #976's clean-URL cleanup catching mothership
flows that weren't migrated to the parameterised routes.
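
The loop and the fix described above can be modelled with a small redirect simulation. This is a sketch under stated assumptions: paths are written relative to the /sovereign basepath, the hop table only encodes the three redirects named in this message, and every function name here is illustrative rather than taken from the codebase.

```typescript
// Illustrative model of the mothership redirect chain; not the real router.
type Mode = "sovereign" | "catalyst-zero";

// Mode-aware dashboard target, per the fix described above.
function dashboardTarget(mode: Mode, deploymentId: string): string {
  return mode === "catalyst-zero"
    ? `/provision/${deploymentId}/dashboard`
    : "/dashboard";
}

// One redirect hop on the mothership. Returns the next path, or null when
// the page renders without redirecting.
function mothershipHop(path: string, wizardTarget: string): string | null {
  if (path === "/dashboard") return "/";       // layout's mothership fall-through guard
  if (path === "/") return "/wizard";          // indexRoute
  if (path === "/wizard") return wizardTarget; // WizardPage sees an inflight deployment
  return null;                                 // e.g. /provision/<id>/dashboard renders
}

// Follows hops until a page renders or a path repeats (a redirect loop).
function followRedirects(start: string, wizardTarget: string): { visited: string[]; looped: boolean } {
  const visited: string[] = [];
  let current: string | null = start;
  while (current !== null) {
    if (visited.indexOf(current) !== -1) return { visited, looped: true };
    visited.push(current);
    current = mothershipHop(current, wizardTarget);
  }
  return { visited, looped: false };
}

// "abc123" is a placeholder deployment id.
const buggy = followRedirects("/wizard", "/dashboard");
const fixed = followRedirects("/wizard", dashboardTarget("catalyst-zero", "abc123"));
```

With the bare /dashboard target the walk revisits /wizard and reports a loop; with the parameterised target it stops after one hop.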

Co-authored-by: hatiyildiz <hatiyildiz@users.noreply.github.com>
2026-05-05 22:21:05 +04:00
github-actions[bot]
6498eff476 deploy: update catalyst images to 678cb40 2026-05-05 18:14:26 +00:00
e3mrah
678cb40411
deploy(contabo): pin :8a83416 → :e221b48 — redirect lands on /sovereign/ not /nova/ (#988)
Co-authored-by: hatiyildiz <hatiyildiz@users.noreply.github.com>
2026-05-05 22:12:23 +04:00
github-actions[bot]
5098f4003c deploy: update catalyst images to e221b48 2026-05-05 18:11:45 +00:00
e3mrah
e221b4825f
fix(catalyst-ui): mothership redirect goes to /sovereign/ not / (#975) (#987)
The previous fix redirected SovereignConsoleLayout's mothership-fall-
through to bare '/', which the contabo nginx 302s to '/nova/' (the SME
marketplace). That yanked the operator out of the
sovereign-provisioning flow entirely — observed live: clicking any
clean-root Sovereign-Console route on console.openova.io ended up on
marketplace.openova.io/checkout.

The right landing on the mothership is '/sovereign/' — the Vite base
path the catalyst-ui SPA is mounted at, which serves the wizard /
provisioning surface.

Co-authored-by: hatiyildiz <hatiyildiz@users.noreply.github.com>
2026-05-05 22:09:22 +04:00
github-actions[bot]
a26d7482d6 deploy: update catalyst images to e8fcd66 2026-05-05 18:06:48 +00:00
e3mrah
e8fcd66a2b
chore(bp-catalyst-platform): bump 1.4.28 → 1.4.29 — pulls in #983 URL contract (#986)
Bumps the chart version + the per-Sovereign HelmRelease pin in
clusters/_template/bootstrap-kit/13-bp-catalyst-platform.yaml so all
Sovereigns reconciling against the template (otech117 et al.) pick up
PR #983's fixes:

- /dashboard /apps /jobs /cloud … render at clean roots; no /console/
  prefix and no /provision/<id>/ prefix on Sovereign mode.
- sovereign_self.go store fallback — data flows on clean URLs the
  moment fireHandover POSTs the deployment record to /api/v1/internal/
  deployments/import; no waiting for a chart-values overlay roundtrip.
- Sidebar links land on clean roots — no more /provision//cloud.
- Auth handover redirect target → /dashboard (was /console/dashboard).

Co-authored-by: hatiyildiz <hatice.yildiz@openova.io>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 22:04:39 +04:00
e3mrah
edf8c0e553
deploy(contabo): bump pin :b4fb6cf → :8a83416 — auth-loop fix (#985)
Co-authored-by: hatiyildiz <hatiyildiz@users.noreply.github.com>
2026-05-05 22:00:00 +04:00
github-actions[bot]
403d7d53a3 deploy: update catalyst images to 8a83416 2026-05-05 17:59:17 +00:00
e3mrah
8a83416f0b
fix(catalyst-ui): redirect mothership off clean-root Sovereign-Console routes (#975) (#984)
* fix(sovereign-console): land URL contract on Sovereign — clean roots, real data, working sidebar

Three operator-visible bugs on console.<sov-fqdn> after the PR #976/#977
clean-URL split landed:

1. **Login redirected to /provision/<id> instead of /dashboard.**
   auth_handover.go's redirect default still pointed at the legacy
   /console/dashboard path. The router's /auth/handover safety-net
   redirect, the index-route mode-aware redirect, and AuthCallbackPage
   all still navigated to /console/dashboard too. None of those routes
   exist on the Sovereign router any more (PR #972 deleted ConsolePage*),
   so the browser fell back to the closest matching prefix
   /provision/$deploymentId/...

2. **Sidebar Cloud → /provision//cloud (empty deploymentId).**
   SovereignSidebar.tsx's FLAT_NAV / SETTINGS_ITEM / SETTINGS_SUB_NAV
   all still pointed at /console/X paths that don't resolve. The
   browser fell through to the wizard sidebar's /provision/$id/cloud
   route, but with deploymentId resolved to '' (we're on Sovereign
   mode, no URL param), producing /provision//cloud.

3. **Clean roots showed no data; data only at /provision/<id>/...**
   The /api/v1/sovereign/self endpoint returned 503
   deployment-id-not-yet-stamped because CATALYST_SELF_DEPLOYMENT_ID
   env was empty (orchestrator hasn't yet shipped the values-overlay
   write that stamps it via the chart). useResolvedDeploymentId
   resolved null, every page that depends on it (Dashboard, Jobs,
   Cloud, etc.) had no id to fetch with.

Fixes:
- auth_handover.go + handler.go + auth_handover_test.go: redirect
  default /dashboard.
- router.tsx + AuthCallbackPage.tsx: index + handover safety-net +
  callback all redirect to /dashboard.
- SovereignSidebar.tsx: FLAT_NAV / SETTINGS / SETTINGS_SUB_NAV use
  clean roots; deriveActiveSection regexes match clean roots.
- SovereignConsoleLayout.tsx: Settings dropdown nav target /settings.
- cloudListShared.tsx + CloudNetworkPage.tsx + CloudStoragePage.tsx:
  Links use mode-aware path (sovereignPath helper for the back-link;
  inline DETECTED_MODE branch for the deeper sub-route tile links).
- sovereign_self.go: store-fallback resolution — when env is empty
  but the local store holds a deployment record whose SovereignFQDN
  matches CATALYST_OTECH_FQDN, return that record's id. The cutover
  import endpoint enforces FQDN match before persisting, so a single
  matching record is unambiguously this Sovereign's. This makes data
  flow on clean URLs the moment fireHandover's POST /import lands,
  without waiting for a chart-values overlay write + Flux reconcile.
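
The store-fallback resolution in the last item can be sketched as follows. The real logic lives in Go (sovereign_self.go); this TypeScript version only illustrates the resolution order, and the `StoredDeployment` shape, the function name, and the sample FQDN are assumptions made for the example.

```typescript
// Language-neutral sketch of the resolution order; the real code is Go.
interface StoredDeployment {
  id: string;
  sovereignFQDN: string;
}

// envId stands in for CATALYST_SELF_DEPLOYMENT_ID, selfFqdn for
// CATALYST_OTECH_FQDN. Returns null when neither source yields an id.
function resolveSelfDeploymentId(
  envId: string,
  selfFqdn: string,
  store: StoredDeployment[],
): string | null {
  if (envId !== "") return envId;   // a stamped env value always wins
  if (selfFqdn === "") return null; // nothing to match the store against
  for (const stored of store) {
    // The import endpoint only persists records whose FQDN matches, so the
    // first match is assumed to be this Sovereign's own record.
    if (stored.sovereignFQDN === selfFqdn) return stored.id;
  }
  return null;
}

// Hypothetical data: the FQDN and id below are placeholders.
const localStore: StoredDeployment[] = [
  { id: "dep-1", sovereignFQDN: "sov.example.test" },
];
const fromStore = resolveSelfDeploymentId("", "sov.example.test", localStore);
const fromEnv = resolveSelfDeploymentId("dep-env", "sov.example.test", localStore);
```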

Closes the user-reported "actual data is still staying in the cilder
of the mother concept under provisioning urls" + "clicking on cloud
goes to /provision//cloud" symptoms on otech117.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(catalyst-ui): SovereignConsoleLayout redirects to / on mothership instead of looping on "Authenticating…" (#975)

When the operator hits a clean-root Sovereign-Console route (/dashboard,
/apps, etc.) on the mothership (console.openova.io), DETECTED_MODE
returns sovereignFQDN=null — those routes exist for the per-Sovereign
self-mode SPA mounted at console.<sov-fqdn>, not for catalyst-zero.

Without an FQDN there is no Keycloak realm to OIDC against, so initAuth
would set authState='unauthenticated' and the layout's loading branch
rendered the spinner with "Authenticating…" caption forever — the
hang the founder hit immediately after #976 + #975 deploys when
clicking any dashboard/apps/cloud link on the mothership.

Redirect to / instead so the operator lands on the wizard /
deployments list, which is the right surface for catalyst-zero.

---------

Co-authored-by: hatiyildiz <hatice.yildiz@openova.io>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-authored-by: hatiyildiz <hatiyildiz@users.noreply.github.com>
2026-05-05 21:57:13 +04:00
github-actions[bot]
ee3b9cfe90 deploy: update catalyst images to cb115d7 2026-05-05 17:45:09 +00:00
e3mrah
cb115d77b0
deploy(contabo): release pin to :b4fb6cf — k8scache discovery probe removed (#982)
Restores forward roll of the catalyst-{api,ui} Kustomize-path image
refs after the hotfix landed:

- 3b88dfa hotfix(catalyst-api): drop k8scache discovery probe
- b4fb6cf fix(catalyst-ui): drop stale params={{ deploymentId }}

Per #980, contabo Kustomize-path image refs are managed manually
(catalyst-build only auto-bumps values.yaml). This commit is the
manual forward-roll.

Co-authored-by: hatiyildiz <hatiyildiz@users.noreply.github.com>
2026-05-05 21:42:42 +04:00
github-actions[bot]
e2f849ecf0 deploy: update catalyst images to b4fb6cf 2026-05-05 17:40:20 +00:00
e3mrah
3b88dfa75f
hotfix(catalyst-api): drop k8scache discovery probe — unblocks contabo startup (#975) (#981)
Bug: contabo mothership stuck during catalyst-api boot, "iterating dead
clusters". Root cause is a regression introduced by the k8scache PR:
AddCluster gained a synchronous `core.Discovery().ServerResourcesForGroupVersion(gv)`
call to gate Optional kinds (metrics.k8s.io/PodMetrics) — that call
issues a REST GET against the cluster's apiserver with NO context
timeout. On a kubeconfig pointing at a dead machine (a decommissioned
otech whose <id>.yaml was never removed) the call hangs until the
underlying TCP connect times out (often minutes). With many dead
kubeconfigs in /var/lib/catalyst/kubeconfigs the boot path serially
blocks for tens of minutes.

Fix:
- Drop the discovery probe block entirely. AddCluster is again
  synchronous-network-free; informers spawn unconditionally and
  reflectors handle missing GVRs (404 from the apiserver) with their
  own backoff retry loop in goroutines that don't block startup.
- Drop PodMetrics from DefaultKinds. With the probe gone, an
  always-registered PodMetrics informer would log retry warnings
  forever on every Sovereign without metrics-server. Until a non-blocking
  activation path lands, the dashboard's color_by=utilization returns
  null when no PodMetrics indexer exists; health/age/size paths still
  ride the Pod + PVC indexers untouched.
- Drop Kind.Optional field, the two probe-specific tests, and the
  fakediscovery import. Update TestDefaultKinds_GraphAndDashboardSurface
  to assert PodMetrics is *absent* from the defaults.
- Update dashboard_test.go's local Optional kind registration accordingly.

Co-authored-by: hatiyildiz <hatiyildiz@users.noreply.github.com>
2026-05-05 21:35:12 +04:00
github-actions[bot]
a2d33f6a97 deploy: update catalyst images to 953ef82 2026-05-05 17:27:02 +00:00
e3mrah
953ef8290f
fix(catalyst-build): stop auto-bumping contabo Kustomize-path image refs (#980)
* fix(catalyst-ui): drop stale params={{ deploymentId }} from clean-root Links (#975)

#976 collapsed `to="/provision/$deploymentId/<page>"` to clean root
paths (`to="/<page>"`) but left the `params={{ deploymentId }}` prop
on every callsite, breaking the Vite tsc build with TS2353. Fixes:

- Drop `params={{ deploymentId }}` from Links whose target is now a
  parameterless clean root path (StatusStrip, AppDetail, AppsPage,
  DecommissionPage, FlowPage, JobDetail, JobsPage, JobsTimeline,
  SettingsPage, DeploymentsList).
- For Links whose `to` still uses `$componentId`/`$jobId`, cast
  `params` with `as never` to match the existing pattern in
  cloud-compute/cloud-network/cloud-storage/Sidebar/UserAccess
  (the dual-mount under provisionRoute + consoleLayoutRoute defeats
  TS's strict params inference; the runtime path is correct).
- Drop `deploymentId` prop + interface field from JobCard / JobRow /
  JobsTable / AppCard now that the Links don't need it; update test
  fixtures + the JobsTable row-link assertion to match the new
  clean `/jobs/$jobId` href.
- Drop the unused ArchEdgeType import in k8sAdapter (TS6196).
- Dashboard navigateToApp uses `as never` casts to align with the
  same pattern.

* fix(catalyst-build): stop auto-bumping contabo Kustomize-path image refs

Two paths consume the catalyst-api / catalyst-ui images:
1. bp-catalyst-platform OCI chart (Sovereigns) — values.yaml driven, tag
   in values.yaml is rendered at helm install time by Sovereign Flux.
2. contabo Kustomize-path — literal image refs in templates/api-deployment.yaml
   and templates/ui-deployment.yaml. Flux kustomize-controller on contabo
   reconciles those files directly.

The CI deploy step was bumping BOTH on every PR, which auto-rolled
contabo every time anyone merged a catalyst-api code change. On
2026-05-05 PR #975's k8scache feature broke contabo startup on the
auto-roll because contabo has 27 dead-Sovereign kubeconfigs that the
new code iterates synchronously at startup, blocking readiness.

Fix: keep the values.yaml bump (Sovereigns auto-pick-up via OCI chart
which is the right behaviour for fresh provisions). Drop the
templates/*-deployment.yaml bump so contabo only rolls when an
operator manually commits a validated SHA into those files.

Closes the auto-deploy-to-contabo blast radius on every PR.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: hatiyildiz <hatiyildiz@users.noreply.github.com>
Co-authored-by: hatiyildiz <hatice.yildiz@openova.io>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 21:24:57 +04:00
e3mrah
bf602ea960
feat(catalyst-ui): cloud-graph K8s projection + dashboard squarer tiles (#975) (#978)
* feat(catalyst-ui): cloud-graph K8s projection + dashboard squarer tiles (#975)

Architecture graph (cloud?view=graph) — surface live K8s workloads:
- New widgets/architecture-graph/k8sAdapter.ts emits Pod / Deployment /
  StatefulSet / DaemonSet / Service / Ingress / Namespace / ConfigMap /
  PVC / Node graph nodes from a normalized K8s snapshot.
- Edge inference: Pod→WorkerNode runs-on (.spec.nodeName), Pod→
  Namespace member-of, Pod→Workload via ownerRef chain (collapsing the
  ReplicaSet hop to attribute Pods directly to their parent Deployment),
  Service→Pod routes-to (EndpointSlice when present, label-selector
  fallback otherwise), Ingress→Service flows-to, Pod→PVC attached-to,
  PVC→Volume.hcloud realizes via PV csi.volumeAttributes.
- mergeGraphs unions cloud-side and K8s-side adapter outputs and
  collapses the WorkerNode↔Node bridge by id; K8s status wins for
  liveness, cloud-side metadata for SKU.
- New widgets/architecture-graph/useK8sCacheStream.ts subscribes to
  /api/v1/sovereigns/{id}/k8s/stream?initialState=1 via EventSource,
  applies ADDED/MODIFIED/DELETED deltas to an in-memory Map snapshot,
  bumps a revision counter so the adapter recomputes only when
  events arrive. jsdom guard so component tests render without SSE.
- ArchitectureGraphPage wires both adapters; Pod/ConfigMap chips are
  default-off (DEFAULT_INACTIVE_TYPES) so the canvas isn't crowded
  before the operator opts in. New TUNABLE_TYPES include the K8s
  high-cardinality kinds.
- 13 new unit tests cover ownerRef chain, EndpointSlice+selector
  fallback, Ingress backend resolution, Pod→PVC, PVC→Volume.hcloud
  bridge, WorkerNode↔Node merge, edge dangling-endpoint filtering.
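
The Pod→Workload attribution via the ownerRef chain can be illustrated with a short sketch. It assumes a simplified object shape and a "Kind/namespace/name" id format, neither of which is confirmed by this log; the actual k8sAdapter.ts may differ, in particular in how it treats bare ReplicaSets.

```typescript
// Simplified shapes; real K8s objects carry far more fields.
interface OwnerRef {
  kind: string;
  name: string;
}
interface K8sObject {
  kind: string;
  namespace: string;
  name: string;
  ownerReferences?: OwnerRef[];
}

const WORKLOAD_KINDS = ["Deployment", "StatefulSet", "DaemonSet"];

// Assumed id format for graph nodes.
function nodeId(kind: string, namespace: string, name: string): string {
  return `${kind}/${namespace}/${name}`;
}

// Returns the id of the workload a Pod is attributed to, or null.
function resolveWorkload(pod: K8sObject, replicaSets: K8sObject[]): string | null {
  for (const owner of pod.ownerReferences || []) {
    if (owner.kind === "ReplicaSet") {
      // Collapse the ReplicaSet hop: look through it to its Deployment.
      for (const rs of replicaSets) {
        if (rs.namespace !== pod.namespace || rs.name !== owner.name) continue;
        for (const rsOwner of rs.ownerReferences || []) {
          if (rsOwner.kind === "Deployment") {
            return nodeId("Deployment", pod.namespace, rsOwner.name);
          }
        }
      }
      return null; // bare or unknown ReplicaSet: no workload edge in this sketch
    }
    if (WORKLOAD_KINDS.indexOf(owner.kind) !== -1) {
      return nodeId(owner.kind, pod.namespace, owner.name);
    }
  }
  return null;
}

const sampleReplicaSets: K8sObject[] = [
  { kind: "ReplicaSet", namespace: "default", name: "web-5d4", ownerReferences: [{ kind: "Deployment", name: "web" }] },
];
const webPod: K8sObject = { kind: "Pod", namespace: "default", name: "web-5d4-x", ownerReferences: [{ kind: "ReplicaSet", name: "web-5d4" }] };
const dbPod: K8sObject = { kind: "Pod", namespace: "default", name: "db-0", ownerReferences: [{ kind: "StatefulSet", name: "db" }] };
```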

Dashboard (/dashboard) — square tiles + null-utilization rendering:
- Recharts <Treemap aspectRatio={1}/> so cells render close to square
  whenever the value distribution allows (founder feedback 2026-05-05).
- Cell renderers handle percentage===null: NULL_PERCENTAGE_FILL grey
  fill, '— %' label, tooltip "metrics-server not installed" when
  colorBy=utilization without metrics, "no data" otherwise.
- TreemapItem.percentage type is now number | null end-to-end.
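
A rough sketch of the null-percentage rendering rule follows. Only the null branch comes from this message; the grey hex value, the non-null label and tooltip format, and the two-colour `scaleFill` are placeholders invented for the example.

```typescript
type ColorBy = "health" | "age" | "utilization";

// Assumed grey; the real NULL_PERCENTAGE_FILL value is not shown in this log.
const NULL_PERCENTAGE_FILL = "#9ca3af";

interface CellView {
  fill: string;
  label: string;
  tooltip: string;
}

// Placeholder colour scale, not the dashboard's real one.
function scaleFill(percentage: number): string {
  return percentage >= 50 ? "#16a34a" : "#dc2626";
}

function describeCell(percentage: number | null, colorBy: ColorBy): CellView {
  if (percentage === null) {
    return {
      fill: NULL_PERCENTAGE_FILL,
      label: "\u2014 %", // the dash-percent label named in the commit message
      tooltip: colorBy === "utilization" ? "metrics-server not installed" : "no data",
    };
  }
  const rounded = Math.round(percentage);
  return {
    fill: scaleFill(percentage),
    label: `${rounded} %`,
    tooltip: `${colorBy}: ${rounded} %`,
  };
}

const greyUtilization = describeCell(null, "utilization");
const greyHealth = describeCell(null, "health");
```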

Companion to #976 backend (k8scache prep + dashboard.go rewrite).

* fix(catalyst-ui): rip out hardcoded /provision/$deploymentId from internal Link components

Sidebar + JobsTable + AppsPage + JobsPage + JobsTimeline + JobDetail +
Dashboard + AppDetail + DecommissionPage + DeploymentsList +
SettingsPage + StatusStrip + FlowPage all had hardcoded
`to="/provision/$deploymentId/<page>"` references that bound the
operator to the mother view URL forever — clicking any link from a
Sovereign self-mode page would jump to the (non-existent on Sovereign)
mother provision URL.

Mass-replaced with clean root paths `to="/<page>"` so internal
navigation on a Sovereign child stays on clean URLs (/dashboard,
/apps, /jobs, /cloud, /users, /settings).

Also deleted the now-unused SovereignConsoleRedirect.tsx
(superseded by direct route mounting in router.tsx).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: hatiyildiz <hatiyildiz@users.noreply.github.com>
Co-authored-by: hatiyildiz <hatice.yildiz@openova.io>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 21:03:11 +04:00
github-actions[bot]
ebde8f1eb9 deploy: update catalyst images to ed8872a 2026-05-05 16:53:23 +00:00
e3mrah
ed8872a15b
feat(catalyst-api): mother→child cutover data transfer at handover (#977)
The data half of the mother→child contract that PR #976 set up the
URL routing for. At handover the mother POSTs the full deployment
record (events, jobs history, HRs, cloud topology, kubeconfig meta)
to the child's POST /api/v1/internal/deployments/import — the child
persists it locally so its /api/v1/deployments/{id}/* endpoints
answer with data byte-byte-identical to what the operator sees on the
mother view at /sovereign/provision/<id>/<page>.

Result: on the child cluster, clean URLs (/dashboard, /apps, /jobs,
/cloud) render with REAL data (events, exec logs, job statuses,
treemap utilisation) instead of empty arrays.

- New endpoint: POST /api/v1/internal/deployments/import (child)
  Validates by FQDN match against CATALYST_OTECH_FQDN. Idempotent.
- Mother fireHandover() now posts the record to the child after the
  JWT mint as a fire-and-forget goroutine. Failure logs loudly per
  INVIOLABLE-PRINCIPLES #3 but does not block SSE emit.
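
The two properties of the import endpoint (FQDN validation, idempotency) can be sketched as below. The real handler is Go and its response codes are not stated here, so this sketch returns a plain outcome string; the record shape, store type, and sample values are assumptions.

```typescript
// Illustrative model of the child-side import; the real handler is Go.
interface ImportedDeployment {
  id: string;
  sovereignFQDN: string;
  events: string[];
}

type DeploymentStore = { [id: string]: ImportedDeployment };

// selfFqdn stands in for CATALYST_OTECH_FQDN on the child.
function importDeployment(
  store: DeploymentStore,
  selfFqdn: string,
  incoming: ImportedDeployment,
): "imported" | "rejected" {
  // Validate by FQDN match before persisting anything.
  if (selfFqdn === "" || incoming.sovereignFQDN !== selfFqdn) return "rejected";
  // Keyed by id, so a repeated import overwrites instead of duplicating.
  store[incoming.id] = incoming;
  return "imported";
}

// Placeholder data for illustration only.
const childStore: DeploymentStore = {};
const handoverRecord: ImportedDeployment = {
  id: "dep-1",
  sovereignFQDN: "sov.example.test",
  events: ["created"],
};
const firstImport = importDeployment(childStore, "sov.example.test", handoverRecord);
const secondImport = importDeployment(childStore, "sov.example.test", handoverRecord);
```

Keying the store by deployment id is one simple way to get idempotency; the actual persistence layer may achieve it differently.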

Bumped: bp-catalyst-platform 1.4.27 → 1.4.28.

Co-authored-by: hatiyildiz <hatice.yildiz@openova.io>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 20:51:03 +04:00
github-actions[bot]
c4bc7cac89 deploy: update catalyst images to 60e471b 2026-05-05 16:48:59 +00:00
e3mrah
60e471bcc7
feat(sovereign-console): clean root URLs on Sovereign children (#976)
* feat(catalyst-api): cache-driven dashboard treemap + watcher prep (#975)

Watcher prep (k8scache):
- Register persistentvolumes (PVC→Volume.hcloud bridge), replicasets
  (Deployment owner-ref hop), endpointslices (exact Service→Pod
  membership) in DefaultKinds.
- Register metrics.k8s.io/v1beta1.PodMetrics as Optional; AddCluster
  probes discovery and skips the informer when metrics-server is
  absent so the watch never crash-loops.
- Tests pin the mandatory + optional kind set.

Dashboard rewrite:
- Replace dashboardFixture slice with cache-driven aggregations off
  the same k8scache.Factory the SSE/REST surface uses.
- Resolve cluster id from deployment_id query param.
- Pod row projection: cpu/memory limits from container specs, storage
  from referenced PVCs, hasMetrics from PodMetrics availability.
- color_by=health: Σ Ready / total ×100 (pure cache, ships day one).
- color_by=age: now − min(creationTimestamp) normalised to 30d window.
- color_by=utilization: Σ usage / Σ limit; null when metrics absent
  → JSON null (Percentage *float64) → UI greys cell.
- group_by chains arbitrary depth via groupAtLevel recursion.
- Tests cover health, utilization-null, storage_limit-from-PVCs,
  family/application nesting, percentage-in-range guards.
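
The three color_by formulas can be written out as a small sketch. The real aggregation is Go code over the k8scache; this TypeScript version only restates the arithmetic. Assumptions: results are percentages in [0, 100], age is clamped to the 30d window, an empty group yields null, and utilization sums only pods that report usage.

```typescript
interface PodRow {
  ready: boolean;
  createdAtMs: number;     // creationTimestamp as epoch milliseconds
  cpuUsage: number | null; // null when PodMetrics is unavailable
  cpuLimit: number;
}

const WINDOW_MS = 30 * 24 * 60 * 60 * 1000; // the 30d normalisation window

// color_by=health: sum(Ready) / total * 100
function healthPercent(pods: PodRow[]): number | null {
  if (pods.length === 0) return null;
  let ready = 0;
  for (const p of pods) if (p.ready) ready++;
  return (ready / pods.length) * 100;
}

// color_by=age: (now - min(creationTimestamp)) normalised to the window
function agePercent(pods: PodRow[], nowMs: number): number | null {
  if (pods.length === 0) return null;
  let oldest = Infinity;
  for (const p of pods) oldest = Math.min(oldest, p.createdAtMs);
  const ratio = (nowMs - oldest) / WINDOW_MS;
  return Math.max(0, Math.min(1, ratio)) * 100;
}

// color_by=utilization: sum(usage) / sum(limit) * 100, null without metrics
function utilizationPercent(pods: PodRow[]): number | null {
  let usage = 0;
  let limit = 0;
  let sawMetrics = false;
  for (const p of pods) {
    if (p.cpuUsage === null) continue;
    sawMetrics = true;
    usage += p.cpuUsage;
    limit += p.cpuLimit;
  }
  if (!sawMetrics || limit === 0) return null;
  return (usage / limit) * 100;
}

const DAY_MS = 24 * 60 * 60 * 1000;
const sampleNow = 1700000000000;
const samplePods: PodRow[] = [
  { ready: true, createdAtMs: sampleNow - 15 * DAY_MS, cpuUsage: 1, cpuLimit: 4 },
  { ready: false, createdAtMs: sampleNow - 3 * DAY_MS, cpuUsage: 1, cpuLimit: 4 },
];
```

For the sample pods this gives 50 for health, 50 for age, and 25 for utilization; a group whose pods all have `cpuUsage: null` yields null, which corresponds to the JSON null the UI greys out.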

Wire change: treemapItem.Percentage is now *float64 to encode the
metrics-absent path as JSON null. UI side updated in companion
commit.

* feat(sovereign-console): clean root URLs on Sovereign children — /dashboard, /apps, /jobs, /cloud, /users, /settings

Mother (contabo): /sovereign/provision/$childId/* (transient, manages
many children).  Child (Sovereign post-cutover): /* (clean root, self-
scoped — there's only one deployment, so no id in URL).

- Pathless layout route mounts SovereignConsoleLayout at root id
- Operator routes /dashboard, /apps, /apps/$cid, /jobs, /jobs/$jid,
  /cloud, /users, /users/new, /users/$name, /settings,
  /settings/marketplace, /catalog, /parent-domains, /sme/users,
  /sme/roles, /sme/tenants/new at root paths
- SovereignSidebar nav links updated from /console/* to clean /*
- sovereignPath() helper added for mode-aware Link/navigate calls
  (Sovereign emits clean URL, contabo emits /provision/$id/<page>)
- Active-section regex updated to match root paths

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: hatiyildiz <hatiyildiz@users.noreply.github.com>
Co-authored-by: hatiyildiz <hatice.yildiz@openova.io>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 20:46:51 +04:00
github-actions[bot]
0092479c21 deploy: update catalyst images to 8a1fe04 2026-05-05 16:24:49 +00:00
e3mrah
8a1fe047b1
fix(catalyst-ui): drop unused SovereignConsoleRedirect import + idLoading var (#974)
Build #25388329130 failed on PR #972's merge SHA `6ec7851` with two
TS6133 unused-symbol errors:
  src/app/router.tsx(86,1): error TS6133: 'SovereignConsoleRedirect' is declared but its value is never read.
  src/pages/sovereign/Dashboard.tsx(133,46): error TS6133: 'idLoading' is declared but its value is never read.

The SovereignConsoleRedirect helper became unused once the /console/*
routes were wired directly to the canonical components (Dashboard,
AppsPage, JobsPage, CloudPage, UserAccessListPage, SettingsPage) in
the same PR. The Dashboard's idLoading binding was a leftover from an
earlier draft that surfaced a loading pill.

Co-authored-by: hatiyildiz <hatice.yildiz@openova.io>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 20:21:31 +04:00
e3mrah
6ec7851bc2
feat(sovereign-console): kill duplicate /console/* pages, redirect to canonical /provision/$id/* (Iteration 1) (#972)
* feat(sovereign-console): kill duplicate /console/* pages, redirect to canonical /provision/$id/* (Iteration 1)

Founder-reported on otech116/117: the /console/dashboard, /console/apps,
/console/jobs, /console/cloud, /console/users, /console/settings pages
are STUBS that look completely different from the canonical Sovereign
Console operators see at console.openova.io/sovereign/provision/$id/*.

Investigation: 6 duplicate Console*Page React components were shipped in
PR #937 — separate stub implementations of pages that already exist as
the canonical Dashboard / AppsPage / JobsPage / CloudPage /
UserAccessListPage / SettingsPage components used by the
/provision/$deploymentId/* route tree (the same tree the wizard renders).

Fix (Iteration 1):
  - DELETE the 6 duplicate Console*Page components.
  - Replace the /console/* router routes with SovereignConsoleRedirect:
    a tiny component that fetches /api/v1/sovereign/self for the
    Sovereign's own deployment id, then router-navigates to the
    canonical /provision/<self-id>/<page>. Same components, same data,
    pixel-for-pixel identical UI to the mothership view.
  - Add catalyst-api endpoint GET /api/v1/sovereign/self that returns
    the deployment id from CATALYST_SELF_DEPLOYMENT_ID env. Mothership
    (env unset) → 404. Sovereign with stamped id → 200. Sovereign
    pre-handover → 503 deployment-id-not-yet-stamped.
  - Wire env via the existing sovereign-fqdn ConfigMap (B1 PR #912):
    new key `selfDeploymentId`, sourced from
    .Values.global.sovereignSelfDeploymentId. Empty until the
    orchestrator's per-Sovereign overlay writer stamps it.
  - Add useResolvedDeploymentId React hook (URL params first, then
    /sovereign/self fallback) — wires Iteration 2 (clean URLs) below.
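
The three response cases of GET /api/v1/sovereign/self can be modelled as below. This is a TypeScript sketch of the decision only; the real endpoint lives in catalyst-api. It assumes that "env unset" and "env present but empty" are what separate the mothership from a pre-handover Sovereign, and the response body field names and the 404 error string are hypothetical.

```typescript
// Illustrative model of the status decision, not the catalyst-api handler.
interface SelfResponse {
  status: 200 | 404 | 503;
  body: { deploymentId?: string; error?: string };
}

function sovereignSelf(envValue: string | undefined): SelfResponse {
  if (envValue === undefined) {
    // Mothership: CATALYST_SELF_DEPLOYMENT_ID is not set at all.
    return { status: 404, body: { error: "not-a-sovereign" } }; // error string is hypothetical
  }
  const id = envValue.trim();
  if (id === "") {
    // Sovereign before handover: key exists but has not been stamped yet (assumption).
    return { status: 503, body: { error: "deployment-id-not-yet-stamped" } };
  }
  // Sovereign with a stamped deployment id.
  return { status: 200, body: { deploymentId: id } };
}
```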

Iteration 2 (next PR — out of scope here):
  - Drop the /sovereign/provision/<id>/ URL prefix on Sovereign by
    refactoring 6 canonical components to use useResolvedDeploymentId
    instead of strict useParams. Then /console/dashboard renders the
    canonical Dashboard at the clean URL with deployment id resolved
    from /sovereign/self.

Iteration 3 (next PR after — also out of scope):
  - Handover history transfer: contabo's catalyst-api at handover POSTs
    the full deployment record (events, jobs, HRs, cloud topology) to
    the Sovereign's catalyst-api so /provision/<id>/* on the Sovereign
    answers with byte-for-byte identical data.

Bumped: bp-catalyst-platform 1.4.26 → 1.4.27.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(sovereign-console): clean URLs — /console/* mounts canonical components directly

Removes the SovereignConsoleRedirect indirection. The canonical
operator components (Dashboard, AppsPage, JobsPage, JobDetail,
CloudPage, AppDetail, UserAccessListPage, UserAccessEditPage,
SettingsPage) now render at clean /console/<page> URLs on Sovereign,
NOT under /sovereign/provision/<id>/<page>.

Pages that previously hard-coupled to the URL via
  useParams({ from: '/provision/$deploymentId/...' })
now use useResolvedDeploymentId() which:
  1. reads URL params (when on the legacy /provision/$id/* tree on
     contabo's mothership wizard)
  2. falls back to GET /api/v1/sovereign/self (Sovereign self-discovery)
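
The resolution order above, reduced to a plain function, might look like this. It is a sketch, not the hook itself: the real useResolvedDeploymentId is a React hook and the self lookup is an HTTP call, whereas here the lookup is an injected synchronous callback (`getSelfId`, a hypothetical name) so the ordering is easy to see.

```typescript
// Sketch of the "URL params first, then /sovereign/self" ordering.
function resolveDeploymentId(
  urlParams: { deploymentId?: string },
  getSelfId: () => string | undefined,
): string | undefined {
  // 1. Legacy /provision/$id/* tree: the id is already in the URL.
  if (urlParams.deploymentId) {
    return urlParams.deploymentId;
  }
  // 2. Clean URL on a Sovereign: ask the Sovereign for its own id.
  return getSelfId();
}
```

Taking the lookup as a callback means the fallback is only consulted when the URL carries no id.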

Refactored: Dashboard, AppsPage, JobsPage, SettingsPage, UserAccessListPage.
CloudPage already used strict:false — no change needed.

Wires the /console/* router subtree to the canonical components +
adds the missing children routes (/jobs/$jobId, /users/new,
/users/$name, /app/$componentId) so the canonical UI's deep-links
work on the clean URL surface too.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: hatiyildiz <hatice.yildiz@openova.io>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 20:17:36 +04:00
github-actions[bot]
9ed579d4ba deploy: update catalyst images to 3db19b7 2026-05-05 14:27:41 +00:00
e3mrah
3db19b76b1
fix(cutover 0.1.19): Step-01 gitea-mirror DNS readiness probe + backoffLimit=3 (#968) (#969)
## Root cause (live on otech115 2026-05-05 14:15)

After PR #959 (0.1.18) unblocked the auto-trigger to actually call
/internal/cutover/trigger, the cutover engine fired Step-01 within ~8s
of bp-self-sovereign-cutover Helm-install completing. The gitea Pod
had only just reached Ready state — cluster-DNS endpoint publication
for the headless service `gitea-http` was still in flight. One wget
returned `bad address gitea-http.gitea.svc.cluster.local` and exited
non-zero. Catalyst-api's cutover engine stamped Jobs with backoffLimit=0
(cutover.go:584), so a single DNS miss was terminal and aborted all 8
cutover steps. otech115 finished provisioning with cutoverComplete=false
and tethered to upstream github.com/ghcr.io.

## Fix (dual-layer)

**Layer A — catalyst-api (cutover.go)**: backoffLimit lifted from 0 to 3.
A single transient miss is recoverable (4 attempts over each step's
activeDeadlineSeconds) without burning operator attention. Hard failures
still surface within budget.

**Layer B — chart Step-01 (01-gitea-mirror-job.yaml)**: explicit
nslookup readiness probe at the top of the bash script, before any
wget call. 30 attempts × 5s = 150s budget; alpine/git ships nslookup
in /usr/bin (verified live on otech115). Layer B is faster than Layer A
(in-script DNS retry vs Pod recreate); Layer A is the safety net for
any other transient pre-cluster-stable race we haven't yet enumerated.
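
The arithmetic behind both layers can be checked with a small model. This is an illustrative TypeScript sketch, not the bash script or cutover.go: the probe is an injected callback and the sleep is only accounted for, not performed. It assumes the script sleeps after every failed attempt, which is what makes 30 attempts × 5s come to 150s.

```typescript
interface ProbeResult {
  ready: boolean;
  attempts: number;
  waitedSeconds: number;
}

// Layer B: retry a readiness probe up to maxAttempts times, delaySeconds apart.
function waitForDns(
  probe: (attempt: number) => boolean,
  maxAttempts = 30,
  delaySeconds = 5,
): ProbeResult {
  let waited = 0;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    if (probe(attempt)) {
      return { ready: true, attempts: attempt, waitedSeconds: waited };
    }
    waited += delaySeconds; // the real script would sleep here
  }
  return { ready: false, attempts: maxAttempts, waitedSeconds: waited };
}

// Layer A: a Job with backoffLimit N runs at most N + 1 times (first try plus retries).
const totalPodAttempts = (backoffLimit: number): number => backoffLimit + 1;
```

With these definitions a probe that never succeeds exhausts the 150s budget, and backoffLimit=3 gives the 4 attempts mentioned under Layer A, compared with a single attempt at backoffLimit=0.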

## Acceptance gate

Test case 15 added to platform/self-sovereign-cutover/chart/tests/
cutover-contract.sh — guards against future regressions that drop
either the gitea_host extraction or the nslookup loop.

## Live verification

Will fire on the next provision (otech116). Expected:
- Step-01 logs `[gitea-mirror] DNS ready for gitea-http.gitea.svc.cluster.local (attempt N)`
- All 8 cutover Jobs reach Complete
- self-sovereign-cutover-status ConfigMap reaches cutoverComplete=true

Co-authored-by: e3mrah <ebaysal@gmail.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 18:25:15 +04:00
github-actions[bot]
39732ff41b deploy: update catalyst images to 8e312cd 2026-05-05 14:01:12 +00:00
github-actions[bot]
aebf40b589 deploy: update catalyst images to d1431be 2026-05-05 12:25:07 +00:00
e3mrah
d1431bed09
fix(autoscaler+wizard): wire HCLOUD_CLOUD_INIT, validate SKU/region in catalyst-api (#965)
Closes #921 — bp-cluster-autoscaler-hcloud chart shipped without
HCLOUD_CLUSTER_CONFIG / HCLOUD_CLOUD_INIT, so cluster-autoscaler 1.32.x
FATALs at startup with "HCLOUD_CLUSTER_CONFIG or HCLOUD_CLOUD_INIT is
not specified" on every Sovereign (otech112 evidence). HelmRelease
reports Ready=True (Helm install succeeded) but the Pod
CrashLoopBackOffs invisibly behind the false-positive condition.

Closes #916 — wizard let operators dispatch unbuildable topologies
(otech109: cpx32 worker in `ash`) because PROVIDER_NODE_SIZES did not
encode regional orderability. Hetzner rejected the worker creation 41s
into `tofu apply` after Phase-0 had already created the CP + network +
LB + firewall.

Chart fix (issue #921):
- Add `clusterAutoscalerHcloud.{clusterConfig,cloudInit}` values to the
  umbrella chart (base64-encoded per upstream contract).
- Render `hetzner-node-config` Secret unconditionally with both keys so
  the upstream Deployment's secretKeyRef references resolve cleanly
  during `helm template` AND in the live cluster regardless of overlay
  state.
- Wire HCLOUD_CLUSTER_CONFIG + HCLOUD_CLOUD_INIT extraEnvSecrets onto
  the upstream chart's deployment.
- Tofu Phase 0 base64-encodes the Phase-0 worker cloud-init and stamps
  it under `flux-system/cloud-credentials.hcloud-cloud-init`; the
  bootstrap-kit overlay lifts that key via Flux `valuesFrom` into
  `clusterAutoscalerHcloud.cloudInit`. Autoscaler-spawned workers thus
  receive the IDENTICAL bootstrap as the Phase-0 worker fleet.
- Bump bp-cluster-autoscaler-hcloud chart 1.0.0 → 1.1.0.
- Chart-test smoke gate (chart/tests/hetzner-node-config.sh) verifies
  Secret + env var wiring + no-regression of HCLOUD_TOKEN — runs in CI's
  blueprint-release "Run chart integration tests" step.

Wizard fix (issue #916):
- Add `availableRegions?: string[]` to NodeSize interface; encode
  cpx32 = ['fsn1','nbg1','hel1'], cpx21/cpx31 = [] (not orderable for
  new servers in any region), per the gap between what Hetzner's
  /v1/server_types lists and what POST /v1/servers accepts.
- Add `isSkuAvailableInRegion()` + `suggestAlternativeSkus()` helpers.
- StepProvider filters SKU dropdowns by selected region; auto-swaps
  current SKU to recommended default when region change drops it out
  of orderability.
- Mirror the matrix Go-side in sku_availability.go; gate
  `provisioner.Request.Validate()` with same predicate so a stale
  wizard build OR direct API caller bypassing the UI cannot dispatch
  otech109's failure mode.
- Two-sided enforcement covers both r.Regions[] (multi-region) and the
  legacy singular path.
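
A minimal sketch of the region filter described above, using the matrix values quoted in this message. The helper signatures, the `NODE_SIZES` name, the `example-unrestricted` entry, and the rule that a missing `availableRegions` means "no restriction" are assumptions; only the `availableRegions?: string[]` field and the cpx32/cpx21/cpx31 values come from the commit text.

```typescript
interface NodeSize {
  id: string;
  availableRegions?: string[];
}

const NODE_SIZES: NodeSize[] = [
  { id: "cpx32", availableRegions: ["fsn1", "nbg1", "hel1"] },
  { id: "cpx21", availableRegions: [] }, // empty list: orderable nowhere
  { id: "cpx31", availableRegions: [] },
  { id: "example-unrestricted" }, // hypothetical SKU with no restriction recorded
];

function isSkuAvailableInRegion(size: NodeSize, region: string): boolean {
  // Assumption: an absent list means the SKU is not region-restricted.
  if (size.availableRegions === undefined) return true;
  return size.availableRegions.indexOf(region) !== -1;
}

// Return the ids of every SKU that can be ordered in the given region.
function suggestAlternativeSkus(sizes: NodeSize[], region: string): string[] {
  return sizes.filter((s) => isSkuAvailableInRegion(s, region)).map((s) => s.id);
}
```

Under these assumptions, the otech109 topology (cpx32 in `ash`) fails the predicate before dispatch, and the same check can back both the dropdown filter and the API-side validation.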

Tests: 13 vitest cases on the wizard side + 38 Go subtests on the API
side. Chart smoke renders + helm template gates the env wiring at
publish time.

Co-authored-by: hatiyildiz <hati.yildiz@openova.io>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 16:21:59 +04:00
github-actions[bot]
65be6dea78 deploy: update catalyst images to 3de3786 2026-05-05 12:17:51 +00:00