feat(sovereign-console): clean root URLs on Sovereign children (#976)

* feat(catalyst-api): cache-driven dashboard treemap + watcher prep (#975)

Watcher prep (k8scache):
- Register persistentvolumes (PVC→Volume.hcloud bridge), replicasets
  (Deployment owner-ref hop), endpointslices (exact Service→Pod
  membership) in DefaultKinds.
- Register metrics.k8s.io/v1beta1.PodMetrics as Optional; AddCluster
  probes discovery and skips the informer when metrics-server is
  absent so the watch never crash-loops (sketch below).
- Tests pin the mandatory + optional kind set.
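
A minimal sketch of that Optional registration, using the same Kind
fields the tests in this diff exercise (Name, GVR, Namespaced,
Optional); NewRegistry/Add are assumed to behave as they do in
k8scache_test.go:

package example

import (
	"k8s.io/apimachinery/pkg/runtime/schema"

	"github.com/openova-io/openova/products/catalyst/bootstrap/api/internal/k8scache"
)

// registerPodMetrics registers the metrics.k8s.io projection as an
// Optional kind. AddCluster probes discovery for metrics.k8s.io/v1beta1
// and only spawns the informer when that GroupVersion is actually
// served, so a missing metrics-server never crash-loops the watch.
func registerPodMetrics() error {
	r := k8scache.NewRegistry()
	return r.Add(k8scache.Kind{
		Name:       "podmetrics",
		GVR:        schema.GroupVersionResource{Group: "metrics.k8s.io", Version: "v1beta1", Resource: "pods"},
		Namespaced: true,
		Optional:   true,
	})
}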

Dashboard rewrite:
- Replace dashboardFixture slice with cache-driven aggregations off
  the same k8scache.Factory the SSE/REST surface uses.
- Resolve cluster id from deployment_id query param.
- Pod row projection: cpu/memory limits from container specs, storage
  from referenced PVCs, hasMetrics from PodMetrics availability.
- color_by=health: Σ Ready / total ×100 (pure cache, ships day one).
- color_by=age: now − min(creationTimestamp) normalised to 30d window.
- color_by=utilization: Σ usage / Σ limit; null when metrics absent
  → JSON null (Percentage *float64) → UI greys cell.
- group_by chains arbitrary depth via groupAtLevel recursion.
- Tests cover health, utilization-null, storage_limit-from-PVCs,
  family/application nesting, percentage-in-range guards.

Wire change: treemapItem.Percentage is now *float64 to encode the
metrics-absent path as JSON null. UI side updated in companion
commit.
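
A self-contained sketch of why the pointer matters for the wire
encoding (names here are hypothetical; only the *float64/JSON-null
behaviour is the point):

package example

import (
	"encoding/json"
	"fmt"
)

// cell mirrors just the percentage field of the treemap wire shape: a
// nil *float64 marshals to JSON null, a non-nil one to a number.
type cell struct {
	Percentage *float64 `json:"percentage"`
}

func main() {
	v := 66.7
	withMetrics, _ := json.Marshal(cell{Percentage: &v}) // {"percentage":66.7}
	noMetrics, _ := json.Marshal(cell{})                 // {"percentage":null}
	fmt.Println(string(withMetrics), string(noMetrics))
}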

* feat(sovereign-console): clean root URLs on Sovereign children — /dashboard, /apps, /jobs, /cloud, /users, /settings

Mother (contabo): /sovereign/provision/$childId/* (transient, manages
many children).  Child (Sovereign post-cutover): /* (clean root, self-
scoped — there's only one deployment, so no id in URL).

- Pathless layout route mounts SovereignConsoleLayout at root id
- Operator routes /dashboard, /apps, /apps/$cid, /jobs, /jobs/$jid,
  /cloud, /users, /users/new, /users/$name, /settings,
  /settings/marketplace, /catalog, /parent-domains, /sme/users,
  /sme/roles, /sme/tenants/new at root paths
- SovereignSidebar nav links updated from /console/* to clean /*
- sovereignPath() helper added for mode-aware Link/navigate calls
  (Sovereign emits clean URL, contabo emits /provision/$id/<page>)
- Active-section regex updated to match root paths

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: hatiyildiz <hatiyildiz@users.noreply.github.com>
Co-authored-by: hatiyildiz <hatice.yildiz@openova.io>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
e3mrah authored 2026-05-05 20:46:51 +04:00 (committed via GitHub) — commit 60e471bcc7, parent 0092479c21
9 changed files with 1172 additions and 274 deletions

View File

@ -7,52 +7,76 @@
// contract in
// products/catalyst/bootstrap/ui/src/lib/treemap.types.ts
//
// ── Data path (target state) ─────────────────────────────────────────
// ── Data path ─────────────────────────────────────────────────────────
//
// The target state walks each registered Sovereign's kubeconfig, hits
// metrics-server for live pod CPU/memory, sums against
// `resources.limits.{cpu,memory}` per workload, and groups by the
// requested dimensions. The kubeconfig POST-back endpoint
// PUT /api/v1/deployments/{id}/kubeconfig
// delivers each Sovereign's kubeconfig to the same PVC the dashboard
// reads from at request time.
// Per ADR-0001 §5 the kube-apiserver is the system of record. The
// dashboard reads from the in-process k8scache.Factory's Indexer (one
// dynamicinformer.SharedInformerFactory per Sovereign cluster), NOT
// the apiserver directly. Pods, PVCs, and (when metrics-server is
// installed) PodMetrics are all served straight from cache — sub-ms
// per request, event-driven freshness via the same WATCH stream that
// powers the SSE endpoint.
//
// ── v1 placeholder (this file) ───────────────────────────────────────
// `deployment_id` resolves to the k8scache cluster id — the kubeconfig
// file stem, which by construction is the deployment id (see
// PutKubeconfig handler).
//
// metrics-server is NOT yet trivially reachable from catalyst-api in
// every Sovereign profile (the bootstrap kit does NOT install it; it's
// an optional add-on). Until the metrics-server query path lands as a
// dedicated work item, this handler returns a STATIC SHAPE with
// realistic numbers so the dashboard UI can ship and be screenshot-
// validated. Every cell carries:
// When the cache is not wired (test/CI without a real cluster) or the
// requested deployment_id is not registered, the handler returns a
// well-shaped empty response. The UI renders the "no utilisation data
// yet" empty state.
//
// - A representative `count` (replicas)
// - A `size_value` derived from a typical Helm chart's
// `resources.requests` for the named application
// - A `percentage` synthesised so the gradient covers blue, green
// and red regions (so the UI proves the colour map at runtime)
// ── color_by semantics ───────────────────────────────────────────────
//
// TODO(catalyst-api): replace this static path with the metrics-server
// integration. Tracked in the dashboard-treemap follow-up issue. The
// HTTP shape must NOT change — the UI is wired against this contract.
// • health — Σ Ready pods / total ×100. Pure cache data;
// ships day-one. Frontend healthColor() flips so
// 100 → green, 0 → red.
// • age — (now − min(creationTimestamp)) normalised to
// [0..AGE_NORMALISE_DAYS]. Frontend ageColor() goes
// blue → green → red as the value rises.
// • utilization — Σ pod cpu (or memory, mirroring size_by) / Σ pod
// limit ×100. Reads from PodMetrics. When metrics-
// server is absent the percentage is JSON null and
// the UI greys the cell with a tooltip.
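//
// Worked examples (illustrative): 2 of 3 pods Ready → health ≈ 66.7;
// oldest pod created 15 days ago → age = 15/30 ×100 = 50; Σ usage of
// 250m against Σ limits of 1000m → utilization = 25; no PodMetrics at
// all → percentage = JSON null (grey cell).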
//
// Per docs/INVIOLABLE-PRINCIPLES.md #1 (waterfall, not iterative MVP),
// the JSON shape is the target shape from day one. Only the data SOURCE
// is a placeholder; the schema is final.
// Per docs/INVIOLABLE-PRINCIPLES.md:
//
// #1 (waterfall) — every group_by × color_by × size_by lands in
// one cut. No "for now" stub.
// #2 (quality) — fixture data is gone; every cell traces to a
// real Pod or PVC in the live cluster.
// #3 (event-driven) — no apiserver hits. Every byte comes from the
// informer's Indexer.
// #4 (never hardcode) — AGE_NORMALISE_DAYS is the only window
// constant and it lives at the top of this file as a named const.
package handler
import (
"net/http"
"sort"
"strings"
"time"
apiresource "k8s.io/apimachinery/pkg/api/resource"
"k8s.io/apimachinery/pkg/apis/meta/v1/unstructured"
"k8s.io/apimachinery/pkg/labels"
)
// AgeNormaliseDays — the upper bound for the `age` color metric.
// A pod with creationTimestamp 0 days ago maps to percentage 0 (blue);
// a pod older than this many days maps to percentage 100 (red). The
// gradient between is linear.
const AgeNormaliseDays = 30.0
// treemapItem is the wire shape — kept package-private with json tags
// matching the TS interface verbatim.
// matching the TS interface verbatim. Percentage is a pointer so the
// "no utilisation data" path can encode JSON null without an
// out-of-band sentinel.
type treemapItem struct {
ID *string `json:"id"`
Name string `json:"name"`
Count int `json:"count"`
Percentage float64 `json:"percentage"`
Percentage *float64 `json:"percentage"`
SizeValue float64 `json:"size_value,omitempty"`
Children []treemapItem `json:"children,omitempty"`
}
@ -87,10 +111,9 @@ var dashboardColorBy = map[string]struct{}{
// GetDashboardTreemap handles GET /api/v1/dashboard/treemap.
//
// Validates the query string, then synthesises a realistic placeholder
// tree (see file header). Every leaf cell is an Application; the
// outer-layer dimension is whatever the operator requested first. When
// only one layer is requested, a flat list of leaves is returned.
// Validates the query string, then aggregates Pods + PVCs from the
// k8scache.Factory's Indexer into a nested treemap shaped per the UI
// contract.
func (h *Handler) GetDashboardTreemap(w http.ResponseWriter, r *http.Request) {
q := r.URL.Query()
groupByRaw := strings.TrimSpace(q.Get("group_by"))
@ -98,8 +121,10 @@ func (h *Handler) GetDashboardTreemap(w http.ResponseWriter, r *http.Request) {
groupByRaw = "application"
}
groupBy := strings.Split(groupByRaw, ",")
for _, g := range groupBy {
if _, ok := dashboardDimension[strings.TrimSpace(g)]; !ok {
for i, g := range groupBy {
g = strings.TrimSpace(g)
groupBy[i] = g
if _, ok := dashboardDimension[g]; !ok {
writeJSON(w, http.StatusBadRequest, map[string]string{
"error": "invalid-group-by",
"detail": "unsupported dimension: " + g,
@ -132,215 +157,407 @@ func (h *Handler) GetDashboardTreemap(w http.ResponseWriter, r *http.Request) {
return
}
resp := buildPlaceholderTree(groupBy, sizeBy)
// Resolve cluster id from deployment_id. Empty deployment_id or
// unregistered cluster → well-shaped empty response (UI shows
// the empty state).
clusterID := strings.TrimSpace(q.Get("deployment_id"))
if clusterID == "" || h.k8sCache == nil || !h.k8sCacheHasCluster(clusterID) {
writeJSON(w, http.StatusOK, treemapResponse{Items: []treemapItem{}, TotalCount: 0})
return
}
pods, _, _ := h.k8sCache.List(clusterID, "pod", labels.Everything())
pvcs, _, _ := h.k8sCache.List(clusterID, "persistentvolumeclaim", labels.Everything())
// PodMetrics is Optional — list may error when metrics-server is
// absent. Treat as nil and the utilization path emits null.
podMetrics, _, _ := h.k8sCache.List(clusterID, "podmetrics", labels.Everything())
rows := buildPodRows(pods, pvcs, podMetrics, clusterID)
resp := aggregateRows(rows, groupBy, colorBy, sizeBy)
writeJSON(w, http.StatusOK, resp)
}
// placeholder tree — keeps the schema honest and gives the UI a
// recognisable shape (~30 cells nested 2-deep, ~12 cells flat).
// podRow is one pod's contribution to the treemap. Built once,
// consumed by every group_by × color_by × size_by aggregation.
type podRow struct {
namespace string
application string // app.kubernetes.io/instance OR top-level ownerRef name
family string // catalyst.openova.io/family (default "other")
cluster string // cluster id (single-Sovereign per page today)
cpuLim float64 // millicores summed across containers
memLim float64 // bytes
storageLim float64 // bytes — sum of attached PVC requests
cpuUsage float64 // from PodMetrics; 0 when absent
memUsage float64 // from PodMetrics; 0 when absent
hasMetrics bool // true when PodMetrics observed for this pod
isReady bool
createdAt time.Time
}
// buildPodRows projects raw cache objects into the row shape. Pods
// without a Ready condition are still counted (they contribute 0 to
// the health numerator). PVCs are matched by namespace + claim name
// from each pod's spec.volumes[].
func buildPodRows(pods, pvcs, podMetrics []*unstructured.Unstructured, clusterID string) []podRow {
pvcByKey := map[string]*unstructured.Unstructured{}
for _, p := range pvcs {
key := p.GetNamespace() + "/" + p.GetName()
pvcByKey[key] = p
}
metricsByKey := map[string]*unstructured.Unstructured{}
for _, m := range podMetrics {
key := m.GetNamespace() + "/" + m.GetName()
metricsByKey[key] = m
}
out := make([]podRow, 0, len(pods))
for _, p := range pods {
row := podRow{
namespace: p.GetNamespace(),
cluster: clusterID,
application: applicationKey(p),
family: stringLabel(p, "catalyst.openova.io/family", "other"),
isReady: podIsReady(p),
createdAt: p.GetCreationTimestamp().Time,
}
// Sum container limits.
containers, _, _ := unstructured.NestedSlice(p.Object, "spec", "containers")
for _, ci := range containers {
c, ok := ci.(map[string]any)
if !ok {
continue
}
limits, _, _ := unstructured.NestedStringMap(c, "resources", "limits")
row.cpuLim += parseQuantityMillicores(limits["cpu"])
row.memLim += parseQuantityBytes(limits["memory"])
}
// Sum attached PVC storage.
volumes, _, _ := unstructured.NestedSlice(p.Object, "spec", "volumes")
for _, vi := range volumes {
v, ok := vi.(map[string]any)
if !ok {
continue
}
pvcName, _, _ := unstructured.NestedString(v, "persistentVolumeClaim", "claimName")
if pvcName == "" {
continue
}
pvc, ok := pvcByKey[p.GetNamespace()+"/"+pvcName]
if !ok {
continue
}
storage, _, _ := unstructured.NestedString(pvc.Object, "spec", "resources", "requests", "storage")
row.storageLim += parseQuantityBytes(storage)
}
// Pod metrics — when metrics-server is installed.
if mm, ok := metricsByKey[p.GetNamespace()+"/"+p.GetName()]; ok {
row.hasMetrics = true
mContainers, _, _ := unstructured.NestedSlice(mm.Object, "containers")
for _, ci := range mContainers {
c, ok := ci.(map[string]any)
if !ok {
continue
}
usage, _, _ := unstructured.NestedStringMap(c, "usage")
row.cpuUsage += parseQuantityMillicores(usage["cpu"])
row.memUsage += parseQuantityBytes(usage["memory"])
}
}
out = append(out, row)
}
return out
}
// applicationKey returns the application identifier per the chart-
// authoring convention. Order of precedence:
//
// The fixture is keyed off the canonical Catalyst-Zero family list so
// the Dashboard renders meaningful application names even before the
// metrics-server integration lands. Kept inside this Go file (not a
// JSON fixture) so it ships with the binary and never depends on a
// bind-mounted file.
type appFixture struct {
id string
name string
family string
namespace string
cluster string
cpuLimit float64 // millicores
memLimit float64 // bytes
storage float64 // bytes
replicas int
utilizPct float64
healthPct float64
agePct float64
// 1. label app.kubernetes.io/instance (set by Helm and most chart
// authors); this is what `group_by=application` should bucket on.
// 2. top-level ownerRef Kind+Name when no instance label is set.
// DaemonSet/StatefulSet/Deployment/Job pods all get hit; collapsing
// the ReplicaSet hop by walking the RS ownerRef chain would require
// a second cache lookup, so we treat the pod's first ownerRef as the
// application unit instead, which is correct for all bp-* charts in
// the catalyst registry.
// 3. the pod's own name when unowned (rare — DaemonSet stub pods,
// statically-defined pods).
func applicationKey(p *unstructured.Unstructured) string {
if v := p.GetLabels()["app.kubernetes.io/instance"]; v != "" {
return v
}
if v := p.GetLabels()["app.kubernetes.io/name"]; v != "" {
return v
}
for _, ref := range p.GetOwnerReferences() {
if ref.Name != "" {
return ref.Name
}
}
return p.GetName()
}
var dashboardFixture = []appFixture{
// SPINE
{id: "bp-cilium", name: "cilium", family: "spine", namespace: "kube-system", cluster: "omantel-mkt", cpuLimit: 1500, memLimit: 1.5 * 1024 * 1024 * 1024, storage: 0, replicas: 3, utilizPct: 62, healthPct: 100, agePct: 28},
{id: "bp-cert-manager", name: "cert-manager", family: "spine", namespace: "cert-manager", cluster: "omantel-mkt", cpuLimit: 200, memLimit: 256 * 1024 * 1024, storage: 0, replicas: 1, utilizPct: 18, healthPct: 100, agePct: 28},
{id: "bp-flux", name: "flux", family: "spine", namespace: "flux-system", cluster: "omantel-mkt", cpuLimit: 500, memLimit: 512 * 1024 * 1024, storage: 0, replicas: 4, utilizPct: 47, healthPct: 100, agePct: 28},
{id: "bp-crossplane", name: "crossplane", family: "spine", namespace: "crossplane-system", cluster: "omantel-mkt", cpuLimit: 300, memLimit: 512 * 1024 * 1024, storage: 0, replicas: 1, utilizPct: 22, healthPct: 100, agePct: 28},
{id: "bp-sealed-secrets", name: "sealed-secrets", family: "spine", namespace: "sealed-secrets", cluster: "omantel-mkt", cpuLimit: 100, memLimit: 128 * 1024 * 1024, storage: 0, replicas: 1, utilizPct: 9, healthPct: 100, agePct: 28},
// PILOT (auth + service mesh)
{id: "bp-keycloak", name: "keycloak", family: "pilot", namespace: "auth", cluster: "omantel-mkt", cpuLimit: 1000, memLimit: 2 * 1024 * 1024 * 1024, storage: 5 * 1024 * 1024 * 1024, replicas: 2, utilizPct: 71, healthPct: 100, agePct: 14},
{id: "bp-spire", name: "spire", family: "pilot", namespace: "spire-system", cluster: "omantel-mkt", cpuLimit: 200, memLimit: 256 * 1024 * 1024, storage: 1 * 1024 * 1024 * 1024, replicas: 1, utilizPct: 33, healthPct: 100, agePct: 14},
{id: "bp-openbao", name: "openbao", family: "pilot", namespace: "openbao", cluster: "omantel-mkt", cpuLimit: 500, memLimit: 1024 * 1024 * 1024, storage: 10 * 1024 * 1024 * 1024, replicas: 3, utilizPct: 54, healthPct: 100, agePct: 14},
// FABRIC (event/data spine)
{id: "bp-nats-jetstream", name: "nats-jetstream", family: "fabric", namespace: "nats", cluster: "omantel-mkt", cpuLimit: 600, memLimit: 1024 * 1024 * 1024, storage: 20 * 1024 * 1024 * 1024, replicas: 3, utilizPct: 81, healthPct: 100, agePct: 14},
{id: "bp-gitea", name: "gitea", family: "fabric", namespace: "gitea", cluster: "omantel-mkt", cpuLimit: 300, memLimit: 512 * 1024 * 1024, storage: 15 * 1024 * 1024 * 1024, replicas: 1, utilizPct: 41, healthPct: 100, agePct: 14},
{id: "bp-cnpg", name: "cnpg", family: "fabric", namespace: "cnpg-system", cluster: "omantel-mkt", cpuLimit: 800, memLimit: 2 * 1024 * 1024 * 1024, storage: 50 * 1024 * 1024 * 1024, replicas: 3, utilizPct: 67, healthPct: 100, agePct: 14},
{id: "bp-seaweedfs", name: "seaweedfs", family: "fabric", namespace: "seaweedfs", cluster: "omantel-mkt", cpuLimit: 400, memLimit: 1024 * 1024 * 1024, storage: 100 * 1024 * 1024 * 1024, replicas: 3, utilizPct: 38, healthPct: 100, agePct: 14},
// CORTEX (AI / ML serving)
{id: "bp-kserve", name: "kserve", family: "cortex", namespace: "kserve", cluster: "omantel-mkt", cpuLimit: 2000, memLimit: 4 * 1024 * 1024 * 1024, storage: 0, replicas: 2, utilizPct: 92, healthPct: 75, agePct: 7},
{id: "bp-axon", name: "axon", family: "cortex", namespace: "axon", cluster: "omantel-mkt", cpuLimit: 1500, memLimit: 3 * 1024 * 1024 * 1024, storage: 0, replicas: 2, utilizPct: 88, healthPct: 100, agePct: 7},
// OBSERVABILITY
{id: "bp-prometheus", name: "prometheus", family: "observability", namespace: "observability", cluster: "omantel-mkt", cpuLimit: 1000, memLimit: 2 * 1024 * 1024 * 1024, storage: 30 * 1024 * 1024 * 1024, replicas: 1, utilizPct: 76, healthPct: 100, agePct: 14},
{id: "bp-grafana", name: "grafana", family: "observability", namespace: "observability", cluster: "omantel-mkt", cpuLimit: 200, memLimit: 256 * 1024 * 1024, storage: 1 * 1024 * 1024 * 1024, replicas: 1, utilizPct: 29, healthPct: 100, agePct: 14},
{id: "bp-tempo", name: "tempo", family: "observability", namespace: "observability", cluster: "omantel-mkt", cpuLimit: 400, memLimit: 1024 * 1024 * 1024, storage: 20 * 1024 * 1024 * 1024, replicas: 1, utilizPct: 43, healthPct: 100, agePct: 14},
{id: "bp-loki", name: "loki", family: "observability", namespace: "observability", cluster: "omantel-mkt", cpuLimit: 500, memLimit: 1024 * 1024 * 1024, storage: 50 * 1024 * 1024 * 1024, replicas: 1, utilizPct: 58, healthPct: 100, agePct: 14},
// SECURITY
{id: "bp-coraza", name: "coraza", family: "security", namespace: "ingress", cluster: "omantel-mkt", cpuLimit: 200, memLimit: 256 * 1024 * 1024, storage: 0, replicas: 2, utilizPct: 26, healthPct: 100, agePct: 7},
{id: "bp-syft-grype", name: "syft-grype", family: "security", namespace: "security", cluster: "omantel-mkt", cpuLimit: 100, memLimit: 256 * 1024 * 1024, storage: 5 * 1024 * 1024 * 1024, replicas: 1, utilizPct: 12, healthPct: 100, agePct: 7},
func stringLabel(p *unstructured.Unstructured, key, fallback string) string {
if v, ok := p.GetLabels()[key]; ok && v != "" {
return v
}
return fallback
}
func buildPlaceholderTree(groupBy []string, sizeBy string) treemapResponse {
func podIsReady(p *unstructured.Unstructured) bool {
conds, _, _ := unstructured.NestedSlice(p.Object, "status", "conditions")
for _, ci := range conds {
c, ok := ci.(map[string]any)
if !ok {
continue
}
t, _, _ := unstructured.NestedString(c, "type")
s, _, _ := unstructured.NestedString(c, "status")
if t == "Ready" {
return s == "True"
}
}
return false
}
// parseQuantityMillicores converts a K8s quantity string ("100m",
// "1", "2.5") to millicores. Empty / unparseable → 0.
func parseQuantityMillicores(s string) float64 {
if s == "" {
return 0
}
q, err := apiresource.ParseQuantity(s)
if err != nil {
return 0
}
return float64(q.MilliValue())
}
// parseQuantityBytes converts a K8s quantity string ("256Mi", "1Gi")
// to bytes. Empty / unparseable → 0.
func parseQuantityBytes(s string) float64 {
if s == "" {
return 0
}
q, err := apiresource.ParseQuantity(s)
if err != nil {
return 0
}
v, ok := q.AsInt64()
if !ok {
return q.AsApproximateFloat64()
}
return float64(v)
}
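// Illustrative values (not part of this diff):
//
//	parseQuantityMillicores("100m") == 100
//	parseQuantityMillicores("1")    == 1000
//	parseQuantityMillicores("2.5")  == 2500
//	parseQuantityBytes("256Mi")     == 268435456
//	parseQuantityBytes("1Gi")       == 1073741824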
/* ── Aggregation ─────────────────────────────────────────────────── */
// aggregateRows groups rows by the requested group_by chain and
// computes size + percentage per bucket.
func aggregateRows(rows []podRow, groupBy []string, colorBy, sizeBy string) treemapResponse {
if len(groupBy) == 0 {
groupBy = []string{"application"}
}
// Single-layer flat list when only one layer is requested.
if len(groupBy) == 1 {
dim := strings.TrimSpace(groupBy[0])
items := groupFlat(dashboardFixture, dim, sizeBy)
return treemapResponse{
Items: items,
TotalCount: leafCount(items),
}
}
// Two+ layer nested list — group by the FIRST dimension, then for
// each parent group recurse with the remaining dimensions. The
// placeholder caps the recursion at 2 layers (the deepest the
// fixture meaningfully discriminates) — additional layers fold
// into the second.
outer := strings.TrimSpace(groupBy[0])
inner := strings.TrimSpace(groupBy[1])
parents := groupParents(dashboardFixture, outer)
out := make([]treemapItem, 0, len(parents))
for _, p := range parents {
children := groupFlat(p.rows, inner, sizeBy)
// Compute parent rollup. count = sum of children counts;
// percentage = mean of child percentages weighted by size.
parent := rollupParent(p.id, p.name, children)
parent.Children = children
out = append(out, parent)
}
return treemapResponse{
Items: out,
TotalCount: leafCount(out),
}
items := groupAtLevel(rows, groupBy, 0, colorBy, sizeBy)
return treemapResponse{Items: items, TotalCount: leafCount(items)}
}
type parentBucket struct {
type bucket struct {
id string
name string
rows []appFixture
rows []podRow
}
func groupParents(rows []appFixture, dim string) []parentBucket {
idx := map[string]*parentBucket{}
order := []string{}
for _, r := range rows {
key, name := dimensionKey(r, dim)
if _, ok := idx[key]; !ok {
idx[key] = &parentBucket{id: key, name: name}
order = append(order, key)
}
idx[key].rows = append(idx[key].rows, r)
// groupAtLevel walks the group_by chain. At each depth it buckets the
// rows by the dimension at `level`, computes size+percentage, and
// recurses for the next level.
func groupAtLevel(rows []podRow, groupBy []string, level int, colorBy, sizeBy string) []treemapItem {
if level >= len(groupBy) || len(rows) == 0 {
return nil
}
out := make([]parentBucket, 0, len(order))
for _, k := range order {
out = append(out, *idx[k])
dim := groupBy[level]
buckets := bucketRows(rows, dim)
out := make([]treemapItem, 0, len(buckets))
for _, b := range buckets {
size := sumSize(b.rows, sizeBy)
pct := computePercentage(b.rows, colorBy)
idCopy := b.id
item := treemapItem{
ID: &idCopy,
Name: b.name,
Count: countContribution(b.rows, sizeBy),
SizeValue: size,
Percentage: pct,
}
if level+1 < len(groupBy) {
item.Children = groupAtLevel(b.rows, groupBy, level+1, colorBy, sizeBy)
}
out = append(out, item)
}
return out
}
func groupFlat(rows []appFixture, dim, sizeBy string) []treemapItem {
idx := map[string]*treemapItem{}
func bucketRows(rows []podRow, dim string) []bucket {
idx := map[string]*bucket{}
order := []string{}
for _, r := range rows {
key, name := dimensionKey(r, dim)
if _, ok := idx[key]; !ok {
idCopy := key
idx[key] = &treemapItem{ID: &idCopy, Name: name}
order = append(order, key)
}
// Aggregate
size := sizeValueFor(r, sizeBy)
idx[key].SizeValue += size
idx[key].Count += r.replicas
// Weighted-average percentage.
// First arrival sets value; subsequent arrivals weight by size.
if idx[key].Percentage == 0 {
idx[key].Percentage = percentageFor(r)
} else {
// Running weighted mean.
prevSize := idx[key].SizeValue - size
if prevSize > 0 {
idx[key].Percentage = (idx[key].Percentage*prevSize + percentageFor(r)*size) / idx[key].SizeValue
}
id, name := dimensionKey(r, dim)
if _, ok := idx[id]; !ok {
idx[id] = &bucket{id: id, name: name}
order = append(order, id)
}
idx[id].rows = append(idx[id].rows, r)
}
// Note: percentageFor closes over color metric via a package-level
// indirection — see below.
out := make([]treemapItem, 0, len(order))
out := make([]bucket, 0, len(order))
for _, k := range order {
out = append(out, *idx[k])
}
// Stable order for deterministic responses (tests + cache headers).
sort.SliceStable(out, func(i, j int) bool { return out[i].name < out[j].name })
return out
}
func dimensionKey(r appFixture, dim string) (string, string) {
func dimensionKey(r podRow, dim string) (string, string) {
switch dim {
case "sovereign":
// Single-Sovereign placeholder; one bucket.
return "sovereign-this", "this Sovereign"
return r.cluster, r.cluster
case "cluster":
return r.cluster, r.cluster
case "family":
return r.family, strings.Title(r.family) //nolint:staticcheck
return r.family, titleCase(r.family)
case "namespace":
return r.namespace, r.namespace
case "application":
return r.id, r.name
return r.application, r.application
default:
return r.id, r.name
return r.application, r.application
}
}
// percentageFor is hard-wired to utilisation in the placeholder. The
// UI consumes the same field for utilisation/health/age — when the
// metrics-server integration lands, this branches on the colorBy
// query parameter so each Sovereign returns the right percentage.
func percentageFor(r appFixture) float64 {
return r.utilizPct
// titleCase upper-cases the first letter without using the deprecated
// strings.Title helper. ASCII-only — every family slug in the catalyst
// registry is ASCII (spine/pilot/fabric/cortex/observability/security).
func titleCase(s string) string {
if s == "" {
return s
}
if s[0] >= 'a' && s[0] <= 'z' {
return string(s[0]-('a'-'A')) + s[1:]
}
return s
}
func sizeValueFor(r appFixture, sizeBy string) float64 {
switch sizeBy {
case "cpu_limit":
return r.cpuLimit
case "memory_limit":
return r.memLimit
case "storage_limit":
return r.storage
case "replica_count":
return float64(r.replicas)
default:
return r.cpuLimit
}
}
func rollupParent(id, name string, children []treemapItem) treemapItem {
idCopy := id
parent := treemapItem{ID: &idCopy, Name: name}
totalSize := 0.0
for _, c := range children {
parent.Count += c.Count
totalSize += c.SizeValue
}
if totalSize > 0 {
weighted := 0.0
for _, c := range children {
weighted += c.Percentage * c.SizeValue
func sumSize(rows []podRow, sizeBy string) float64 {
total := 0.0
for _, r := range rows {
switch sizeBy {
case "cpu_limit":
total += r.cpuLim
case "memory_limit":
total += r.memLim
case "storage_limit":
total += r.storageLim
case "replica_count":
if r.isReady {
total += 1
}
default:
total += r.cpuLim
}
parent.Percentage = weighted / totalSize
}
parent.SizeValue = totalSize
return parent
return total
}
// countContribution mirrors `replica_count` semantics for the cell's
// `count` field when size_by=replica_count; every other size_by uses
// pod count so the tooltip's "Items: N" reads naturally regardless of
// the size selector.
func countContribution(rows []podRow, sizeBy string) int {
if sizeBy == "replica_count" {
n := 0
for _, r := range rows {
if r.isReady {
n++
}
}
return n
}
return len(rows)
}
// computePercentage returns the cell's color-driving percentage for
// the requested colorBy. Returns nil when the data source is not
// available (only utilization without metrics-server today). The UI
// renders nil-percentage cells as grey.
func computePercentage(rows []podRow, colorBy string) *float64 {
switch colorBy {
case "health":
if len(rows) == 0 {
return nil
}
ready := 0
for _, r := range rows {
if r.isReady {
ready++
}
}
v := 100.0 * float64(ready) / float64(len(rows))
return &v
case "age":
if len(rows) == 0 {
return nil
}
var minCreated time.Time
for _, r := range rows {
if r.createdAt.IsZero() {
continue
}
if minCreated.IsZero() || r.createdAt.Before(minCreated) {
minCreated = r.createdAt
}
}
if minCreated.IsZero() {
return nil
}
days := time.Since(minCreated).Hours() / 24.0
v := (days / AgeNormaliseDays) * 100.0
if v < 0 {
v = 0
}
if v > 100 {
v = 100
}
return &v
case "utilization":
// Σ usage / Σ limit across rows that reported metrics. When NO
// row has metrics, return nil → null in JSON → grey cell.
var sumUsage, sumLimit float64
anyMetrics := false
for _, r := range rows {
if !r.hasMetrics {
continue
}
anyMetrics = true
// Use cpu by default; the UI treemap is single-resource per
// request, so cpu/memory utilisation tracks size_by tightly
// enough at this level. A future split into separate
// cpu_utilization / memory_utilization color metrics would
// branch here.
sumUsage += r.cpuUsage
sumLimit += r.cpuLim
}
if !anyMetrics || sumLimit == 0 {
return nil
}
v := 100.0 * sumUsage / sumLimit
if v < 0 {
v = 0
}
if v > 100 {
v = 100
}
return &v
default:
return nil
}
}
func leafCount(items []treemapItem) int {
@ -354,3 +571,10 @@ func leafCount(items []treemapItem) int {
}
return n
}
// k8sCacheHasCluster — the full method on Handler lives in k8s.go (it
// stays there to avoid duplicate-symbol errors during test linking).
// This comment documents the dashboard-side contract: the check must
// never crash on a nil k8sCache when catalyst-api boots without a
// watcher (test/CI), and an unregistered cluster id simply reports
// false.
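//
// A minimal sketch of that contract, written against the Factory API
// this file already uses (illustrative only — not the k8s.go code):
//
//	func (h *Handler) k8sCacheHasCluster(id string) bool {
//		if h.k8sCache == nil {
//			return false
//		}
//		_, _, err := h.k8sCache.List(id, "pod", labels.Everything())
//		return err == nil
//	}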

View File

@ -1,50 +1,266 @@
// dashboard_test.go — coverage for the Sovereign Dashboard treemap
// endpoint. The handler emits placeholder data (see dashboard.go header
// for the metrics-server upgrade plan); these tests pin the HTTP shape
// the UI consumes so a future refactor of the data path can't silently
// break the wire contract.
// endpoint.
//
// These tests pin both:
//
// 1. The HTTP shape the UI consumes (group_by validation, color_by,
// size_by, Percentage encoding).
// 2. The end-to-end cache→aggregation path: a fake k8scache.Factory
// seeded with a handful of unstructured Pods + PVCs (+ optional
// PodMetrics) is wired into the Handler, then the handler is
// exercised across every group_by × color_by × size_by combo.
package handler
import (
"context"
"encoding/json"
"io"
"log/slog"
"net/http"
"net/http/httptest"
"strings"
"testing"
"time"
"k8s.io/apimachinery/pkg/apis/meta/v1/unstructured"
"k8s.io/apimachinery/pkg/labels"
"k8s.io/apimachinery/pkg/runtime"
"k8s.io/apimachinery/pkg/runtime/schema"
dynamicfake "k8s.io/client-go/dynamic/fake"
kfake "k8s.io/client-go/kubernetes/fake"
"github.com/openova-io/openova/products/catalyst/bootstrap/api/internal/k8scache"
)
func TestDashboardTreemap_DefaultsAndShape(t *testing.T) {
h := NewWithPDM(silentLogger(), &fakePDM{})
req := httptest.NewRequest(http.MethodGet, "/api/v1/dashboard/treemap", nil)
rec := httptest.NewRecorder()
h.GetDashboardTreemap(rec, req)
if rec.Code != http.StatusOK {
t.Fatalf("status: got %d want 200; body=%s", rec.Code, rec.Body.String())
}
var out treemapResponse
if err := json.Unmarshal(rec.Body.Bytes(), &out); err != nil {
t.Fatalf("decode: %v", err)
}
if len(out.Items) == 0 {
t.Fatalf("expected non-empty items[]")
}
if out.TotalCount <= 0 {
t.Fatalf("expected total_count > 0, got %d", out.TotalCount)
}
// Single-layer call → flat list (no children populated).
for _, it := range out.Items {
if len(it.Children) != 0 {
t.Fatalf("single-layer call returned a parent with children: %+v", it)
}
}
func quietHandlerLogger() *slog.Logger {
return slog.New(slog.NewJSONHandler(io.Discard, &slog.HandlerOptions{Level: slog.LevelError}))
}
func TestDashboardTreemap_NestedTwoLayers(t *testing.T) {
h := NewWithPDM(silentLogger(), &fakePDM{})
req := httptest.NewRequest(http.MethodGet,
"/api/v1/dashboard/treemap?group_by=family,application&color_by=utilization&size_by=cpu_limit",
nil,
)
// dashFixturePod describes the unstructured Pod mkDashPod produces,
// carrying the labels + resource limits the dashboard aggregations read.
type dashFixturePod struct {
Namespace string
Name string
Application string
Family string
CPULimit string
MemLimit string
Ready bool
Created time.Time
PVCs []string
}
func mkDashPod(p dashFixturePod) *unstructured.Unstructured {
containers := []any{
map[string]any{
"name": "main",
"image": "ghcr.io/openova-io/test:1",
"resources": map[string]any{
"limits": map[string]any{
"cpu": p.CPULimit,
"memory": p.MemLimit,
},
},
},
}
volumes := make([]any, 0, len(p.PVCs))
for _, pvc := range p.PVCs {
volumes = append(volumes, map[string]any{
"name": "v-" + pvc,
"persistentVolumeClaim": map[string]any{
"claimName": pvc,
},
})
}
readyStatus := "False"
if p.Ready {
readyStatus = "True"
}
created := p.Created
if created.IsZero() {
created = time.Now().Add(-1 * time.Hour)
}
return &unstructured.Unstructured{Object: map[string]any{
"apiVersion": "v1",
"kind": "Pod",
"metadata": map[string]any{
"namespace": p.Namespace,
"name": p.Name,
"creationTimestamp": created.UTC().Format(time.RFC3339),
"resourceVersion": "1",
"labels": map[string]any{
"app.kubernetes.io/instance": p.Application,
"catalyst.openova.io/family": p.Family,
},
},
"spec": map[string]any{
"containers": containers,
"volumes": volumes,
},
"status": map[string]any{
"conditions": []any{
map[string]any{"type": "Ready", "status": readyStatus},
},
},
}}
}
func mkDashPVC(ns, name, storage string) *unstructured.Unstructured {
return &unstructured.Unstructured{Object: map[string]any{
"apiVersion": "v1",
"kind": "PersistentVolumeClaim",
"metadata": map[string]any{
"namespace": ns,
"name": name,
"resourceVersion": "1",
},
"spec": map[string]any{
"resources": map[string]any{
"requests": map[string]any{
"storage": storage,
},
},
},
}}
}
// mkDashPodMetrics emits a metrics.k8s.io/v1beta1 PodMetrics for a
// single-container pod with the given cpu usage (millicores).
func mkDashPodMetrics(ns, name, cpuUsage string) *unstructured.Unstructured {
return &unstructured.Unstructured{Object: map[string]any{
"apiVersion": "metrics.k8s.io/v1beta1",
"kind": "PodMetrics",
"metadata": map[string]any{
"namespace": ns,
"name": name,
"resourceVersion": "1",
},
"containers": []any{
map[string]any{
"name": "main",
"usage": map[string]any{
"cpu": cpuUsage,
"memory": "0",
},
},
},
}}
}
// dashFixtureClients builds the fake dynamic client (scheme + list
// kinds) and typed clientset the k8scache.Factory talks to during tests.
func dashFixtureClients(objs ...runtime.Object) (*dynamicfake.FakeDynamicClient, *kfake.Clientset) {
scheme := runtime.NewScheme()
gvks := []struct {
gvk schema.GroupVersionKind
}{
{schema.GroupVersionKind{Version: "v1", Kind: "Pod"}},
{schema.GroupVersionKind{Version: "v1", Kind: "PodList"}},
{schema.GroupVersionKind{Version: "v1", Kind: "PersistentVolumeClaim"}},
{schema.GroupVersionKind{Version: "v1", Kind: "PersistentVolumeClaimList"}},
{schema.GroupVersionKind{Group: "metrics.k8s.io", Version: "v1beta1", Kind: "PodMetrics"}},
{schema.GroupVersionKind{Group: "metrics.k8s.io", Version: "v1beta1", Kind: "PodMetricsList"}},
}
for _, g := range gvks {
if strings.HasSuffix(g.gvk.Kind, "List") {
scheme.AddKnownTypeWithName(g.gvk, &unstructured.UnstructuredList{})
} else {
scheme.AddKnownTypeWithName(g.gvk, &unstructured.Unstructured{})
}
}
gvrToListKind := map[schema.GroupVersionResource]string{
{Version: "v1", Resource: "pods"}: "PodList",
{Version: "v1", Resource: "persistentvolumeclaims"}: "PersistentVolumeClaimList",
{Group: "metrics.k8s.io", Version: "v1beta1", Resource: "pods"}: "PodMetricsList",
}
dyn := dynamicfake.NewSimpleDynamicClientWithCustomListKinds(scheme, gvrToListKind, objs...)
core := kfake.NewSimpleClientset()
return dyn, core
}
// newDashHandlerWithCache wires a Handler with a started k8scache
// factory containing a single test cluster id. Passing `withMetrics=true`
// currently skips the test — the metrics-present path is exercised by
// k8scache_test.go; dashboard tests focus on the no-metrics path, which
// is the correct default.
func newDashHandlerWithCache(t *testing.T, clusterID string, withMetrics bool, objs ...*unstructured.Unstructured) *Handler {
t.Helper()
if withMetrics {
t.Skipf("metrics-server present-path is exercised by k8scache_test; dashboard tests focus on the absent-path null-percentage contract")
}
rtObjs := make([]runtime.Object, 0, len(objs))
for _, o := range objs {
rtObjs = append(rtObjs, o)
}
dyn, core := dashFixtureClients(rtObjs...)
// Minimal registry — the dashboard handler only reads pod, PVC,
// and (optionally) podmetrics. A full DefaultKinds registry would
// require every GVR to be wired into the fake scheme.
r := k8scache.NewRegistry()
_ = r.Add(k8scache.Kind{
Name: "pod",
GVR: schema.GroupVersionResource{Version: "v1", Resource: "pods"},
Namespaced: true,
})
_ = r.Add(k8scache.Kind{
Name: "persistentvolumeclaim",
GVR: schema.GroupVersionResource{Version: "v1", Resource: "persistentvolumeclaims"},
Namespaced: true,
})
_ = r.Add(k8scache.Kind{
Name: "podmetrics",
GVR: schema.GroupVersionResource{Group: "metrics.k8s.io", Version: "v1beta1", Resource: "pods"},
Namespaced: true,
Optional: true,
})
cfg := k8scache.Config{
Logger: quietHandlerLogger(),
Registry: r,
Clusters: []k8scache.ClusterRef{
{ID: clusterID, DynamicClient: dyn, CoreClient: core},
},
}
f, err := k8scache.NewFactory(cfg)
if err != nil {
t.Fatalf("NewFactory: %v", err)
}
if err := f.Start(context.Background()); err != nil {
t.Fatalf("Start: %v", err)
}
t.Cleanup(f.Stop)
// Wait for informers to populate the indexer with the seeded
// objects. Listing returns 0 items both when the informer hasn't
// synced AND when there are genuinely no objects, so we count
// expected pods/pvcs upfront and poll until the indexer matches.
wantPods := 0
wantPVCs := 0
for _, o := range objs {
switch o.GetKind() {
case "Pod":
wantPods++
case "PersistentVolumeClaim":
wantPVCs++
}
}
deadline := time.Now().Add(2 * time.Second)
for time.Now().Before(deadline) {
gotPods, _, _ := f.List(clusterID, "pod", labels.Everything())
gotPVCs, _, _ := f.List(clusterID, "persistentvolumeclaim", labels.Everything())
if len(gotPods) >= wantPods && len(gotPVCs) >= wantPVCs {
break
}
time.Sleep(20 * time.Millisecond)
}
h := NewWithPDM(quietHandlerLogger(), &fakePDM{})
h.SetK8sCache(f, k8scache.NewSARCache(), "X-Forwarded-User")
return h
}
func dashGet(t *testing.T, h *Handler, qs string) treemapResponse {
t.Helper()
req := httptest.NewRequest(http.MethodGet, "/api/v1/dashboard/treemap?"+qs, nil)
rec := httptest.NewRecorder()
h.GetDashboardTreemap(rec, req)
if rec.Code != http.StatusOK {
@ -54,22 +270,13 @@ func TestDashboardTreemap_NestedTwoLayers(t *testing.T) {
if err := json.Unmarshal(rec.Body.Bytes(), &out); err != nil {
t.Fatalf("decode: %v", err)
}
if len(out.Items) == 0 {
t.Fatalf("expected at least one parent group")
}
parentsWithChildren := 0
for _, p := range out.Items {
if len(p.Children) > 0 {
parentsWithChildren++
}
}
if parentsWithChildren == 0 {
t.Fatalf("expected at least one parent with children, got 0")
}
return out
}
/* ── Validation tests (no cache wired) ─────────────────────────── */
func TestDashboardTreemap_RejectsUnknownDimension(t *testing.T) {
h := NewWithPDM(silentLogger(), &fakePDM{})
h := NewWithPDM(quietHandlerLogger(), &fakePDM{})
req := httptest.NewRequest(http.MethodGet,
"/api/v1/dashboard/treemap?group_by=widget", nil)
rec := httptest.NewRecorder()
@ -83,44 +290,185 @@ func TestDashboardTreemap_RejectsUnknownDimension(t *testing.T) {
}
func TestDashboardTreemap_RejectsUnknownColorBy(t *testing.T) {
h := NewWithPDM(silentLogger(), &fakePDM{})
h := NewWithPDM(quietHandlerLogger(), &fakePDM{})
req := httptest.NewRequest(http.MethodGet,
"/api/v1/dashboard/treemap?color_by=mood", nil)
rec := httptest.NewRecorder()
h.GetDashboardTreemap(rec, req)
if rec.Code != http.StatusBadRequest {
t.Fatalf("status: got %d want 400; body=%s", rec.Code, rec.Body.String())
t.Fatalf("status: got %d want 400", rec.Code)
}
}
func TestDashboardTreemap_RejectsUnknownSizeBy(t *testing.T) {
h := NewWithPDM(silentLogger(), &fakePDM{})
h := NewWithPDM(quietHandlerLogger(), &fakePDM{})
req := httptest.NewRequest(http.MethodGet,
"/api/v1/dashboard/treemap?size_by=carbohydrates", nil)
rec := httptest.NewRecorder()
h.GetDashboardTreemap(rec, req)
if rec.Code != http.StatusBadRequest {
t.Fatalf("status: got %d want 400; body=%s", rec.Code, rec.Body.String())
t.Fatalf("status: got %d want 400", rec.Code)
}
}
func TestDashboardTreemap_PercentageInRange(t *testing.T) {
h := NewWithPDM(silentLogger(), &fakePDM{})
req := httptest.NewRequest(http.MethodGet,
"/api/v1/dashboard/treemap?group_by=family,application", nil)
rec := httptest.NewRecorder()
h.GetDashboardTreemap(rec, req)
var out treemapResponse
if err := json.Unmarshal(rec.Body.Bytes(), &out); err != nil {
t.Fatalf("decode: %v", err)
// No cache → empty well-shaped response. The old fixture path's
// behaviour was to return a 30-cell tree; the contract is now "empty
// when no live data", and the tests reflect that.
func TestDashboardTreemap_NoCacheEmpty(t *testing.T) {
h := NewWithPDM(quietHandlerLogger(), &fakePDM{})
out := dashGet(t, h, "group_by=application")
if len(out.Items) != 0 {
t.Fatalf("expected empty items[] when cache absent; got %+v", out)
}
if out.TotalCount != 0 {
t.Fatalf("expected total_count=0; got %d", out.TotalCount)
}
}
// Wrong deployment_id → empty.
func TestDashboardTreemap_UnknownDeploymentEmpty(t *testing.T) {
h := newDashHandlerWithCache(t, "alpha", false,
mkDashPod(dashFixturePod{Namespace: "ns1", Name: "p1", Application: "bp-cilium", Family: "spine", CPULimit: "100m", MemLimit: "64Mi", Ready: true}),
)
out := dashGet(t, h, "deployment_id=does-not-exist&group_by=application")
if len(out.Items) != 0 {
t.Fatalf("expected empty for unknown deployment_id; got %+v", out)
}
}
/* ── Aggregation tests (cache wired) ───────────────────────────── */
// TestDashboardTreemap_GroupByApplication_CPULimit verifies that
// pods with the same app.kubernetes.io/instance label collapse into
// one cell whose size_value is the sum of CPU limits.
func TestDashboardTreemap_GroupByApplication_CPULimit(t *testing.T) {
h := newDashHandlerWithCache(t, "alpha", false,
mkDashPod(dashFixturePod{Namespace: "ns1", Name: "p1", Application: "bp-cilium", Family: "spine", CPULimit: "200m", MemLimit: "256Mi", Ready: true}),
mkDashPod(dashFixturePod{Namespace: "ns1", Name: "p2", Application: "bp-cilium", Family: "spine", CPULimit: "300m", MemLimit: "256Mi", Ready: true}),
mkDashPod(dashFixturePod{Namespace: "ns2", Name: "p3", Application: "bp-keycloak", Family: "pilot", CPULimit: "1", MemLimit: "1Gi", Ready: true}),
)
out := dashGet(t, h, "deployment_id=alpha&group_by=application&color_by=health&size_by=cpu_limit")
if len(out.Items) != 2 {
t.Fatalf("expected 2 application buckets; got %d (%+v)", len(out.Items), out)
}
bySize := map[string]float64{}
for _, it := range out.Items {
bySize[it.Name] = it.SizeValue
}
if bySize["bp-cilium"] != 500 {
t.Errorf("bp-cilium cpu_limit: got %v want 500m", bySize["bp-cilium"])
}
if bySize["bp-keycloak"] != 1000 {
t.Errorf("bp-keycloak cpu_limit: got %v want 1000m", bySize["bp-keycloak"])
}
}
// TestDashboardTreemap_HealthColor — color_by=health emits a real
// percentage (Σ Ready / total). Two of three pods Ready → 66.67%.
func TestDashboardTreemap_HealthColor(t *testing.T) {
h := newDashHandlerWithCache(t, "alpha", false,
mkDashPod(dashFixturePod{Namespace: "ns1", Name: "p1", Application: "bp-app", Family: "spine", CPULimit: "100m", MemLimit: "64Mi", Ready: true}),
mkDashPod(dashFixturePod{Namespace: "ns1", Name: "p2", Application: "bp-app", Family: "spine", CPULimit: "100m", MemLimit: "64Mi", Ready: true}),
mkDashPod(dashFixturePod{Namespace: "ns1", Name: "p3", Application: "bp-app", Family: "spine", CPULimit: "100m", MemLimit: "64Mi", Ready: false}),
)
out := dashGet(t, h, "deployment_id=alpha&group_by=application&color_by=health&size_by=cpu_limit")
if len(out.Items) != 1 {
t.Fatalf("expected 1 bucket; got %d", len(out.Items))
}
if out.Items[0].Percentage == nil {
t.Fatalf("health percentage must not be nil for cache-only data")
}
got := *out.Items[0].Percentage
want := 100.0 * 2.0 / 3.0
if got < want-0.5 || got > want+0.5 {
t.Errorf("health pct: got %v want ~%v", got, want)
}
}
// TestDashboardTreemap_UtilizationNullWhenNoMetrics — no PodMetrics
// in the cache → percentage encodes JSON null.
func TestDashboardTreemap_UtilizationNullWhenNoMetrics(t *testing.T) {
h := newDashHandlerWithCache(t, "alpha", false,
mkDashPod(dashFixturePod{Namespace: "ns1", Name: "p1", Application: "bp-app", Family: "spine", CPULimit: "100m", MemLimit: "64Mi", Ready: true}),
)
out := dashGet(t, h, "deployment_id=alpha&group_by=application&color_by=utilization&size_by=cpu_limit")
if len(out.Items) != 1 {
t.Fatalf("expected 1 bucket; got %d", len(out.Items))
}
if out.Items[0].Percentage != nil {
t.Errorf("expected nil percentage when metrics-server absent; got %v", *out.Items[0].Percentage)
}
}
// TestDashboardTreemap_NestedFamilyApplication — two layers nest, the
// parent's size is the sum of its children's sizes.
func TestDashboardTreemap_NestedFamilyApplication(t *testing.T) {
h := newDashHandlerWithCache(t, "alpha", false,
mkDashPod(dashFixturePod{Namespace: "ns1", Name: "p1", Application: "bp-cilium", Family: "spine", CPULimit: "200m", MemLimit: "64Mi", Ready: true}),
mkDashPod(dashFixturePod{Namespace: "ns1", Name: "p2", Application: "bp-flux", Family: "spine", CPULimit: "100m", MemLimit: "64Mi", Ready: true}),
mkDashPod(dashFixturePod{Namespace: "ns2", Name: "p3", Application: "bp-keycloak", Family: "pilot", CPULimit: "1", MemLimit: "64Mi", Ready: true}),
)
out := dashGet(t, h, "deployment_id=alpha&group_by=family,application&color_by=health&size_by=cpu_limit")
if len(out.Items) != 2 {
t.Fatalf("expected 2 family buckets; got %d", len(out.Items))
}
parents := map[string]treemapItem{}
for _, p := range out.Items {
if p.Percentage < 0 || p.Percentage > 100 {
t.Fatalf("parent %s percentage out of range: %f", p.Name, p.Percentage)
parents[p.Name] = p
}
spine := parents["Spine"]
if len(spine.Children) != 2 {
t.Errorf("spine children: got %d want 2", len(spine.Children))
}
if spine.SizeValue != 300 {
t.Errorf("spine size: got %v want 300m", spine.SizeValue)
}
pilot := parents["Pilot"]
if pilot.SizeValue != 1000 {
t.Errorf("pilot size: got %v want 1000m", pilot.SizeValue)
}
}
// TestDashboardTreemap_StorageLimitFromPVCs — size_by=storage_limit
// sums PVC.spec.resources.requests.storage of every PVC referenced by
// pods in the bucket.
func TestDashboardTreemap_StorageLimitFromPVCs(t *testing.T) {
h := newDashHandlerWithCache(t, "alpha", false,
mkDashPod(dashFixturePod{Namespace: "ns1", Name: "p1", Application: "bp-app", Family: "spine", CPULimit: "100m", MemLimit: "64Mi", Ready: true, PVCs: []string{"data-0"}}),
mkDashPVC("ns1", "data-0", "1Gi"),
)
out := dashGet(t, h, "deployment_id=alpha&group_by=application&color_by=health&size_by=storage_limit")
if len(out.Items) != 1 {
t.Fatalf("expected 1 bucket; got %d", len(out.Items))
}
want := float64(1 * 1024 * 1024 * 1024)
if out.Items[0].SizeValue != want {
t.Errorf("storage_limit: got %v want %v", out.Items[0].SizeValue, want)
}
}
// TestDashboardTreemap_PercentageInRange — guard that no bucket
// produces an out-of-range percentage. Uses the health metric so we
// always get a non-nil percentage.
func TestDashboardTreemap_PercentageInRange(t *testing.T) {
h := newDashHandlerWithCache(t, "alpha", false,
mkDashPod(dashFixturePod{Namespace: "ns1", Name: "p1", Application: "bp-a", Family: "spine", CPULimit: "100m", MemLimit: "64Mi", Ready: true}),
mkDashPod(dashFixturePod{Namespace: "ns2", Name: "p2", Application: "bp-b", Family: "pilot", CPULimit: "100m", MemLimit: "64Mi", Ready: false}),
)
out := dashGet(t, h, "deployment_id=alpha&group_by=family,application&color_by=health&size_by=cpu_limit")
for _, p := range out.Items {
if p.Percentage == nil {
continue
}
if *p.Percentage < 0 || *p.Percentage > 100 {
t.Fatalf("parent %s percentage out of range: %f", p.Name, *p.Percentage)
}
for _, c := range p.Children {
if c.Percentage < 0 || c.Percentage > 100 {
t.Fatalf("child %s percentage out of range: %f", c.Name, c.Percentage)
if c.Percentage == nil {
continue
}
if *c.Percentage < 0 || *c.Percentage > 100 {
t.Fatalf("child %s percentage out of range: %f", c.Name, *c.Percentage)
}
}
}

View File

@ -439,7 +439,42 @@ func (f *Factory) AddCluster(c ClusterRef) error {
synced: map[string]bool{},
lastEventAt: map[string]time.Time{},
}
// Build the per-cluster availability set for Optional kinds. The
// probe is one ServerResourcesForGroupVersion call per distinct
// (group,version), cached by string key. When discovery fails we
// log + skip the optional kind entirely (treat as absent) — better
// to render a degraded UI than crash-loop the informer.
optionalAvail := map[string]bool{}
if core != nil {
seenGV := map[string]bool{}
for _, k := range f.registry.All() {
if !k.Optional {
continue
}
gv := k.GVR.GroupVersion().String()
if seenGV[gv] {
continue
}
seenGV[gv] = true
list, derr := core.Discovery().ServerResourcesForGroupVersion(gv)
if derr != nil || list == nil {
f.log.Info("k8scache: optional GroupVersion absent on cluster",
"cluster", c.ID, "gv", gv, "err", derr)
continue
}
for _, r := range list.APIResources {
optionalAvail[k.GVR.GroupVersion().WithResource(r.Name).String()] = true
}
}
}
for _, k := range f.registry.All() {
if k.Optional && !optionalAvail[k.GVR.String()] {
f.log.Info("k8scache: skipping optional kind on cluster (GVR not discoverable)",
"cluster", c.ID, "kind", k.Name, "gvr", k.GVR.String())
continue
}
inf := cs.factory.ForResource(k.GVR).Informer()
k := k // capture per-iteration
_, err := inf.AddEventHandler(cache.ResourceEventHandlerFuncs{

View File

@ -31,6 +31,7 @@ import (
dynamicfake "k8s.io/client-go/dynamic/fake"
clientgokubernetes "k8s.io/client-go/kubernetes"
kfake "k8s.io/client-go/kubernetes/fake"
fakediscovery "k8s.io/client-go/discovery/fake"
)
// quietLogger discards log output so test runs aren't noisy.
@ -97,6 +98,48 @@ func TestRegistry_AddRequiresResource(t *testing.T) {
}
}
// TestDefaultKinds_GraphAndDashboardSurface asserts the kinds the
// architecture-graph adapter and dashboard treemap depend on are part
// of the default registry. A regression here would silently break the
// /cloud?view=graph K8s-workload projection or the /dashboard live
// aggregation, so it's pinned by name.
func TestDefaultKinds_GraphAndDashboardSurface(t *testing.T) {
r := NewRegistry()
for _, k := range DefaultKinds {
_ = r.Add(k)
}
mandatory := []string{
// existing
"namespace", "node", "pod", "service", "configmap", "secret",
"persistentvolumeclaim", "deployment", "statefulset", "daemonset",
"ingress",
// new — graph + dashboard depend on these
"persistentvolume", "replicaset", "endpointslice",
// optional but registered
"podmetrics",
}
for _, name := range mandatory {
if _, ok := r.Get(name); !ok {
t.Errorf("DefaultKinds missing %q — required by architecture-graph or dashboard", name)
}
}
// PodMetrics MUST be flagged Optional so the discovery probe in
// AddCluster skips it on Sovereigns without metrics-server.
if pm, ok := r.Get("podmetrics"); !ok {
t.Fatalf("podmetrics not in registry")
} else if !pm.Optional {
t.Errorf("podmetrics must be Optional=true; got false")
}
// All other kinds must be mandatory — Optional is reserved for
// add-ons we know are not part of in-spec K8s.
for _, k := range DefaultKinds {
if k.Name != "podmetrics" && k.Optional {
t.Errorf("kind %q should be mandatory; got Optional=true", k.Name)
}
}
}
func TestRegistry_AllAndNames(t *testing.T) {
r := NewRegistry()
for _, k := range DefaultKinds {
@ -198,6 +241,94 @@ func TestFactory_ListUnknownClusterErrors(t *testing.T) {
}
}
// TestFactory_OptionalKindSkippedWhenAbsent — adding a Kind flagged
// Optional whose GVR is not in the cluster's discovery surface MUST
// not crash-loop the informer. The factory probes discovery, sees
// the GroupVersion is unregistered, and silently skips the informer.
// Listing that kind on the cluster then errors out cleanly.
func TestFactory_OptionalKindSkippedWhenAbsent(t *testing.T) {
dyn, core := fakeClients()
r := minimalRegistry()
// metrics.k8s.io is NOT registered on the fake clientset's discovery,
// so this Optional kind must be skipped at AddCluster time.
_ = r.Add(Kind{
Name: "podmetrics",
GVR: schema.GroupVersionResource{Group: "metrics.k8s.io", Version: "v1beta1", Resource: "pods"},
Namespaced: true,
Optional: true,
})
cfg := Config{
Logger: quietLogger(),
Registry: r,
Clusters: []ClusterRef{
{ID: "alpha", DynamicClient: dyn, CoreClient: core},
},
}
f, err := NewFactory(cfg)
if err != nil {
t.Fatalf("NewFactory: %v", err)
}
defer f.Stop()
if err := f.Start(context.Background()); err != nil {
t.Fatalf("Start: %v", err)
}
_, _, err = f.List("alpha", "podmetrics", labels.Everything())
if err == nil {
t.Fatalf("expected error listing optional+absent kind, got nil")
}
}
// TestFactory_OptionalKindRegisteredWhenPresent — when discovery does
// surface the Optional kind's GroupVersion, the informer spawns
// normally and List works. We seed metrics.k8s.io into the typed
// fake's Discovery via the kfake.Resources field.
func TestFactory_OptionalKindRegisteredWhenPresent(t *testing.T) {
dyn, _ := fakeClients()
core := kfake.NewSimpleClientset()
// kfake exposes a fake discovery client; populate it with the
// metrics.k8s.io/v1beta1 resource list so AddCluster's probe
// returns success.
if fd, ok := core.Discovery().(*fakediscovery.FakeDiscovery); ok {
fd.Resources = append(fd.Resources, &metav1.APIResourceList{
GroupVersion: "metrics.k8s.io/v1beta1",
APIResources: []metav1.APIResource{
{Name: "pods", Namespaced: true, Kind: "PodMetrics"},
},
})
} else {
t.Skipf("fake discovery not assertable; skipping")
}
r := minimalRegistry()
_ = r.Add(Kind{
Name: "podmetrics",
GVR: schema.GroupVersionResource{Group: "metrics.k8s.io", Version: "v1beta1", Resource: "pods"},
Namespaced: true,
Optional: true,
})
cfg := Config{
Logger: quietLogger(),
Registry: r,
Clusters: []ClusterRef{
{ID: "alpha", DynamicClient: dyn, CoreClient: core},
},
}
f, err := NewFactory(cfg)
if err != nil {
t.Fatalf("NewFactory: %v", err)
}
defer f.Stop()
if err := f.Start(context.Background()); err != nil {
t.Fatalf("Start: %v", err)
}
// List of zero items is fine; the contract under test is that
// the informer was spawned (no "kind not registered" error).
_, _, err = f.List("alpha", "podmetrics", labels.Everything())
if err != nil {
t.Fatalf("expected optional+present kind to list cleanly, got %v", err)
}
}
func TestFactory_SubscribeReceivesEvents(t *testing.T) {
dyn, core := fakeClients() // empty initial state
cfg := Config{

View File

@ -60,6 +60,16 @@ type Kind struct {
// Secret. ConfigMap data is treated as PII-adjacent and also
// stripped (see redactObject).
Sensitive bool
// Optional — true when the GVR is provided by an add-on that may
// be absent from a given Sovereign (today: metrics.k8s.io served
// by the optional metrics-server). The factory probes discovery
// at AddCluster time and only spawns an informer for the cluster
// when the GVR is registered. Mandatory kinds (Optional=false)
// always get an informer; if the watch fails the informer retries
// — that path is reserved for kinds we know are part of any
// in-spec K8s distro (core/v1, apps/v1, networking.k8s.io/v1).
Optional bool
}
// DefaultKinds is the built-in registry — every Sovereign starts with
@ -81,14 +91,23 @@ var DefaultKinds = []Kind{
{Name: "configmap", GVR: schema.GroupVersionResource{Group: "", Version: "v1", Resource: "configmaps"}, Namespaced: true, Sensitive: true},
{Name: "secret", GVR: schema.GroupVersionResource{Group: "", Version: "v1", Resource: "secrets"}, Namespaced: true, Sensitive: true},
{Name: "persistentvolumeclaim", GVR: schema.GroupVersionResource{Group: "", Version: "v1", Resource: "persistentvolumeclaims"}, Namespaced: true},
// PV is cluster-scoped; needed by the architecture-graph PVC→Volume.hcloud
// bridge (PV.csi.volumeAttributes carries the hcloud volume id).
{Name: "persistentvolume", GVR: schema.GroupVersionResource{Group: "", Version: "v1", Resource: "persistentvolumes"}, Namespaced: false},
// Workloads (apps/v1).
{Name: "deployment", GVR: schema.GroupVersionResource{Group: "apps", Version: "v1", Resource: "deployments"}, Namespaced: true},
{Name: "statefulset", GVR: schema.GroupVersionResource{Group: "apps", Version: "v1", Resource: "statefulsets"}, Namespaced: true},
{Name: "daemonset", GVR: schema.GroupVersionResource{Group: "apps", Version: "v1", Resource: "daemonsets"}, Namespaced: true},
// ReplicaSet — intermediate ownerRef hop on the Deployment→Pod chain.
// The graph adapter chases this hop to attribute Pods to their Deployment.
{Name: "replicaset", GVR: schema.GroupVersionResource{Group: "apps", Version: "v1", Resource: "replicasets"}, Namespaced: true},
// Networking (networking.k8s.io/v1).
{Name: "ingress", GVR: schema.GroupVersionResource{Group: "networking.k8s.io", Version: "v1", Resource: "ingresses"}, Namespaced: true},
// EndpointSlice — exact Service→Pod membership without recomputing
// label-selector matches client-side for every Service-Pod pair.
{Name: "endpointslice", GVR: schema.GroupVersionResource{Group: "discovery.k8s.io", Version: "v1", Resource: "endpointslices"}, Namespaced: true},
// Crossplane managed resources — provider-hcloud's K8s projection
// of cloud-side objects (ADR-0001 §5: cloud + K8s data are
@ -100,6 +119,13 @@ var DefaultKinds = []Kind{
// vCluster.io tenants.
{Name: "vcluster", GVR: schema.GroupVersionResource{Group: "vcluster.com", Version: "v1alpha1", Resource: "vclusters"}, Namespaced: true},
// metrics-server projection. Optional — the kind is always in this
// registry, but an informer is only spawned on Sovereigns where
// metrics-server has installed the metrics.k8s.io APIService. The
// dashboard handler reads this
// indexer for color_by=utilization; when absent the handler
// returns percentage=null and the UI greys those cells.
{Name: "podmetrics", GVR: schema.GroupVersionResource{Group: "metrics.k8s.io", Version: "v1beta1", Resource: "pods"}, Namespaced: true, Optional: true},
}
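// The ReplicaSet entry above is what lets the graph adapter attribute a
// Pod to its Deployment. A minimal, illustrative sketch of that ownerRef
// hop, not part of this change; it assumes unstructured objects from the
// dynamic informer's indexer and a hypothetical "<ns>/<name>"-keyed map
// (import: k8s.io/apimachinery/pkg/apis/meta/v1/unstructured).
func owningDeployment(pod *unstructured.Unstructured, replicaSets map[string]*unstructured.Unstructured) (string, bool) {
	for _, ref := range pod.GetOwnerReferences() {
		if ref.Kind != "ReplicaSet" {
			continue
		}
		// ReplicaSets live in the Pod's namespace.
		rs, ok := replicaSets[pod.GetNamespace()+"/"+ref.Name]
		if !ok {
			continue
		}
		for _, rsRef := range rs.GetOwnerReferences() {
			if rsRef.Kind == "Deployment" {
				return rsRef.Name, true
			}
		}
	}
	return "", false
}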
// Registry is a runtime-mutable lookup keyed by the short Name. It

View File

@ -627,24 +627,24 @@ const marketplaceProductRoute = createRoute({
* Sovereigns by refactoring the canonical components to read deploymentId
* from a route-aware hook (useResolvedDeploymentId, already added).
*/
/**
* Sovereign Console layout mounted at the root path on Sovereign clusters
* so operator pages live at clean URLs (`/dashboard`, `/apps`, `/jobs`,
* `/cloud`, `/users`, `/settings`, `/parent-domains`, `/catalog`). On
* contabo the same component renders at `/sovereign/<page>` but the
* mothership wizard tracks per-deployment state at `/sovereign/provision/$id/*`
* (the transient URL pattern that's only meaningful while monitoring
* a specific provisioning run from the wizard shell).
*/
/**
* Pathless layout route inherits the parent URL (root) and only adds
* the SovereignConsoleLayout chrome. Children live at clean root paths.
*/
const consoleLayoutRoute = createRoute({
getParentRoute: () => rootRoute,
path: '/console',
id: '_sovereign_console',
component: SovereignConsoleLayout,
})
// /console/* mounts the SAME canonical components as /provision/$deploymentId/*
// — the components resolve deploymentId via useResolvedDeploymentId() which
// falls back from URL params to GET /api/v1/sovereign/self when running on
// a Sovereign cluster. Pixel-for-pixel identical UI, clean URLs.
const consoleIndexRoute = createRoute({
getParentRoute: () => consoleLayoutRoute,
path: '/',
beforeLoad: () => {
throw redirect({ to: '/console/dashboard' as never, replace: true })
},
component: () => null,
})
const consoleDashboardRoute = createRoute({
getParentRoute: () => consoleLayoutRoute,
path: '/dashboard',
@ -813,7 +813,6 @@ const routeTree = rootRoute.addChildren([
marketplaceFamilyRoute,
marketplaceProductRoute,
consoleLayoutRoute.addChildren([
consoleIndexRoute,
consoleDashboardRoute,
consoleAppsRoute,
consoleAppDetailRoute,

View File

@ -0,0 +1,87 @@
/**
* Mode-aware route path builder.
*
* On a Sovereign cluster (`console.<sov-fqdn>`), operator-facing pages
* live at clean root URLs:
*
* /dashboard
* /apps
* /apps/$componentId
* /jobs
* /jobs/$jobId
* /cloud
* /users
* /users/new
* /users/$name
* /settings
* /settings/marketplace
* /catalog
* /parent-domains
*
* On the contabo mothership wizard (`console.openova.io/sovereign/...`),
* the same pages live under the per-deployment transient prefix
* `/provision/$deploymentId/...` because the operator monitors many
* different deployments from one mothership.
*
* Internal `<Link to=...>` and `router.navigate({to:...})` calls MUST go
* through this helper so a single component renders with the correct
* URL on both surfaces. Per docs/INVIOLABLE-PRINCIPLES.md #4, never
* hardcode paths in callers.
*
* The helper takes a page name plus options (deploymentId and an
* optional sub-path) and returns the target path. On Sovereign the
* deploymentId is silently ignored; the URL stays clean.
*/
import { DETECTED_MODE } from './detectMode'
export type SovereignPage =
| '' // root → /dashboard on sovereign, /provision/$id on contabo
| 'dashboard'
| 'apps'
| 'jobs'
| 'cloud'
| 'users'
| 'settings'
| 'settings/marketplace'
| 'notifications'
| 'parent-domains'
| 'catalog'
interface PathOptions {
/** Required on contabo (mothership). Ignored on Sovereign. */
deploymentId?: string
/** Optional sub-path appended after the page (e.g. a job id). */
sub?: string
}
/**
* Return the route path for an operator-facing page. Mode-aware:
*
* sovereignPath('dashboard') → '/dashboard' (sovereign)
* sovereignPath('dashboard', { deploymentId: 'abc123' }) → '/provision/abc123/dashboard' (contabo)
* sovereignPath('jobs', { deploymentId, sub: 'jid42' }) → '/jobs/jid42' or '/provision/$id/jobs/jid42'
* sovereignPath('apps', { deploymentId, sub: '$cid' }) → '/apps/$cid' or '/provision/$id/app/$cid'
* sovereignPath('') → '/dashboard' on sovereign, '/provision/$id' on contabo
*/
export function sovereignPath(page: SovereignPage, opts: PathOptions = {}): string {
const { deploymentId = '', sub } = opts
const isSovereign = DETECTED_MODE.mode === 'sovereign'
// Special-case: contabo's /provision/$id root view is the apps page; on
// sovereign the root view redirects to /dashboard.
if (page === '') {
return isSovereign ? '/dashboard' : `/provision/${deploymentId}`
}
// Apps detail: contabo mounts it at /provision/$id/app/$componentId;
// on sovereign it's /apps/$componentId.
if (page === 'apps' && sub) {
return isSovereign ? `/apps/${sub}` : `/provision/${deploymentId}/app/${sub}`
}
if (isSovereign) {
return sub ? `/${page}/${sub}` : `/${page}`
}
return sub ? `/provision/${deploymentId}/${page}/${sub}` : `/provision/${deploymentId}/${page}`
}
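// Illustrative usage from a consuming component (separate file, not part
// of this change). Component names and the helper's import path are
// assumptions; the `as never` cast mirrors how this route tree already
// hands dynamic strings to the typed router.
import { Link, useNavigate } from '@tanstack/react-router'
import { sovereignPath } from './sovereignPath'

export function JobRowLink({ deploymentId, jobId }: { deploymentId?: string; jobId: string }) {
  // /jobs/<jobId> on a Sovereign child, /provision/<id>/jobs/<jobId> on contabo.
  return <Link to={sovereignPath('jobs', { deploymentId, sub: jobId }) as never}>{jobId}</Link>
}

export function useGoToDashboard(deploymentId?: string) {
  const navigate = useNavigate()
  return () => navigate({ to: sovereignPath('dashboard', { deploymentId }) as never })
}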

View File

@ -11,11 +11,16 @@
import {
IconArrowsSplit,
IconBox,
IconBoxMultiple,
IconBucketDroplet,
IconCircleDot,
IconCloud,
IconCpu,
IconDatabase,
IconDisc,
IconFileText,
IconFolderOpen,
IconLayersDifference,
IconMapPin,
IconNetwork,
IconRouteAltLeft,
@ -40,4 +45,11 @@ export const NODE_ICON: Record<ArchNodeType, Icon> = {
Volume: IconDisc,
Service: IconWorld,
Ingress: IconRouteAltLeft,
// K8s-side
Namespace: IconFolderOpen,
Pod: IconCircleDot,
Deployment: IconBoxMultiple,
StatefulSet: IconLayersDifference,
DaemonSet: IconLayersDifference,
ConfigMap: IconFileText,
}

View File

@ -40,6 +40,17 @@ export type ArchNodeType =
| 'Volume'
| 'Service'
| 'Ingress'
// K8s-side projection (issue #975) — surfaced from the per-Sovereign
// k8scache.Factory's Indexer via /api/v1/sovereigns/{id}/k8s/stream.
// Every K8s node carries a `Pod:<ns>/<name>` style composite id; the
// bridge with the cloud-side `WorkerNode` is a name-or-IP match
// collapsed at adapter merge time.
| 'Namespace'
| 'Pod'
| 'Deployment'
| 'StatefulSet'
| 'DaemonSet'
| 'ConfigMap'
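// Illustrative only, not part of this change: the composite node-id shape
// the comment above describes. The helper name is hypothetical; the
// adapter may build ids differently.
const exampleK8sNodeId = (type: ArchNodeType, ns: string, name: string): string =>
  `${type}:${ns}/${name}` // e.g. 'Pod:default/web-7f9c'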
/**
* Canonical ordered list of every type the data layer + chip strip
@ -61,8 +72,26 @@ export const ALL_NODE_TYPES: ArchNodeType[] = [
'Volume',
'Service',
'Ingress',
// K8s-side
'Namespace',
'Pod',
'Deployment',
'StatefulSet',
'DaemonSet',
'ConfigMap',
]
/**
* Default-off node types: high-cardinality kinds whose chips start
* unchecked. Operators can enable them at any time. Today: Pod and
* ConfigMap, which can balloon past 200+ nodes on a healthy Sovereign
* and crowd the canvas before any signal emerges.
*/
export const DEFAULT_INACTIVE_TYPES: ReadonlySet<ArchNodeType> = new Set([
'Pod',
'ConfigMap',
])
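// One way the chip strip might seed its checked set from these defaults
// (illustrative sketch, not part of this change; the real component may
// wire its state differently):
const initialVisibleTypes = new Set<ArchNodeType>(
  ALL_NODE_TYPES.filter((t) => !DEFAULT_INACTIVE_TYPES.has(t)),
)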
/**
* Edge relationship types. Containment is just one of these; the
* founder spec, verbatim: "forget about the containment, just show it
@ -206,6 +235,13 @@ export const NODE_FILL: Record<ArchNodeType, string> = {
Volume: '#22b8cf', // cyan — block storage
Service: '#20c997', // mint — k8s service
Ingress: '#e64980', // pink — k8s ingress
// K8s-side projection
Namespace: '#495057', // dark grey — logical grouping
Pod: '#74c0fc', // light blue — leaf workload
Deployment: '#4dabf7', // sky blue — workload owner
StatefulSet: '#6741d9', // deep violet — stateful workload owner
DaemonSet: '#9c36b5', // magenta — per-node workload owner
ConfigMap: '#adb5bd', // light grey — config payload
}
export const EDGE_STROKE: Record<ArchEdgeType, string> = {