feat(catalyst-api): infrastructure CRUD via Crossplane XRC + unified topology endpoint (#239)
Refactors the Sovereign Infrastructure surface so every Day-2 mutation
flows through a Crossplane Composite Resource Claim (XRC) that the
catalyst-api writes against the SOVEREIGN cluster's kubeconfig — never
bespoke hcloud-go calls, never `exec.Command("kubectl", ...)`, never
direct client-go mutation. Per docs/INVIOLABLE-PRINCIPLES.md #3,
Crossplane is the ONLY Day-2 IaC seam.
When the third-sibling chart's Composition for a given XRC kind isn't
present yet, Crossplane stores the claim and leaves it Pending; the
catalyst-api emits a Job log line "Awaiting Crossplane Composition
for <kind>" so an operator browsing /jobs sees the gap. Each mutation
also commits a Job + Execution + LogLines via the existing audit-trail
Bridge, so the table view shows every Day-2 action.
Endpoints + XRC kinds (all Composition targets owned by the
third-sibling agent):

  GET    .../infrastructure/topology                    unified hierarchical read
  POST   .../infrastructure/regions                     RegionClaim        region-composition
  POST   .../infrastructure/regions/{id}/clusters       ClusterClaim       cluster-composition
  POST   .../infrastructure/clusters/{id}/vclusters     VClusterClaim      vcluster-composition
  POST   .../infrastructure/clusters/{id}/pools         NodePoolClaim      nodepool-composition
  PATCH  .../infrastructure/pools/{id}                  NodePoolClaim      nodepool-composition
  POST   .../infrastructure/loadbalancers               LoadBalancerClaim  lb-composition
  POST   .../infrastructure/peerings                    PeeringClaim       peering-composition
  POST   .../infrastructure/firewalls/{id}/rules        FirewallRuleClaim  firewall-composition
  POST   .../infrastructure/nodes/{id}/{cordon|drain|replace}
                                                        NodeActionClaim    node-action-composition
  DELETE .../infrastructure/{kind}/{id}                 <kind>'s claim     <kind>-composition
Response shape per write: 202 Accepted with
  { jobId, xrcKind, xrcName, status: "submitted-pending-composition",
    submittedAt, cascade?: [...] }
DELETE additionally returns a cascade preview (computed from the live
topology) so the FE confirm dialog can render "deleting region X
will drain Y workloads, remove Z PVCs".
The unified topology endpoint emits TopologyResponse: cloud[*] +
topology.regions[*].clusters[*].(vclusters|nodePools|nodes|
loadBalancers) + storage.(pvcs|buckets|volumes). The four FE tabs
(Topology, Compute, Storage, Network) all derive their views from this
single response. Live-source fields fall back to empty arrays — never
placeholder data, per the founder's "no synthetic rows" rule. The
legacy flat /compute, /storage, /network endpoints stay wired with
their pre-existing shapes until the FE migrates.
New files:
- internal/infrastructure/types.go            wire types
- internal/infrastructure/xrc.go              Crossplane writer + DNS-1123 namer
- internal/infrastructure/topology_loader.go  composes from tofu outputs
  + informer cache + Crossplane MR list
- internal/jobs/mutation_bridge.go            RegisterMutationJob /
  AppendXRCSubmittedLog / FinishMutationJob — every mutation lands in
  batch "day-2-mutations"
Tests (`go test -race`):
- infrastructure_test.go       unified TopologyResponse shape + empty
  fallback for storage/peerings + legacy compute/storage/network
  endpoints unchanged
- infrastructure_crud_test.go  per-endpoint 202 happy path, 404 unknown
  deployment, 409 XRC name conflict, 503 on missing kubeconfig, DELETE
  cascade preview, audit Job materialised in jobs.Store
CORS now allows PATCH + DELETE so the FE wire calls succeed.
Co-authored-by: hatiyildiz <hatice.yildiz@openova.io>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
parent 6cbfbd18ef
commit 948bf61b91
@@ -32,7 +32,7 @@ func main() {
 		// from the new VM, not a browser), but enabling PUT here
 		// keeps the policy consistent for any future browser-side
 		// resume flow that re-uses the same endpoint.
-		AllowedMethods: []string{"GET", "POST", "PUT", "OPTIONS"},
+		AllowedMethods: []string{"GET", "POST", "PUT", "PATCH", "DELETE", "OPTIONS"},
 		AllowedHeaders: []string{"Accept", "Content-Type", "Authorization"},
 		MaxAge:         300,
 	}))
@@ -97,16 +97,34 @@ func main() {
 	// V1 emits a static placeholder shape — see dashboard.go header
 	// for the metrics-server upgrade plan.
 	r.Get("/api/v1/dashboard/treemap", h.GetDashboardTreemap)
-	// Sovereign Infrastructure surface (issue #227) — Topology canvas
-	// + Compute / Storage / Network card grids. Each endpoint reads
-	// from the deployment record + (future) live cluster kubeconfig;
-	// see internal/handler/infrastructure.go for the data-source
-	// contract.
+	// Sovereign Infrastructure surface — unified topology read +
+	// Day-2 CRUD via Crossplane XRC writes (issue #227 + Day-2 IaC).
+	// Read endpoints compose from the deployment record + live
+	// cluster informer cache; mutation endpoints write Composite
+	// Resource Claims to the Sovereign cluster's kubeconfig per
+	// docs/INVIOLABLE-PRINCIPLES.md #3 (Crossplane is the ONLY
+	// Day-2 IaC seam). Every mutation also commits a Job entry to
+	// the existing /jobs surface for full audit-trail.
+	r.Get("/api/v1/deployments/{depId}/infrastructure/topology", h.GetInfrastructureTopology)
 	r.Get("/api/v1/deployments/{depId}/infrastructure/compute", h.GetInfrastructureCompute)
 	r.Get("/api/v1/deployments/{depId}/infrastructure/storage", h.GetInfrastructureStorage)
 	r.Get("/api/v1/deployments/{depId}/infrastructure/network", h.GetInfrastructureNetwork)
 
+	// CRUD — every endpoint writes a Crossplane XRC + a mutation Job.
+	// The third-sibling chart authors the matching Compositions; until
+	// they land Crossplane sits the claim Pending and the catalyst-api
+	// surfaces "Awaiting Composition for <kind>" in the audit log.
+	r.Post("/api/v1/deployments/{depId}/infrastructure/regions", h.CreateInfrastructureRegion)
+	r.Post("/api/v1/deployments/{depId}/infrastructure/regions/{id}/clusters", h.CreateInfrastructureCluster)
+	r.Post("/api/v1/deployments/{depId}/infrastructure/clusters/{id}/vclusters", h.CreateInfrastructureVCluster)
+	r.Post("/api/v1/deployments/{depId}/infrastructure/clusters/{id}/pools", h.CreateInfrastructurePool)
+	r.Patch("/api/v1/deployments/{depId}/infrastructure/pools/{id}", h.PatchInfrastructurePool)
+	r.Post("/api/v1/deployments/{depId}/infrastructure/loadbalancers", h.CreateInfrastructureLoadBalancer)
+	r.Post("/api/v1/deployments/{depId}/infrastructure/peerings", h.CreateInfrastructurePeering)
+	r.Post("/api/v1/deployments/{depId}/infrastructure/firewalls/{id}/rules", h.CreateInfrastructureFirewallRule)
+	r.Post("/api/v1/deployments/{depId}/infrastructure/nodes/{id}/{action}", h.CreateInfrastructureNodeAction)
+	r.Delete("/api/v1/deployments/{depId}/infrastructure/{kind}/{id}", h.DeleteInfrastructureResource)
 
 	log.Info("catalyst api listening", "port", port)
 	if err := http.ListenAndServe(":"+port, r); err != nil {
 		log.Error("server error", "err", err)
File diff suppressed because it is too large
@@ -0,0 +1,592 @@
// infrastructure_crud_test.go — coverage for the Day-2 mutation
// endpoints (POST/PATCH/DELETE) that write Crossplane XRCs against
// the Sovereign cluster's dynamic client.
//
// The fake dynamic client is seeded with the right list-kinds for
// each XRC kind so the catalyst-api's create call returns a typed
// success without hitting a real apiserver. Tests assert:
//
//   - 202 happy path: response carries jobId + xrcKind + xrcName + status
//   - 404 unknown deployment
//   - 409 conflict when same XRC name already exists
//   - 503 when the sovereign cluster is unreachable (no kubeconfig)
//   - DELETE returns the cascade preview
package handler

import (
	"bytes"
	"context"
	"encoding/json"
	"net/http"
	"net/http/httptest"
	"os"
	"path/filepath"
	"strings"
	"sync"
	"testing"

	"github.com/go-chi/chi/v5"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/apis/meta/v1/unstructured"
	"k8s.io/apimachinery/pkg/runtime"
	"k8s.io/apimachinery/pkg/runtime/schema"
	"k8s.io/client-go/dynamic"
	dynamicfake "k8s.io/client-go/dynamic/fake"

	"github.com/openova-io/openova/products/catalyst/bootstrap/api/internal/infrastructure"
	"github.com/openova-io/openova/products/catalyst/bootstrap/api/internal/jobs"
	"github.com/openova-io/openova/products/catalyst/bootstrap/api/internal/provisioner"
)

// xrcListKinds — every Composite Resource Claim kind the CRUD
// handlers can write. Mirrors the Kind* constants in
// internal/infrastructure/xrc.go. Tests register the matching
// list-kind names so the fake dynamic client's List+Create paths
// behave correctly.
func xrcListKinds() map[schema.GroupVersionResource]string {
	mk := func(plural string) schema.GroupVersionResource {
		return schema.GroupVersionResource{
			Group:    infrastructure.XRCAPIGroup,
			Version:  infrastructure.XRCAPIVersion,
			Resource: plural,
		}
	}
	out := map[schema.GroupVersionResource]string{
		mk("regionclaims"):       "RegionClaimList",
		mk("clusterclaims"):      "ClusterClaimList",
		mk("vclusterclaims"):     "VClusterClaimList",
		mk("nodepoolclaims"):     "NodePoolClaimList",
		mk("loadbalancerclaims"): "LoadBalancerClaimList",
		mk("peeringclaims"):      "PeeringClaimList",
		mk("firewallruleclaims"): "FirewallRuleClaimList",
		mk("nodeactionclaims"):   "NodeActionClaimList",
	}
	// The DELETE handler calls infrastructure.Load to compute the
	// cascade preview, which queries vcluster.io/v1alpha1/vclusters
	// and core/v1/persistentvolumeclaims. Register those kinds with
	// the fake client so List doesn't panic on "unregistered list
	// kind". Production hits a real apiserver that either has the
	// CRD or returns 404 — both code paths return gracefully.
	out[schema.GroupVersionResource{Group: "vcluster.io", Version: "v1alpha1", Resource: "vclusters"}] = "VClusterList"
	out[schema.GroupVersionResource{Group: "", Version: "v1", Resource: "persistentvolumeclaims"}] = "PersistentVolumeClaimList"
	return out
}

// fakeXRCDynamicFactory — closure factory the handler reads via
// h.dynamicFactory. Returns a single fake client seeded with the
// xrcListKinds map; tests can append additional unstructured
// objects to simulate pre-existing claims (for the 409 conflict
// path).
func fakeXRCDynamicFactory(seed ...runtime.Object) func(string) (dynamic.Interface, error) {
	scheme := runtime.NewScheme()
	client := dynamicfake.NewSimpleDynamicClientWithCustomListKinds(scheme, xrcListKinds(), seed...)
	return func(_ string) (dynamic.Interface, error) {
		return client, nil
	}
}

// installCRUDDeployment — like installInfraDeployment but with a
// Result.KubeconfigPath pointing at a temp file so
// sovereignDynamicClient resolves to the injected fake. Each test
// gets its own deployment id so concurrent tests don't share a
// fake apiserver.
func installCRUDDeployment(t *testing.T, h *Handler, id string) *Deployment {
	t.Helper()
	path := filepath.Join(t.TempDir(), id+".yaml")
	// Kubeconfig contents are ignored by the fake factory, but the
	// file must exist + be readable. Per
	// docs/INVIOLABLE-PRINCIPLES.md #10 the fake content carries no
	// real credentials.
	if err := os.WriteFile(path, []byte("apiVersion: v1\nkind: Config"), 0o600); err != nil {
		t.Fatalf("write kubeconfig: %v", err)
	}
	dep := &Deployment{
		ID:     id,
		Status: "ready",
		Request: provisioner.Request{
			SovereignFQDN:    "omantel.omani.works",
			Region:           "fsn1",
			ControlPlaneSize: "cpx21",
			WorkerSize:       "cpx41",
			WorkerCount:      2,
			HetznerProjectID: "test-project",
		},
		Result: &provisioner.Result{
			SovereignFQDN:  "omantel.omani.works",
			ControlPlaneIP: "5.6.7.8",
			LoadBalancerIP: "203.0.113.10",
			KubeconfigPath: path,
		},
		mu: sync.Mutex{},
	}
	h.deployments.Store(id, dep)
	return dep
}

// callCRUDInfra fires a request through a freshly-built chi router
// that knows the depId + nested path params. Returns the recorder.
func callCRUDInfra(t *testing.T, h *Handler, method, suffix string, depID string, body any, register func(r chi.Router, h *Handler)) *httptest.ResponseRecorder {
	t.Helper()
	r := chi.NewRouter()
	register(r, h)
	var buf *bytes.Buffer
	if body != nil {
		raw, err := json.Marshal(body)
		if err != nil {
			t.Fatalf("marshal: %v", err)
		}
		buf = bytes.NewBuffer(raw)
	} else {
		buf = bytes.NewBuffer(nil)
	}
	req := httptest.NewRequest(method, "/api/v1/deployments/"+depID+"/infrastructure/"+suffix, buf)
	req.Header.Set("Content-Type", "application/json")
	rec := httptest.NewRecorder()
	r.ServeHTTP(rec, req)
	return rec
}

func mustDecodeMutation(t *testing.T, rec *httptest.ResponseRecorder) infrastructure.MutationResponse {
	t.Helper()
	var out infrastructure.MutationResponse
	if err := json.Unmarshal(rec.Body.Bytes(), &out); err != nil {
		t.Fatalf("decode mutation: %v body=%s", err, rec.Body.String())
	}
	return out
}

/* ── POST /infrastructure/regions ────────────────────────────── */

func TestCreateRegion_Happy(t *testing.T) {
	h := NewWithPDM(silentLogger(), &fakePDM{})
	h.dynamicFactory = fakeXRCDynamicFactory()
	dep := installCRUDDeployment(t, h, "dep-region-happy")

	body := map[string]any{
		"region":      "hel1",
		"skuCP":       "cpx21",
		"skuWorker":   "cpx41",
		"workerCount": 2,
	}
	rec := callCRUDInfra(t, h, http.MethodPost, "regions", dep.ID, body, func(r chi.Router, h *Handler) {
		r.Post("/api/v1/deployments/{depId}/infrastructure/regions", h.CreateInfrastructureRegion)
	})
	if rec.Code != http.StatusAccepted {
		t.Fatalf("status: got %d want 202; body=%s", rec.Code, rec.Body.String())
	}
	out := mustDecodeMutation(t, rec)
	if out.XRCKind != infrastructure.KindRegionClaim {
		t.Fatalf("xrcKind: got %q want %q", out.XRCKind, infrastructure.KindRegionClaim)
	}
	if !strings.Contains(out.XRCName, "region-hel1") {
		t.Fatalf("xrcName must contain 'region-hel1': got %q", out.XRCName)
	}
	if out.JobID == "" {
		t.Fatalf("jobId must be set")
	}
	if out.Status != "submitted-pending-composition" {
		t.Fatalf("status: got %q want submitted-pending-composition", out.Status)
	}
}

func TestCreateRegion_NotFound(t *testing.T) {
	h := NewWithPDM(silentLogger(), &fakePDM{})
	h.dynamicFactory = fakeXRCDynamicFactory()
	rec := callCRUDInfra(t, h, http.MethodPost, "regions", "ghost",
		map[string]any{"region": "hel1", "skuCP": "cpx21"},
		func(r chi.Router, h *Handler) {
			r.Post("/api/v1/deployments/{depId}/infrastructure/regions", h.CreateInfrastructureRegion)
		})
	if rec.Code != http.StatusNotFound {
		t.Fatalf("status: got %d want 404; body=%s", rec.Code, rec.Body.String())
	}
}

func TestCreateRegion_503WhenKubeconfigMissing(t *testing.T) {
	h := NewWithPDM(silentLogger(), &fakePDM{})
	h.dynamicFactory = fakeXRCDynamicFactory()
	// Build a deployment WITHOUT a kubeconfig path so the
	// sovereignDynamicClient short-circuits with 503.
	dep := &Deployment{
		ID:     "dep-no-kubeconfig",
		Status: "ready",
		Request: provisioner.Request{
			SovereignFQDN:    "x.example",
			Region:           "fsn1",
			ControlPlaneSize: "cpx21",
		},
		Result: &provisioner.Result{
			// Intentionally empty KubeconfigPath
		},
		mu: sync.Mutex{},
	}
	h.deployments.Store(dep.ID, dep)

	rec := callCRUDInfra(t, h, http.MethodPost, "regions", dep.ID,
		map[string]any{"region": "hel1", "skuCP": "cpx21"},
		func(r chi.Router, h *Handler) {
			r.Post("/api/v1/deployments/{depId}/infrastructure/regions", h.CreateInfrastructureRegion)
		})
	if rec.Code != http.StatusServiceUnavailable {
		t.Fatalf("status: got %d want 503; body=%s", rec.Code, rec.Body.String())
	}
	if !strings.Contains(rec.Body.String(), "sovereign-cluster-unreachable") {
		t.Fatalf("expected sovereign-cluster-unreachable body; got %s", rec.Body.String())
	}
}

func TestCreateRegion_409Conflict(t *testing.T) {
	h := NewWithPDM(silentLogger(), &fakePDM{})

	// Pre-seed the fake dynamic client with the EXACT same XRC the
	// handler will compute from (depID, "region", "hel1"). The
	// second create() then returns AlreadyExists which the helper
	// surfaces as ErrXRCNameConflict → HTTP 409.
	depID := "dep-region-conflict"
	wantName := infrastructure.XRCName(depID, "region", "hel1")
	existing := newUnstructuredXRC(infrastructure.KindRegionClaim, wantName)
	h.dynamicFactory = fakeXRCDynamicFactory(existing)

	dep := installCRUDDeployment(t, h, depID)

	rec := callCRUDInfra(t, h, http.MethodPost, "regions", dep.ID,
		map[string]any{"region": "hel1", "skuCP": "cpx21"},
		func(r chi.Router, h *Handler) {
			r.Post("/api/v1/deployments/{depId}/infrastructure/regions", h.CreateInfrastructureRegion)
		})
	if rec.Code != http.StatusConflict {
		t.Fatalf("status: got %d want 409; body=%s", rec.Code, rec.Body.String())
	}
	if !strings.Contains(rec.Body.String(), "xrc-name-conflict") {
		t.Fatalf("expected xrc-name-conflict; got %s", rec.Body.String())
	}
}

/* ── POST /infrastructure/regions/{id}/clusters ──────────────── */

func TestCreateCluster_Happy(t *testing.T) {
	h := NewWithPDM(silentLogger(), &fakePDM{})
	h.dynamicFactory = fakeXRCDynamicFactory()
	dep := installCRUDDeployment(t, h, "dep-cluster-happy")

	rec := callCRUDInfra(t, h, http.MethodPost, "regions/region-fsn1/clusters", dep.ID,
		map[string]any{"name": "edge-1", "version": "v1.30", "ha": false},
		func(r chi.Router, h *Handler) {
			r.Post("/api/v1/deployments/{depId}/infrastructure/regions/{id}/clusters", h.CreateInfrastructureCluster)
		})
	if rec.Code != http.StatusAccepted {
		t.Fatalf("status: got %d want 202; body=%s", rec.Code, rec.Body.String())
	}
	out := mustDecodeMutation(t, rec)
	if out.XRCKind != infrastructure.KindClusterClaim {
		t.Fatalf("xrcKind: got %q want %q", out.XRCKind, infrastructure.KindClusterClaim)
	}
}

/* ── POST /infrastructure/clusters/{id}/vclusters ────────────── */

func TestCreateVCluster_Happy(t *testing.T) {
	h := NewWithPDM(silentLogger(), &fakePDM{})
	h.dynamicFactory = fakeXRCDynamicFactory()
	dep := installCRUDDeployment(t, h, "dep-vcluster-happy")

	rec := callCRUDInfra(t, h, http.MethodPost, "clusters/cluster-x/vclusters", dep.ID,
		map[string]any{"name": "dmz", "namespace": "dmz", "role": "dmz"},
		func(r chi.Router, h *Handler) {
			r.Post("/api/v1/deployments/{depId}/infrastructure/clusters/{id}/vclusters", h.CreateInfrastructureVCluster)
		})
	if rec.Code != http.StatusAccepted {
		t.Fatalf("status: got %d want 202; body=%s", rec.Code, rec.Body.String())
	}
	out := mustDecodeMutation(t, rec)
	if out.XRCKind != infrastructure.KindVClusterClaim {
		t.Fatalf("xrcKind: got %q want %q", out.XRCKind, infrastructure.KindVClusterClaim)
	}
}

/* ── POST /infrastructure/clusters/{id}/pools ────────────────── */

func TestCreatePool_Happy(t *testing.T) {
	h := NewWithPDM(silentLogger(), &fakePDM{})
	h.dynamicFactory = fakeXRCDynamicFactory()
	dep := installCRUDDeployment(t, h, "dep-pool-happy")

	rec := callCRUDInfra(t, h, http.MethodPost, "clusters/cluster-x/pools", dep.ID,
		map[string]any{"name": "gpu-1", "role": "worker", "sku": "cpx51", "region": "fsn1", "desiredSize": 3},
		func(r chi.Router, h *Handler) {
			r.Post("/api/v1/deployments/{depId}/infrastructure/clusters/{id}/pools", h.CreateInfrastructurePool)
		})
	if rec.Code != http.StatusAccepted {
		t.Fatalf("status: got %d want 202; body=%s", rec.Code, rec.Body.String())
	}
	out := mustDecodeMutation(t, rec)
	if out.XRCKind != infrastructure.KindNodePoolClaim {
		t.Fatalf("xrcKind: got %q want %q", out.XRCKind, infrastructure.KindNodePoolClaim)
	}
}

/* ── PATCH /infrastructure/pools/{id} ─────────────────────────── */

func TestPatchPool_Happy(t *testing.T) {
	h := NewWithPDM(silentLogger(), &fakePDM{})
	h.dynamicFactory = fakeXRCDynamicFactory()
	dep := installCRUDDeployment(t, h, "dep-pool-patch")

	size := 5
	rec := callCRUDInfra(t, h, http.MethodPatch, "pools/gpu-1", dep.ID,
		map[string]any{"desiredSize": size},
		func(r chi.Router, h *Handler) {
			r.Patch("/api/v1/deployments/{depId}/infrastructure/pools/{id}", h.PatchInfrastructurePool)
		})
	if rec.Code != http.StatusAccepted {
		t.Fatalf("status: got %d want 202; body=%s", rec.Code, rec.Body.String())
	}
	out := mustDecodeMutation(t, rec)
	if out.XRCKind != infrastructure.KindNodePoolClaim {
		t.Fatalf("xrcKind: got %q want %q", out.XRCKind, infrastructure.KindNodePoolClaim)
	}
}

func TestPatchPool_ConflictBecomes202(t *testing.T) {
	// PATCH semantics: an existing claim with the same name is the
	// expected case (PATCH targets convergence). Handler must NOT
	// surface 409 for the patch path — it's 202.
	h := NewWithPDM(silentLogger(), &fakePDM{})
	depID := "dep-pool-patch-conflict"
	xrcName := infrastructure.XRCName(depID, "pool", "gpu-1")
	existing := newUnstructuredXRC(infrastructure.KindNodePoolClaim, xrcName)
	h.dynamicFactory = fakeXRCDynamicFactory(existing)
	dep := installCRUDDeployment(t, h, depID)

	size := 5
	rec := callCRUDInfra(t, h, http.MethodPatch, "pools/gpu-1", dep.ID,
		map[string]any{"desiredSize": size},
		func(r chi.Router, h *Handler) {
			r.Patch("/api/v1/deployments/{depId}/infrastructure/pools/{id}", h.PatchInfrastructurePool)
		})
	if rec.Code != http.StatusAccepted {
		t.Fatalf("status: got %d want 202; body=%s", rec.Code, rec.Body.String())
	}
}

/* ── POST /infrastructure/loadbalancers ──────────────────────── */

func TestCreateLB_Happy(t *testing.T) {
	h := NewWithPDM(silentLogger(), &fakePDM{})
	h.dynamicFactory = fakeXRCDynamicFactory()
	dep := installCRUDDeployment(t, h, "dep-lb-happy")

	rec := callCRUDInfra(t, h, http.MethodPost, "loadbalancers", dep.ID,
		map[string]any{"name": "edge-lb", "region": "hel1", "ports": "443"},
		func(r chi.Router, h *Handler) {
			r.Post("/api/v1/deployments/{depId}/infrastructure/loadbalancers", h.CreateInfrastructureLoadBalancer)
		})
	if rec.Code != http.StatusAccepted {
		t.Fatalf("status: got %d want 202; body=%s", rec.Code, rec.Body.String())
	}
}

/* ── POST /infrastructure/peerings ────────────────────────────── */

func TestCreatePeering_Happy(t *testing.T) {
	h := NewWithPDM(silentLogger(), &fakePDM{})
	h.dynamicFactory = fakeXRCDynamicFactory()
	dep := installCRUDDeployment(t, h, "dep-peering-happy")

	rec := callCRUDInfra(t, h, http.MethodPost, "peerings", dep.ID,
		map[string]any{"name": "fsn1-hel1", "vpcFrom": "vpc-fsn1", "vpcTo": "vpc-hel1", "subnets": "10.0.0.0/16<>10.1.0.0/16"},
		func(r chi.Router, h *Handler) {
			r.Post("/api/v1/deployments/{depId}/infrastructure/peerings", h.CreateInfrastructurePeering)
		})
	if rec.Code != http.StatusAccepted {
		t.Fatalf("status: got %d want 202; body=%s", rec.Code, rec.Body.String())
	}
}

/* ── POST /infrastructure/firewalls/{id}/rules ───────────────── */

func TestCreateFirewallRule_Happy(t *testing.T) {
	h := NewWithPDM(silentLogger(), &fakePDM{})
	h.dynamicFactory = fakeXRCDynamicFactory()
	dep := installCRUDDeployment(t, h, "dep-fw-happy")

	rec := callCRUDInfra(t, h, http.MethodPost, "firewalls/fw-1/rules", dep.ID,
		map[string]any{"direction": "in", "protocol": "tcp", "port": "443", "sources": "0.0.0.0/0", "action": "accept"},
		func(r chi.Router, h *Handler) {
			r.Post("/api/v1/deployments/{depId}/infrastructure/firewalls/{id}/rules", h.CreateInfrastructureFirewallRule)
		})
	if rec.Code != http.StatusAccepted {
		t.Fatalf("status: got %d want 202; body=%s", rec.Code, rec.Body.String())
	}
}

/* ── POST /infrastructure/nodes/{id}/{action} ─────────────────── */

func TestCreateNodeAction_HappyDrain(t *testing.T) {
	h := NewWithPDM(silentLogger(), &fakePDM{})
	h.dynamicFactory = fakeXRCDynamicFactory()
	dep := installCRUDDeployment(t, h, "dep-node-drain")

	rec := callCRUDInfra(t, h, http.MethodPost, "nodes/node-w-0/drain", dep.ID, nil,
		func(r chi.Router, h *Handler) {
			r.Post("/api/v1/deployments/{depId}/infrastructure/nodes/{id}/{action}", h.CreateInfrastructureNodeAction)
		})
	if rec.Code != http.StatusAccepted {
		t.Fatalf("status: got %d want 202; body=%s", rec.Code, rec.Body.String())
	}
	out := mustDecodeMutation(t, rec)
	if out.XRCKind != infrastructure.KindNodeActionClaim {
		t.Fatalf("xrcKind: got %q want %q", out.XRCKind, infrastructure.KindNodeActionClaim)
	}
}

func TestCreateNodeAction_BadAction(t *testing.T) {
	h := NewWithPDM(silentLogger(), &fakePDM{})
	h.dynamicFactory = fakeXRCDynamicFactory()
	dep := installCRUDDeployment(t, h, "dep-node-bad")

	rec := callCRUDInfra(t, h, http.MethodPost, "nodes/node-w-0/yeet", dep.ID, nil,
		func(r chi.Router, h *Handler) {
			r.Post("/api/v1/deployments/{depId}/infrastructure/nodes/{id}/{action}", h.CreateInfrastructureNodeAction)
		})
	if rec.Code != http.StatusBadRequest {
		t.Fatalf("status: got %d want 400", rec.Code)
	}
}

/* ── DELETE /infrastructure/{kind}/{id} ───────────────────────── */

func TestDeleteResource_HappyWithCascade(t *testing.T) {
	h := NewWithPDM(silentLogger(), &fakePDM{})

	// Pre-seed an XRC the DELETE call can find. The CascadeFor
	// helper composes the cascade rows from the LIVE topology
	// (which today doesn't include the XRC's children — it pulls
	// from the deployment record), so the cascade always emits at
	// least one descriptor row.
	depID := "dep-region-delete"
	xrcName := infrastructure.XRCName(depID, "region", "fsn1")
	existing := newUnstructuredXRC(infrastructure.KindRegionClaim, xrcName)
	h.dynamicFactory = fakeXRCDynamicFactory(existing)

	dep := installCRUDDeployment(t, h, depID)

	rec := callCRUDInfra(t, h, http.MethodDelete, "regions/fsn1", dep.ID, nil,
		func(r chi.Router, h *Handler) {
			r.Delete("/api/v1/deployments/{depId}/infrastructure/{kind}/{id}", h.DeleteInfrastructureResource)
		})
	if rec.Code != http.StatusAccepted {
		t.Fatalf("status: got %d want 202; body=%s", rec.Code, rec.Body.String())
	}
	out := mustDecodeMutation(t, rec)
	if out.XRCKind != infrastructure.KindRegionClaim {
		t.Fatalf("xrcKind: got %q want %q", out.XRCKind, infrastructure.KindRegionClaim)
	}
	if len(out.Cascade) == 0 {
		t.Fatalf("expected non-empty cascade preview; got %v", out.Cascade)
	}
}

func TestDeleteResource_UnknownKind(t *testing.T) {
	h := NewWithPDM(silentLogger(), &fakePDM{})
	h.dynamicFactory = fakeXRCDynamicFactory()
	dep := installCRUDDeployment(t, h, "dep-bad-kind")

	rec := callCRUDInfra(t, h, http.MethodDelete, "widgets/foo", dep.ID, nil,
		func(r chi.Router, h *Handler) {
			r.Delete("/api/v1/deployments/{depId}/infrastructure/{kind}/{id}", h.DeleteInfrastructureResource)
		})
	if rec.Code != http.StatusBadRequest {
		t.Fatalf("status: got %d want 400; body=%s", rec.Code, rec.Body.String())
	}
}

func TestDeleteResource_AlreadyAbsent(t *testing.T) {
	// Delete returning NotFound is treated as "already gone" — the
	// audit Job is still committed and the response is 202 with
	// status="already-absent" so the FE can re-render.
	h := NewWithPDM(silentLogger(), &fakePDM{})
	h.dynamicFactory = fakeXRCDynamicFactory()
	dep := installCRUDDeployment(t, h, "dep-region-already-gone")

	rec := callCRUDInfra(t, h, http.MethodDelete, "regions/fsn1", dep.ID, nil,
		func(r chi.Router, h *Handler) {
			r.Delete("/api/v1/deployments/{depId}/infrastructure/{kind}/{id}", h.DeleteInfrastructureResource)
		})
	if rec.Code != http.StatusAccepted {
		t.Fatalf("status: got %d want 202; body=%s", rec.Code, rec.Body.String())
	}
	out := mustDecodeMutation(t, rec)
	if out.Status != "already-absent" {
		t.Fatalf("status: got %q want already-absent", out.Status)
	}
}

/* ── Audit-trail end-to-end: mutation Job materialised ───────── */
|
||||
|
||||
// TestCreateRegion_AuditJobMaterialised — after a successful create,
|
||||
// the catalyst-api MUST have committed a Job + Execution + LogLines
|
||||
// to the jobs Store under the current deployment id. This is the
|
||||
// audit-trail invariant: every Day-2 mutation is observable via the
|
||||
// existing /jobs surface.
|
||||
func TestCreateRegion_AuditJobMaterialised(t *testing.T) {
|
||||
dir := t.TempDir()
|
||||
js, err := jobs.NewStore(dir)
|
||||
if err != nil {
|
||||
t.Fatalf("jobs.NewStore: %v", err)
|
||||
}
|
||||
h := NewWithJobsStore(silentLogger(), js)
|
||||
h.dynamicFactory = fakeXRCDynamicFactory()
|
||||
dep := installCRUDDeployment(t, h, "dep-audit-region")
|
||||
|
||||
rec := callCRUDInfra(t, h, http.MethodPost, "regions", dep.ID,
|
||||
map[string]any{"region": "hel1", "skuCP": "cpx21", "workerCount": 2},
|
||||
func(r chi.Router, h *Handler) {
|
||||
r.Post("/api/v1/deployments/{depId}/infrastructure/regions", h.CreateInfrastructureRegion)
|
||||
})
|
||||
if rec.Code != http.StatusAccepted {
|
||||
t.Fatalf("status: got %d want 202; body=%s", rec.Code, rec.Body.String())
|
||||
}
|
||||
|
||||
jobsList, err := js.ListJobs(dep.ID)
|
||||
if err != nil {
|
||||
t.Fatalf("ListJobs: %v", err)
|
||||
}
|
||||
if len(jobsList) == 0 {
|
||||
t.Fatalf("expected at least one mutation Job committed; got none")
|
||||
}
|
||||
found := false
|
||||
for _, j := range jobsList {
|
||||
if strings.HasPrefix(j.JobName, jobs.MutationJobNamePrefix) && j.BatchID == jobs.BatchDay2Mutations {
|
||||
found = true
|
||||
if j.Status != jobs.StatusSucceeded {
|
||||
t.Fatalf("mutation Job status: got %q want %q", j.Status, jobs.StatusSucceeded)
|
||||
}
|
||||
}
|
||||
}
|
||||
if !found {
|
||||
t.Fatalf("expected a Job with name prefix %q and batch %q; got %+v", jobs.MutationJobNamePrefix, jobs.BatchDay2Mutations, jobsList)
|
||||
}
|
||||
}
|
||||
|
||||
/* ── Helpers ──────────────────────────────────────────────────── */

// newUnstructuredXRC builds a bare XRC unstructured suitable for
// pre-seeding the fake dynamic client. Only metadata.name +
// apiVersion + kind are populated; tests don't assert on .spec
// content.
func newUnstructuredXRC(kind, name string) *unstructured.Unstructured {
	u := &unstructured.Unstructured{}
	u.SetAPIVersion(infrastructure.XRCAPIGroup + "/" + infrastructure.XRCAPIVersion)
	u.SetKind(kind)
	u.SetNamespace(infrastructure.XRCNamespace)
	u.SetName(name)
	return u
}

// silenceUnused — keep the metav1 + context imports anchored even
// when the test file's surface evolves.
var (
	_ = metav1.ObjectMeta{}
	_ = context.Background
)
@ -1,7 +1,11 @@
// infrastructure_test.go — coverage for the Sovereign Infrastructure
// REST surface.
//
// The unified GET .../infrastructure/topology emits the hierarchical
// TopologyResponse shape (cloud → topology.regions[*] → clusters →
// vclusters | pools | nodes | LBs + storage). The legacy flat
// /compute, /storage, /network endpoints remain wired with their
// pre-existing shapes until the FE migrates to the unified topology.
package handler

import (
@ -14,6 +18,7 @@ import (

	"github.com/go-chi/chi/v5"

	"github.com/openova-io/openova/products/catalyst/bootstrap/api/internal/infrastructure"
	"github.com/openova-io/openova/products/catalyst/bootstrap/api/internal/provisioner"
)
@ -32,6 +37,7 @@ func installInfraDeployment(t *testing.T, h *Handler, status string) (*Deploymen
			ControlPlaneSize: "cpx21",
			WorkerSize:       "cpx41",
			WorkerCount:      2,
			HetznerProjectID: "test-project",
		},
		mu: sync.Mutex{},
	}
@ -63,6 +69,9 @@ func TestInfrastructureTopology_NotFound(t *testing.T) {
	}
}

// TestInfrastructureTopology_OKShape pins the unified hierarchical
// shape: cloud, topology.regions[*].clusters[*].nodes[*]/pools[*]/LBs,
// storage. The legacy flat nodes/edges shape is no longer emitted.
func TestInfrastructureTopology_OKShape(t *testing.T) {
	h := NewWithPDM(silentLogger(), &fakePDM{})
	dep, id := installInfraDeployment(t, h, "ready")
@ -76,37 +85,61 @@ func TestInfrastructureTopology_OKShape(t *testing.T) {
	if rec.Code != http.StatusOK {
		t.Fatalf("status: got %d want 200; body=%s", rec.Code, rec.Body.String())
	}
	var out infrastructure.TopologyResponse
	if err := json.Unmarshal(rec.Body.Bytes(), &out); err != nil {
		t.Fatalf("decode: %v", err)
	}

	// Cloud — exactly one tenant per provider.
	if len(out.Cloud) != 1 {
		t.Fatalf("expected 1 cloud tenant; got %d", len(out.Cloud))
	}
	if out.Cloud[0].Provider != "hetzner" {
		t.Fatalf("cloud provider: got %q want hetzner", out.Cloud[0].Provider)
	}
	if out.Cloud[0].ProjectID != "test-project" {
		t.Fatalf("cloud projectID: got %q want test-project", out.Cloud[0].ProjectID)
	}

	// Topology — pattern + 1 region (legacy singular path).
	if out.Topology.Pattern == "" {
		t.Fatalf("topology pattern is required")
	}
	if len(out.Topology.Regions) != 1 {
		t.Fatalf("expected 1 region; got %d", len(out.Topology.Regions))
	}
	rg := out.Topology.Regions[0]
	if rg.ProviderRegion != "fsn1" {
		t.Fatalf("region.providerRegion: got %q want fsn1", rg.ProviderRegion)
	}
	if rg.SkuCP != "cpx21" {
		t.Fatalf("region.skuCP: got %q want cpx21", rg.SkuCP)
	}
	if len(rg.Clusters) != 1 {
		t.Fatalf("expected 1 cluster per region; got %d", len(rg.Clusters))
	}
	c := rg.Clusters[0]
	if c.Name != "omantel.omani.works" {
		t.Fatalf("cluster.name: got %q want omantel.omani.works", c.Name)
	}
	// 1 cp + 2 workers
	if len(c.Nodes) != 3 {
		t.Fatalf("expected 3 nodes (1 cp + 2 workers); got %d", len(c.Nodes))
	}
	if len(c.LoadBalancers) != 1 {
		t.Fatalf("expected 1 LB when LoadBalancerIP set; got %d", len(c.LoadBalancers))
	}
	if c.LoadBalancers[0].PublicIP != "203.0.113.10" {
		t.Fatalf("lb publicIP: got %q want 203.0.113.10", c.LoadBalancers[0].PublicIP)
	}
	// node pools: 1 cp pool + 1 worker pool
	if len(c.NodePools) != 2 {
		t.Fatalf("expected 2 node pools (cp + worker); got %d", len(c.NodePools))
	}
}

// TestInfrastructureTopology_NoLBWhenAbsent — pre-LB-reconcile
// deployment must not surface a synthesised LB row.
func TestInfrastructureTopology_NoLBWhenAbsent(t *testing.T) {
	h := NewWithPDM(silentLogger(), &fakePDM{})
	_, id := installInfraDeployment(t, h, "provisioning")
@ -114,11 +147,56 @@ func TestInfrastructureTopology_NoLBWhenAbsent(t *testing.T) {
	if rec.Code != http.StatusOK {
		t.Fatalf("status: got %d want 200; body=%s", rec.Code, rec.Body.String())
	}
	var out infrastructure.TopologyResponse
	_ = json.Unmarshal(rec.Body.Bytes(), &out)
	if len(out.Topology.Regions) != 1 {
		t.Fatalf("expected 1 region; got %d", len(out.Topology.Regions))
	}
	if len(out.Topology.Regions[0].Clusters) == 0 {
		t.Fatal("expected 1 cluster")
	}
	if len(out.Topology.Regions[0].Clusters[0].LoadBalancers) != 0 {
		t.Fatalf("expected no LBs before LoadBalancerIP reported; got %+v", out.Topology.Regions[0].Clusters[0].LoadBalancers)
	}
}

// TestInfrastructureTopology_StorageEmptyFallback — storage arrays
// MUST serialise as `[]` (never null) so the FE can iterate them.
func TestInfrastructureTopology_StorageEmptyFallback(t *testing.T) {
	h := NewWithPDM(silentLogger(), &fakePDM{})
	_, id := installInfraDeployment(t, h, "ready")
	rec := callInfra(t, h, http.MethodGet, "topology", id, h.GetInfrastructureTopology)
	if rec.Code != http.StatusOK {
		t.Fatalf("status: got %d want 200; body=%s", rec.Code, rec.Body.String())
	}
	body := rec.Body.String()
	if !strings.Contains(body, `"pvcs":[]`) {
		t.Fatalf("storage.pvcs must serialise as []; body=%s", body)
	}
	if !strings.Contains(body, `"buckets":[]`) {
		t.Fatalf("storage.buckets must serialise as []; body=%s", body)
	}
	if !strings.Contains(body, `"volumes":[]`) {
		t.Fatalf("storage.volumes must serialise as []; body=%s", body)
	}
}

// TestInfrastructureTopology_PeeringsEmptyByDefault — when no
// PeeringClaim XRCs exist, the loader emits an empty peerings array.
func TestInfrastructureTopology_PeeringsEmptyByDefault(t *testing.T) {
	h := NewWithPDM(silentLogger(), &fakePDM{})
	_, id := installInfraDeployment(t, h, "ready")
	rec := callInfra(t, h, http.MethodGet, "topology", id, h.GetInfrastructureTopology)
	if rec.Code != http.StatusOK {
		t.Fatalf("status: got %d", rec.Code)
	}
	var out infrastructure.TopologyResponse
	_ = json.Unmarshal(rec.Body.Bytes(), &out)
	for _, rg := range out.Topology.Regions {
		for _, n := range rg.Networks {
			if n.Peerings == nil {
				t.Fatalf("network.peerings must be [] not null")
			}
		}
	}
}
@ -174,7 +252,6 @@ func TestInfrastructureStorage_OKEmpty(t *testing.T) {
	if len(out.PVCs) != 0 || len(out.Buckets) != 0 || len(out.Volumes) != 0 {
		t.Fatalf("expected empty arrays for live-cluster sourced data; got %+v", out)
	}
	// JSON arrays MUST be `[]` not `null` so the UI can iterate them.
	body := rec.Body.String()
	if !strings.Contains(body, `"pvcs":[]`) {
		t.Fatalf("pvcs field must serialise as `[]`, got body=%s", body)
@ -0,0 +1,623 @@
// topology_loader.go — composes the unified TopologyResponse from the
// three available data sources:
//
//  1. The deployment record's Phase-0 OpenTofu outputs (provisioner.
//     Result + Request) — always available post-Phase-0; carries
//     control-plane IP, load-balancer IP, declared region SKUs, and
//     declared worker counts.
//
//  2. The live Sovereign cluster's dynamic informer cache — populated
//     by the helmwatch.Watcher attached to this deployment. Reads
//     vcluster.io/v1alpha1 VClusters when the operator is installed
//     plus core/v1 PVCs from the live cluster.
//
//  3. The Crossplane managed-resource list — surfaces XRCs the
//     catalyst-api itself wrote. Populated by the same dynamic
//     client; empty when no claims exist.
//
// Per docs/INVIOLABLE-PRINCIPLES.md (no placeholder data), every
// per-source query that fails or returns nothing yields an empty
// slice on the response — never a synthesised row.
package infrastructure

import (
	"context"
	"strings"
	"time"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/runtime/schema"
	"k8s.io/client-go/dynamic"

	"github.com/openova-io/openova/products/catalyst/bootstrap/api/internal/provisioner"
)

// LoaderInput — the deployment-shaped data the handler hands to the
// loader. The loader does not import the handler package (would
// create a cycle); the handler unwraps Deployment fields onto this
// struct and calls Load.
type LoaderInput struct {
	DeploymentID     string
	Status           string // canonical UI status
	SovereignFQDN    string
	Provider         string
	Region           string
	Regions          []provisioner.RegionSpec
	WorkerCount      int
	WorkerSize       string
	CPSize           string
	Result           *provisioner.Result
	HetznerProjectID string

	// DynamicClient — Sovereign cluster dynamic client, built from
	// the persisted kubeconfig by the live-watcher. Nil when the
	// kubeconfig hasn't been posted back yet — the loader emits empty
	// arrays for live-source fields in that case.
	DynamicClient dynamic.Interface
}

// Load composes the unified TopologyResponse. The function is
// allocation-light by design — each slice is built once off the
// request shape, so the typical 1-region happy path costs roughly
// one allocation per region child.
func Load(ctx context.Context, in LoaderInput) TopologyResponse {
	if ctx == nil {
		ctx = context.Background()
	}
	cloud := buildCloud(in)
	topology := buildTopology(ctx, in)
	storage := buildStorage(ctx, in)
	return TopologyResponse{
		Cloud:    cloud,
		Topology: topology,
		Storage:  storage,
	}
}

// buildCloud — one tenant per cloud provider. Today every Sovereign
// runs against exactly one Hetzner project; multi-cloud will add
// per-provider entries.
func buildCloud(in LoaderInput) []CloudTenant {
	provider := in.Provider
	if provider == "" {
		provider = "hetzner"
	}
	tenant := CloudTenant{
		ID:        "cloud-" + provider,
		Provider:  provider,
		Name:      provider,
		Status:    in.Status,
		ProjectID: in.HetznerProjectID,
	}
	return []CloudTenant{tenant}
}

// buildTopology — pattern + per-region build-out. One Region row per
// Regions[*] entry; legacy single-region path uses the singular
// Request fields.
func buildTopology(ctx context.Context, in LoaderInput) TopologyData {
	pattern := derivePattern(in)

	regions := []Region{}
	if len(in.Regions) > 0 {
		for _, rs := range in.Regions {
			regions = append(regions, buildRegion(ctx, in, rs))
		}
	} else if in.Region != "" {
		// Legacy singular path — pre-multi-region wizard payload.
		legacy := provisioner.RegionSpec{
			Provider:         in.Provider,
			CloudRegion:      in.Region,
			ControlPlaneSize: in.CPSize,
			WorkerSize:       in.WorkerSize,
			WorkerCount:      in.WorkerCount,
		}
		regions = append(regions, buildRegion(ctx, in, legacy))
	}
	return TopologyData{
		Pattern: pattern,
		Regions: regions,
	}
}

func derivePattern(in LoaderInput) string {
	switch {
	case len(in.Regions) > 1:
		return "multi-region"
	case len(in.Regions) == 1 && in.Regions[0].WorkerCount >= 3:
		return "ha-pair"
	case len(in.Regions) == 1:
		return "solo"
	case in.Region != "":
		return "solo"
	default:
		return "unknown"
	}
}

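The switch above is order-sensitive: multi-region wins over ha-pair, which wins over solo, and the bare legacy Region string still maps to solo. A minimal standalone sketch (with a pared-down `regionSpec` stand-in for `provisioner.RegionSpec`, introduced here for illustration only) pins that precedence:

```go
package main

import "fmt"

// regionSpec is a hypothetical stand-in for provisioner.RegionSpec,
// carrying only the field the pattern derivation consults.
type regionSpec struct{ WorkerCount int }

// derivePattern mirrors the loader's switch: multi-region beats
// ha-pair beats solo; the legacy singular region string maps to solo.
func derivePattern(regions []regionSpec, legacyRegion string) string {
	switch {
	case len(regions) > 1:
		return "multi-region"
	case len(regions) == 1 && regions[0].WorkerCount >= 3:
		return "ha-pair"
	case len(regions) == 1:
		return "solo"
	case legacyRegion != "":
		return "solo"
	default:
		return "unknown"
	}
}

func main() {
	fmt.Println(derivePattern([]regionSpec{{2}, {2}}, "")) // multi-region
	fmt.Println(derivePattern([]regionSpec{{3}}, ""))      // ha-pair
	fmt.Println(derivePattern(nil, "fsn1"))                // solo
	fmt.Println(derivePattern(nil, ""))                    // unknown
}
```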
func buildRegion(ctx context.Context, in LoaderInput, rs provisioner.RegionSpec) Region {
	provider := rs.Provider
	if provider == "" {
		provider = "hetzner"
	}
	regionID := "region-" + rs.CloudRegion

	cluster := buildCluster(ctx, in, rs)
	networks := buildNetworks(ctx, in, rs)

	return Region{
		ID:             regionID,
		Name:           rs.CloudRegion,
		Provider:       provider,
		ProviderRegion: rs.CloudRegion,
		SkuCP:          rs.ControlPlaneSize,
		SkuWorker:      rs.WorkerSize,
		WorkerCount:    rs.WorkerCount,
		Status:         in.Status,
		Clusters:       []Cluster{cluster},
		Networks:       networks,
	}
}

func buildCluster(ctx context.Context, in LoaderInput, rs provisioner.RegionSpec) Cluster {
	clusterName := in.SovereignFQDN
	if clusterName == "" {
		dep := in.DeploymentID
		if len(dep) > 8 {
			dep = dep[:8]
		}
		clusterName = "cluster-" + dep
	}
	clusterID := "cluster-" + in.DeploymentID + "-" + rs.CloudRegion
	if rs.CloudRegion == "" {
		clusterID = "cluster-" + in.DeploymentID
	}

	nodes := buildNodes(in, rs)
	pools := buildNodePools(in, rs)
	lbs := buildLBs(in, rs)
	vclusters := loadVClusters(ctx, in)

	return Cluster{
		ID:            clusterID,
		Name:          clusterName,
		Version:       "v1.30",
		Status:        in.Status,
		NodeCount:     len(nodes),
		VClusters:     vclusters,
		LoadBalancers: lbs,
		NodePools:     pools,
		Nodes:         nodes,
	}
}

func buildNodes(in LoaderInput, rs provisioner.RegionSpec) []Node {
	out := []Node{}

	cpIP := ""
	if in.Result != nil {
		cpIP = in.Result.ControlPlaneIP
	}
	cpID := "node-cp-" + rs.CloudRegion
	if rs.CloudRegion == "" {
		cpID = "node-cp-" + in.DeploymentID
	}
	out = append(out, Node{
		ID:         cpID,
		Name:       "control-plane-" + rs.CloudRegion,
		SKU:        rs.ControlPlaneSize,
		Region:     rs.CloudRegion,
		Role:       "control-plane",
		IP:         cpIP,
		Status:     in.Status,
		NodePoolID: "pool-cp-" + rs.CloudRegion,
	})

	for i := 0; i < rs.WorkerCount; i++ {
		wID := "node-w-" + itoa(i) + "-" + rs.CloudRegion
		if rs.CloudRegion == "" {
			wID = "node-w-" + itoa(i) + "-" + in.DeploymentID
		}
		out = append(out, Node{
			ID:         wID,
			Name:       "worker-" + itoa(i+1) + "-" + rs.CloudRegion,
			SKU:        rs.WorkerSize,
			Region:     rs.CloudRegion,
			Role:       "worker",
			IP:         "",
			Status:     in.Status,
			NodePoolID: "pool-worker-" + rs.CloudRegion,
		})
	}
	return out
}

func buildNodePools(in LoaderInput, rs provisioner.RegionSpec) []NodePool {
	pools := []NodePool{
		{
			ID:          "pool-cp-" + rs.CloudRegion,
			Name:        "control-plane-" + rs.CloudRegion,
			Role:        "control-plane",
			SKU:         rs.ControlPlaneSize,
			Region:      rs.CloudRegion,
			DesiredSize: 1,
			CurrentSize: 1,
			Status:      in.Status,
		},
	}
	if rs.WorkerCount > 0 {
		pools = append(pools, NodePool{
			ID:          "pool-worker-" + rs.CloudRegion,
			Name:        "worker-" + rs.CloudRegion,
			Role:        "worker",
			SKU:         rs.WorkerSize,
			Region:      rs.CloudRegion,
			DesiredSize: rs.WorkerCount,
			CurrentSize: rs.WorkerCount,
			Status:      in.Status,
		})
	}
	return pools
}

func buildLBs(in LoaderInput, rs provisioner.RegionSpec) []LoadBalancer {
	if in.Result == nil || in.Result.LoadBalancerIP == "" {
		return []LoadBalancer{}
	}
	name := in.SovereignFQDN
	if name == "" {
		name = "ingress-lb"
	}
	return []LoadBalancer{{
		ID:           "lb-" + in.DeploymentID,
		Name:         name,
		PublicIP:     in.Result.LoadBalancerIP,
		Ports:        "80,443,6443",
		TargetHealth: "—",
		Region:       rs.CloudRegion,
		Status:       in.Status,
	}}
}

func buildNetworks(ctx context.Context, in LoaderInput, rs provisioner.RegionSpec) []Network {
	// Per-region VPC stamped by the Phase-0 module; follow-on
	// Day-2 PeeringClaim XRCs bind regions together. Today we
	// surface one Network per region with empty Peerings until the
	// Crossplane Composition lands and Peering objects exist.
	netID := "net-" + rs.CloudRegion + "-" + in.DeploymentID
	if rs.CloudRegion == "" {
		netID = "net-" + in.DeploymentID
	}
	return []Network{{
		ID:       netID,
		Name:     "vpc-" + rs.CloudRegion,
		CIDR:     "",
		Region:   rs.CloudRegion,
		Peerings: loadPeerings(ctx, in, rs),
		Firewall: nil,
		Status:   in.Status,
	}}
}

// loadVClusters — query the Sovereign cluster's vcluster.io/v1alpha1
// CRs. Returns an empty slice when the operator isn't installed
// (CRD doesn't exist) or when no vclusters have been provisioned.
//
// The recover guard tolerates fake-client panics in unit tests
// (k8s.io/client-go/dynamic/fake panics on unregistered list-kinds);
// production never hits this path because the real apiserver
// returns 404 instead of panicking.
func loadVClusters(ctx context.Context, in LoaderInput) (out []VCluster) {
	out = []VCluster{}
	defer func() {
		if r := recover(); r != nil {
			out = []VCluster{}
		}
	}()
	if in.DynamicClient == nil {
		return out
	}
	gvr := schema.GroupVersionResource{
		Group:    "vcluster.io",
		Version:  "v1alpha1",
		Resource: "vclusters",
	}
	cctx, cancel := context.WithTimeout(ctx, 5*time.Second)
	defer cancel()
	list, err := in.DynamicClient.Resource(gvr).Namespace("").List(cctx, metav1.ListOptions{})
	if err != nil || list == nil {
		return out
	}
	for _, item := range list.Items {
		role := vclusterRole(item.GetLabels())
		out = append(out, VCluster{
			ID:        "vcluster-" + item.GetNamespace() + "-" + item.GetName(),
			Name:      item.GetName(),
			Namespace: item.GetNamespace(),
			Role:      role,
			Status:    statusFromUnstructured(item.Object),
		})
	}
	return out
}

func vclusterRole(labels map[string]string) string {
	if v, ok := labels["catalyst.openova.io/role"]; ok && v != "" {
		return v
	}
	if v, ok := labels["building-block"]; ok && v != "" {
		return v
	}
	return "other"
}

// loadPeerings — query Crossplane PeeringClaim XRCs scoped to this
// deployment via the LabelDeploymentID selector.
//
// The recover guard tolerates fake-client panics in unit tests as
// described on loadVClusters.
func loadPeerings(ctx context.Context, in LoaderInput, rs provisioner.RegionSpec) (out []Peering) {
	out = []Peering{}
	defer func() {
		if r := recover(); r != nil {
			out = []Peering{}
		}
	}()
	if in.DynamicClient == nil {
		return out
	}
	gvr := gvrForKind(KindPeeringClaim)
	cctx, cancel := context.WithTimeout(ctx, 5*time.Second)
	defer cancel()
	list, err := in.DynamicClient.Resource(gvr).Namespace(XRCNamespace).List(cctx, metav1.ListOptions{
		LabelSelector: LabelDeploymentID + "=" + in.DeploymentID,
	})
	if err != nil || list == nil {
		return out
	}
	for _, item := range list.Items {
		spec, _, _ := nestedMap(item.Object, "spec")
		out = append(out, Peering{
			ID:      string(item.GetUID()),
			Name:    item.GetName(),
			VPCPair: stringField(spec, "vpcPair"),
			Subnets: stringField(spec, "subnets"),
			Status:  statusFromUnstructured(item.Object),
		})
	}
	return out
}

// buildStorage — PVCs from the live cluster + buckets/volumes from
// the Crossplane managed-resource list. Empty slices when sources
// aren't reachable.
func buildStorage(ctx context.Context, in LoaderInput) StorageData {
	return StorageData{
		PVCs:    loadPVCs(ctx, in),
		Buckets: []Bucket{},
		Volumes: []Volume{},
	}
}

func loadPVCs(ctx context.Context, in LoaderInput) (out []PVC) {
	out = []PVC{}
	defer func() {
		if r := recover(); r != nil {
			out = []PVC{}
		}
	}()
	if in.DynamicClient == nil {
		return out
	}
	gvr := schema.GroupVersionResource{
		Group:    "",
		Version:  "v1",
		Resource: "persistentvolumeclaims",
	}
	cctx, cancel := context.WithTimeout(ctx, 5*time.Second)
	defer cancel()
	list, err := in.DynamicClient.Resource(gvr).Namespace("").List(cctx, metav1.ListOptions{})
	if err != nil || list == nil {
		return out
	}
	for _, item := range list.Items {
		spec, _, _ := nestedMap(item.Object, "spec")
		status, _, _ := nestedMap(item.Object, "status")
		capacity := stringField(stringMapField(status, "capacity"), "storage")
		out = append(out, PVC{
			ID:           string(item.GetUID()),
			Name:         item.GetName(),
			Namespace:    item.GetNamespace(),
			Capacity:     capacity,
			Used:         "",
			StorageClass: stringField(spec, "storageClassName"),
			Status:       stringField(status, "phase"),
		})
	}
	return out
}

// CascadeFor — given a delete target (kind + id) and the current
// topology, lists the child resources that would be reaped. Used by
// the DELETE handler to populate the 202 response's Cascade slice.
func CascadeFor(kind, id string, topology TopologyResponse) []CascadeImpact {
	out := []CascadeImpact{}
	switch strings.ToLower(kind) {
	case "region":
		for _, rg := range topology.Topology.Regions {
			if rg.ID != id {
				continue
			}
			for _, c := range rg.Clusters {
				out = append(out, CascadeImpact{
					Kind: "cluster", ID: c.ID, Name: c.Name,
					Note: "cluster will drain + be reaped",
				})
				for _, np := range c.NodePools {
					out = append(out, CascadeImpact{
						Kind: "nodePool", ID: np.ID, Name: np.Name,
						Note: "node pool will be deleted",
					})
				}
				for _, n := range c.Nodes {
					out = append(out, CascadeImpact{
						Kind: "node", ID: n.ID, Name: n.Name,
						Note: "workloads will be drained",
					})
				}
				for _, lb := range c.LoadBalancers {
					out = append(out, CascadeImpact{
						Kind: "lb", ID: lb.ID, Name: lb.Name,
						Note: "load balancer will be released",
					})
				}
			}
			for _, n := range rg.Networks {
				out = append(out, CascadeImpact{
					Kind: "network", ID: n.ID, Name: n.Name,
					Note: "VPC will be released; peerings disconnected",
				})
				for _, p := range n.Peerings {
					out = append(out, CascadeImpact{
						Kind: "peering", ID: p.ID, Name: p.Name,
						Note: "peering will be torn down",
					})
				}
			}
		}
	case "cluster":
		for _, rg := range topology.Topology.Regions {
			for _, c := range rg.Clusters {
				if c.ID != id {
					continue
				}
				for _, np := range c.NodePools {
					out = append(out, CascadeImpact{Kind: "nodePool", ID: np.ID, Name: np.Name})
				}
				for _, n := range c.Nodes {
					out = append(out, CascadeImpact{Kind: "node", ID: n.ID, Name: n.Name})
				}
				for _, lb := range c.LoadBalancers {
					out = append(out, CascadeImpact{Kind: "lb", ID: lb.ID, Name: lb.Name})
				}
			}
		}
	case "nodepool", "pool":
		for _, rg := range topology.Topology.Regions {
			for _, c := range rg.Clusters {
				for _, np := range c.NodePools {
					if np.ID != id {
						continue
					}
					for _, n := range c.Nodes {
						if n.NodePoolID == np.ID {
							out = append(out, CascadeImpact{Kind: "node", ID: n.ID, Name: n.Name,
								Note: "node will be drained + cordoned"})
						}
					}
				}
			}
		}
	}
	// Always emit at least one descriptor so the FE confirm dialog
	// can render a row even when no children are observable.
	if len(out) == 0 {
		out = append(out, CascadeImpact{
			Kind: kind,
			ID:   id,
			Name: id,
			Note: "no observable child resources — proceeding will reap the underlying cloud resources",
		})
	}
	return out
}

/* ─── Helpers (no client-go mutation here; reads only) ─── */

func itoa(n int) string {
	if n == 0 {
		return "0"
	}
	neg := false
	if n < 0 {
		neg = true
		n = -n
	}
	var buf [20]byte
	i := len(buf)
	for n > 0 {
		i--
		buf[i] = byte('0' + n%10)
		n /= 10
	}
	if neg {
		i--
		buf[i] = '-'
	}
	return string(buf[i:])
}

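A quick standalone check of the formatter above (copied verbatim so the sketch runs on its own). One caveat worth noting: `n = -n` would overflow on `math.MinInt`, but the loader only ever feeds it small worker indices, so that edge never fires here.

```go
package main

import "fmt"

// itoa is a verbatim copy of the loader's allocation-free integer
// formatter, reproduced so this sketch compiles standalone.
func itoa(n int) string {
	if n == 0 {
		return "0"
	}
	neg := false
	if n < 0 {
		neg = true
		n = -n
	}
	var buf [20]byte
	i := len(buf)
	for n > 0 {
		i--
		buf[i] = byte('0' + n%10)
		n /= 10
	}
	if neg {
		i--
		buf[i] = '-'
	}
	return string(buf[i:])
}

func main() {
	fmt.Println(itoa(0))  // 0
	fmt.Println(itoa(42)) // 42
	fmt.Println(itoa(-7)) // -7
}
```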
func nestedMap(obj map[string]any, path ...string) (map[string]any, bool, error) {
	cur := obj
	for _, p := range path {
		v, ok := cur[p]
		if !ok {
			return nil, false, nil
		}
		m, ok := v.(map[string]any)
		if !ok {
			return nil, false, nil
		}
		cur = m
	}
	return cur, true, nil
}

func stringField(m map[string]any, key string) string {
	if m == nil {
		return ""
	}
	if v, ok := m[key]; ok {
		if s, ok := v.(string); ok {
			return s
		}
	}
	return ""
}

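These two accessors walk an unstructured object's `map[string]any` tree without panicking on missing or mistyped keys; a missing path or a nil map degrades to empty values, matching the loader's empty-fallback posture. A standalone sketch (both helpers copied verbatim so it runs alone):

```go
package main

import "fmt"

// nestedMap and stringField are verbatim copies of the loader
// helpers above, reproduced so this sketch compiles standalone.
func nestedMap(obj map[string]any, path ...string) (map[string]any, bool, error) {
	cur := obj
	for _, p := range path {
		v, ok := cur[p]
		if !ok {
			return nil, false, nil
		}
		m, ok := v.(map[string]any)
		if !ok {
			return nil, false, nil
		}
		cur = m
	}
	return cur, true, nil
}

func stringField(m map[string]any, key string) string {
	if m == nil {
		return ""
	}
	if v, ok := m[key]; ok {
		if s, ok := v.(string); ok {
			return s
		}
	}
	return ""
}

func main() {
	obj := map[string]any{
		"spec": map[string]any{"storageClassName": "hcloud-volumes"},
	}
	spec, found, _ := nestedMap(obj, "spec")
	fmt.Println(found, stringField(spec, "storageClassName")) // true hcloud-volumes

	// Missing branch: no panic, just not-found plus empty values.
	missing, found2, _ := nestedMap(obj, "status")
	fmt.Println(found2, stringField(missing, "phase") == "")
}
```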
func stringMapField(m map[string]any, key string) map[string]any {
	if m == nil {
		return nil
	}
	if v, ok := m[key]; ok {
		if mm, ok := v.(map[string]any); ok {
			return mm
		}
	}
	return nil
}

func statusFromUnstructured(obj map[string]any) string {
	status, _, _ := nestedMap(obj, "status")
	if status == nil {
		return "unknown"
	}
	if phase := stringField(status, "phase"); phase != "" {
		return phase
	}
	if cs, ok := status["conditions"].([]any); ok {
		for _, c := range cs {
			cm, ok := c.(map[string]any)
			if !ok {
				continue
			}
			if stringField(cm, "type") == "Ready" {
				if stringField(cm, "status") == "True" {
					return "healthy"
				}
				return strings.ToLower(stringField(cm, "reason"))
			}
		}
	}
	return "unknown"
}
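The precedence in `statusFromUnstructured` is: an explicit `status.phase` wins outright; otherwise a `Ready` condition maps `True` to "healthy" and anything else to its lower-cased reason; with neither present the result is "unknown". A condensed standalone sketch of that scan (simplified stand-in, not the full helper, so it runs alone):

```go
package main

import (
	"fmt"
	"strings"
)

// readyStatus condenses the condition scan: phase wins, then the
// Ready condition, then "unknown". A simplified illustration of the
// loader's statusFromUnstructured, operating directly on the status
// sub-map rather than the whole object.
func readyStatus(status map[string]any) string {
	if phase, ok := status["phase"].(string); ok && phase != "" {
		return phase
	}
	if cs, ok := status["conditions"].([]any); ok {
		for _, c := range cs {
			cm, ok := c.(map[string]any)
			if !ok {
				continue
			}
			if cm["type"] == "Ready" {
				if cm["status"] == "True" {
					return "healthy"
				}
				reason, _ := cm["reason"].(string)
				return strings.ToLower(reason)
			}
		}
	}
	return "unknown"
}

func main() {
	fmt.Println(readyStatus(map[string]any{"phase": "Bound"})) // Bound
	fmt.Println(readyStatus(map[string]any{
		"conditions": []any{map[string]any{"type": "Ready", "status": "True"}},
	})) // healthy
	fmt.Println(readyStatus(map[string]any{
		"conditions": []any{map[string]any{"type": "Ready", "status": "False", "reason": "Creating"}},
	})) // creating
	fmt.Println(readyStatus(map[string]any{})) // unknown
}
```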
297
products/catalyst/bootstrap/api/internal/infrastructure/types.go
Normal file
@ -0,0 +1,297 @@
// Package infrastructure carries the wire-contract types + Crossplane
// XRC writer helpers + topology loader the catalyst-api Sovereign
// Infrastructure surface emits.
//
// # Architectural rule (docs/INVIOLABLE-PRINCIPLES.md #3)
//
// All Day-2 mutations MUST go through Crossplane. The catalyst-api
// writes a Crossplane Composite Resource Claim (XRC) into the Sovereign
// cluster (NOT contabo-mkt) and returns 202. The Crossplane provider
// does the cloud work. The UI watches the Job that wraps the XRC
// submission via the existing Jobs/Executions surface (issue #205).
//
// catalyst-api MUST NOT call hcloud-go, NEVER `exec.Command("kubectl",
// ...)` for mutation, NEVER use client-go for direct mutation of cluster
// resources outside the XRC-write path. The dynamic client created from
// the deployment's persisted kubeconfig is the ONLY mutation seam.
//
// # Wire contract — TopologyResponse
//
// The unified GET /api/v1/deployments/{id}/infrastructure/topology
// returns the WHOLE hierarchical tree in ONE shape so the four
// frontend tabs (Topology / Compute / Storage / Network) all render
// filtered views off a single response — no per-tab fetches, no
// cross-fetch coordination state on the client side.
//
// # Empty fallback (founder principle: never placeholder data)
//
// When live data isn't available (vCluster CRs not present, peerings
// not provisioned yet), the loader returns empty arrays — never
// placeholder rows. The frontend's empty-card UX is the canonical
// surface for that state.
package infrastructure

import "time"
|
||||
|
||||
// TopologyResponse — unified hierarchical view of the Sovereign's
// infrastructure. The four tabs filter views off this single shape.
type TopologyResponse struct {
	// Cloud — list of cloud-provider tenants behind this Sovereign.
	// Today we model one Hetzner project per deployment; future
	// multi-cloud Sovereigns will surface multiple tenants here.
	Cloud []CloudTenant `json:"cloud"`

	// Topology — pattern + per-region layout. The wizard's BYO Flow B
	// + the multi-region per-provider rework make this the canonical
	// shape: a Sovereign is N regions × clusters × node-pools wide.
	Topology TopologyData `json:"topology"`

	// Storage — Persistent Volume Claims, S3-compatible buckets, and
	// raw block Volumes attached across the topology. Aggregated here
	// so the Storage tab renders without a second round-trip.
	Storage StorageData `json:"storage"`
}

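A minimal instance of the wire shape, trimmed to one region with empty children (all values illustrative, matching the `json` tags below):

```json
{
  "cloud": [
    { "id": "hcloud-1", "provider": "hetzner", "name": "Sovereign A", "status": "healthy" }
  ],
  "topology": {
    "pattern": "solo",
    "regions": [
      {
        "id": "r-1", "name": "fsn1", "provider": "hetzner", "providerRegion": "fsn1",
        "skuCP": "cpx21", "skuWorker": "cpx41", "workerCount": 2, "status": "healthy",
        "clusters": [], "networks": []
      }
    ]
  },
  "storage": { "pvcs": [], "buckets": [], "volumes": [] }
}
```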
// CloudTenant — one cloud-provider account/project this Sovereign
// runs against. The catalyst-api derives this from the deployment
// record's Request (e.g. HetznerProjectID) — credentials never flow
// into the response.
type CloudTenant struct {
	ID       string `json:"id"`
	Provider string `json:"provider"` // hetzner | oci | aws | ...
	Name     string `json:"name"`     // human label

	// ProjectID — opaque cloud-side identifier (e.g. Hetzner project
	// number). Read-only metadata; never treated as a credential.
	ProjectID string `json:"projectID,omitempty"`

	// Status mirrors the deployment's overall status — healthy when
	// the Sovereign is ready, unknown while provisioning, failed on
	// terminal failure.
	Status string `json:"status"`
}

// TopologyData — pattern + regions list. Pattern derives from the
// number of regions and HA flag: solo (1 region, 1 cluster), ha-pair
// (1 region, HA), multi-region (N>1 regions), air-gap (BYO + isolated).
type TopologyData struct {
	Pattern string   `json:"pattern"`
	Regions []Region `json:"regions"`
}

// Region — one cloud region within the Sovereign's deployment. Carries
// the per-region SKU + worker count plus the live cluster + node-pool
// + network state read from the informer cache when available.
type Region struct {
	ID             string `json:"id"`
	Name           string `json:"name"`
	Provider       string `json:"provider"`
	ProviderRegion string `json:"providerRegion"` // e.g. fsn1, hel1, ash

	// SkuCP / SkuWorker — Hetzner server type slug (cpx21, cpx41).
	// Empty when the deployment hasn't reached Validate yet.
	SkuCP     string `json:"skuCP"`
	SkuWorker string `json:"skuWorker"`

	// WorkerCount — declared worker count for this region. The
	// frontend renders this on the region card; the live node count
	// comes from len(Clusters[*].Nodes).
	WorkerCount int `json:"workerCount"`

	// Status — healthy | degraded | failed | unknown. Pre-Phase-0
	// deployments emit unknown.
	Status string `json:"status"`

	Clusters []Cluster `json:"clusters"`
	Networks []Network `json:"networks"`
}

// Cluster — one Kubernetes cluster within a region. The OWNING
// Sovereign always has exactly one host cluster today; future
// multi-cluster Sovereigns will surface additional entries here.
type Cluster struct {
	ID      string `json:"id"`
	Name    string `json:"name"`
	Version string `json:"version"`
	Status  string `json:"status"`

	// NodeCount — total nodes (control-plane + workers) across all
	// node-pools on this cluster. Computed from Nodes when the live
	// cluster informer cache is populated; falls back to the
	// declared count from the deployment record otherwise.
	NodeCount int `json:"nodeCount"`

	VClusters     []VCluster     `json:"vclusters"`
	LoadBalancers []LoadBalancer `json:"loadBalancers"`
	NodePools     []NodePool     `json:"nodePools"`
	Nodes         []Node         `json:"nodes"`
}

// VCluster — a vcluster.io v1alpha1 virtual cluster running on the
// host cluster. Used by Catalyst's DMZ / RTZ / MGMT building-block
// layout. Populated only when the vcluster operator is installed AND
// at least one VCluster CR exists; otherwise the slice is empty.
type VCluster struct {
	ID        string `json:"id"`
	Name      string `json:"name"`
	Namespace string `json:"namespace"`
	Role      string `json:"role"` // dmz | rtz | mgmt | other
	Status    string `json:"status"`
}

// LoadBalancer — Hetzner cloud LB (or future multi-cloud equivalent)
// attached to the cluster. Today a Sovereign has exactly one LB
// fronting the ingress controller; future multi-LB topologies surface
// here.
type LoadBalancer struct {
	ID           string `json:"id"`
	Name         string `json:"name"`
	PublicIP     string `json:"publicIP"`
	Ports        string `json:"ports"`
	TargetHealth string `json:"targetHealth"`
	Region       string `json:"region"`
	Status       string `json:"status"`
}

// NodePool — a logical group of identically-sized worker nodes the
// catalyst-environment-controller can scale up/down via the
// NodePoolClaim XRC. The Phase-0 OpenTofu module emits one
// control-plane pool + one worker pool per region; Day-2 mutations
// add additional pools.
type NodePool struct {
	ID          string `json:"id"`
	Name        string `json:"name"`
	Role        string `json:"role"` // control-plane | worker
	SKU         string `json:"sku"`  // server type slug
	Region      string `json:"region"`
	DesiredSize int    `json:"desiredSize"` // declared size
	CurrentSize int    `json:"currentSize"` // observed from informer
	Status      string `json:"status"`
}

// Node — one Kubernetes node. Surfaced from the live cluster's
// informer cache; pre-cluster deployments synthesise nodes from the
// declared topology for the canvas to render.
type Node struct {
	ID         string `json:"id"`
	Name       string `json:"name"`
	SKU        string `json:"sku"`
	Region     string `json:"region"`
	Role       string `json:"role"`
	IP         string `json:"ip"`
	Status     string `json:"status"`
	NodePoolID string `json:"nodePoolID,omitempty"`
}

// Network — one cloud network / VPC plus its peerings + firewalls.
// Multi-region Sovereigns peer their per-region networks through this
// shape; the third-sibling Crossplane PeeringClaim Composition writes
// the actual peering object.
type Network struct {
	ID       string         `json:"id"`
	Name     string         `json:"name"`
	CIDR     string         `json:"cidr"`
	Region   string         `json:"region"`
	Peerings []Peering      `json:"peerings"`
	Firewall *FirewallRules `json:"firewall,omitempty"`
	Status   string         `json:"status"`
}

// Peering — one VPC-to-VPC peering edge. Status mirrors the cloud
// provider's terminal state (active | degraded | failed | unknown).
type Peering struct {
	ID      string `json:"id"`
	Name    string `json:"name"`
	VPCPair string `json:"vpcPair"`
	Subnets string `json:"subnets"`
	Status  string `json:"status"`
}

// FirewallRules — collected ingress/egress rules a cloud firewall
// applies to its attached resources. Surfaced as a dedicated child of
// Network so the Network tab renders rule chips inline with the VPC
// card.
type FirewallRules struct {
	ID    string         `json:"id"`
	Name  string         `json:"name"`
	Rules []FirewallRule `json:"rules"`
}

// FirewallRule — a single allow/deny rule. The IP-list field is
// rendered as a comma-separated string in the UI; a slice would tempt
// the React grid to scroll horizontally for /32 enumerations.
type FirewallRule struct {
	ID        string `json:"id"`
	Direction string `json:"direction"` // in | out
	Protocol  string `json:"protocol"`  // tcp | udp | icmp
	Port      string `json:"port"`      // empty for icmp
	Sources   string `json:"sources"`   // CIDR list (CSV)
	Action    string `json:"action"`    // accept | drop
}

// StorageData — aggregate storage view across the whole topology.
// PVCs come from the live cluster (when reachable); buckets +
// volumes come from cloud-provider state via Crossplane managed-
// resource status.
type StorageData struct {
	PVCs    []PVC    `json:"pvcs"`
	Buckets []Bucket `json:"buckets"`
	Volumes []Volume `json:"volumes"`
}

// PVC — one Kubernetes PersistentVolumeClaim. Fields mirror the
// existing infraPVCItem shape so the UI's PVC card renders unchanged.
type PVC struct {
	ID           string `json:"id"`
	Name         string `json:"name"`
	Namespace    string `json:"namespace"`
	Capacity     string `json:"capacity"`
	Used         string `json:"used"`
	StorageClass string `json:"storageClass"`
	Status       string `json:"status"`
}

// Bucket — one S3-compatible object bucket (Hetzner Object Storage,
// SeaweedFS volume server, etc.).
type Bucket struct {
	ID            string `json:"id"`
	Name          string `json:"name"`
	Endpoint      string `json:"endpoint"`
	Capacity      string `json:"capacity"`
	Used          string `json:"used"`
	RetentionDays string `json:"retentionDays"`
}

// Volume — one cloud block-storage volume (Hetzner Volume).
type Volume struct {
	ID         string `json:"id"`
	Name       string `json:"name"`
	Capacity   string `json:"capacity"`
	Region     string `json:"region"`
	AttachedTo string `json:"attachedTo"`
	Status     string `json:"status"`
}

// MutationResponse — uniform 202 Accepted shape every CRUD endpoint
// emits. The frontend keys off jobId to deep-link to the
// GitLab-style log viewer in the Jobs surface.
type MutationResponse struct {
	JobID       string    `json:"jobId"`
	XRCKind     string    `json:"xrcKind"`
	XRCName     string    `json:"xrcName"`
	Status      string    `json:"status"`
	SubmittedAt time.Time `json:"submittedAt"`

	// Cascade — populated only on DELETE. Lists the child resources
	// the cascade would affect so the FE confirm dialog can render
	// "deleting region X will drain Y workloads, remove Z PVCs".
	Cascade []CascadeImpact `json:"cascade,omitempty"`
}

// CascadeImpact — one row in a delete cascade preview.
type CascadeImpact struct {
	Kind string `json:"kind"` // region | cluster | nodePool | node | pvc | volume | lb | peering
	ID   string `json:"id"`
	Name string `json:"name"`
	Note string `json:"note,omitempty"` // operator-readable detail
}

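A DELETE on a region therefore answers with a 202 body shaped like this (all values illustrative):

```json
{
  "jobId": "job-mutation-remove-region",
  "xrcKind": "RegionClaim",
  "xrcName": "ce476aaf-region-hel1",
  "status": "submitted-pending-composition",
  "submittedAt": "2025-01-01T00:00:00Z",
  "cascade": [
    { "kind": "cluster", "id": "c-1", "name": "hel1-host", "note": "drains 4 workloads" },
    { "kind": "pvc", "id": "pvc-1", "name": "data-0", "note": "removed with cluster" }
  ]
}
```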
products/catalyst/bootstrap/api/internal/infrastructure/xrc.go (new file, 330 lines)
@@ -0,0 +1,330 @@
// xrc.go — Crossplane Composite Resource Claim writer helpers.
//
// Per docs/INVIOLABLE-PRINCIPLES.md #3 every Day-2 mutation MUST be
// expressed as a Crossplane XRC submission. This file is the single
// seam through which the catalyst-api writes those XRCs against the
// SOVEREIGN cluster (NOT contabo-mkt).
//
// The seam takes a dynamic.Interface (the Sovereign's dynamic client,
// built from the deployment's persisted kubeconfig) and a typed
// XRCSpec; it produces the unstructured object, performs the create
// against Crossplane's Composite Resource Claim API, and returns the
// XRC's namespace + name + GVK so the handler can surface them in
// the 202 response and write the audit-trail Job entry.
//
// # Why dynamic.Interface, not typed clients
//
// Crossplane Compositions are author-time; the catalyst-api is
// consumer-time. We write claims against API groups (e.g.
// `infra.openova.io/v1alpha1`) whose typed Go schemas DON'T live
// in this repo — the third-sibling agent owns them in their own
// chart. A dynamic client lets us write claims by group/version/kind
// without compiling against generated types.
//
// # When the Composition isn't ready yet
//
// If the Composition for a given XRC kind doesn't exist on the
// Sovereign cluster yet (the third-sibling agent hasn't finished
// authoring + applying their chart), the create still succeeds —
// Crossplane stores the claim and leaves it Pending. The catalyst-
// api emits a Job log line saying "Awaiting Crossplane Composition
// for <kind>" and returns the same 202.
package infrastructure

import (
	"context"
	"errors"
	"fmt"
	"strings"
	"time"

	apierrors "k8s.io/apimachinery/pkg/api/errors"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/apis/meta/v1/unstructured"
	"k8s.io/apimachinery/pkg/runtime/schema"
	"k8s.io/client-go/dynamic"
)

// XRCAPIGroup — the canonical Crossplane API group catalyst-api
// writes claims under. The third-sibling chart's Compositions match
// this group + version + per-kind names. Per docs/INVIOLABLE-PRINCIPLES.md
// #4 the group name is centralised here so a future migration to a
// different group only changes one line.
const (
	XRCAPIGroup   = "infra.openova.io"
	XRCAPIVersion = "v1alpha1"
)

// XRC kind constants. Each maps to one Composition the third-sibling
// agent authors. Keep these in lockstep with the Composition manifest
// metadata.name values; mismatches surface as Pending claims with no
// reconciliation progress.
const (
	KindRegionClaim       = "RegionClaim"
	KindClusterClaim      = "ClusterClaim"
	KindVClusterClaim     = "VClusterClaim"
	KindNodePoolClaim     = "NodePoolClaim"
	KindLoadBalancerClaim = "LoadBalancerClaim"
	KindPeeringClaim      = "PeeringClaim"
	KindFirewallRuleClaim = "FirewallRuleClaim"
	KindNodeActionClaim   = "NodeActionClaim"
)

// XRCNamespace — the namespace catalyst-api submits all claims into.
// Crossplane Composite Resource Claims are namespace-scoped; the
// third-sibling agent's chart provisions ServiceAccount RBAC + the
// underlying ProviderConfig in this namespace.
const XRCNamespace = "catalyst-day2"

// LabelDeploymentID + LabelOwner — every claim catalyst-api writes
// carries these labels so an operator can trace a claim back to the
// deployment + the catalyst-api Pod that submitted it.
const (
	LabelDeploymentID = "catalyst.openova.io/deployment-id"
	LabelOwner        = "catalyst.openova.io/owner"
	LabelOwnerValue   = "catalyst-api"
	AnnotationAction  = "catalyst.openova.io/action"
	AnnotationDiff    = "catalyst.openova.io/diff"
)

// XRCSpec — the typed payload one CRUD endpoint passes to SubmitXRC.
// Spec is a free-form map matching the XRD's schema; the helper
// stamps apiVersion + kind + metadata. Keeping this loose lets the
// CRUD handlers compose the same shape the third-sibling Composition
// expects without a per-kind Go type explosion.
type XRCSpec struct {
	Kind         string
	Name         string
	DeploymentID string
	Action       string // human label e.g. "add-region", "remove-pool"
	Diff         string // unified-diff or short ASCII summary

	// Spec — the XRD's spec subtree as a map. The helper marshals it
	// under .spec on the unstructured object.
	Spec map[string]any
}

// ErrXRCNameConflict — surfaced when a claim with the same name +
// namespace already exists. The CRUD handler maps this onto HTTP 409
// so the wizard can surface "this region is already provisioned".
var ErrXRCNameConflict = errors.New("infrastructure: xrc name conflict")

// SubmitXRC writes the XRC to the Sovereign cluster. Returns the
// gvr + the unstructured object (with the cluster-stamped UID +
// resourceVersion populated) so the handler can include them in the
// 202 response. Caller MUST hold no locks.
//
// The helper is safe to retry: a second call with the same
// .metadata.name within the same namespace returns ErrXRCNameConflict
// — the handler surfaces 409 and instructs the operator to use PATCH
// for in-place updates instead.
func SubmitXRC(ctx context.Context, client dynamic.Interface, spec XRCSpec) (*unstructured.Unstructured, schema.GroupVersionResource, error) {
	if client == nil {
		return nil, schema.GroupVersionResource{}, errors.New("infrastructure: dynamic client is required (sovereign cluster unreachable)")
	}
	if strings.TrimSpace(spec.Kind) == "" {
		return nil, schema.GroupVersionResource{}, errors.New("infrastructure: XRC kind is required")
	}
	if strings.TrimSpace(spec.Name) == "" {
		return nil, schema.GroupVersionResource{}, errors.New("infrastructure: XRC name is required")
	}

	gvr := gvrForKind(spec.Kind)
	obj := &unstructured.Unstructured{}
	obj.SetAPIVersion(XRCAPIGroup + "/" + XRCAPIVersion)
	obj.SetKind(spec.Kind)
	obj.SetName(spec.Name)
	obj.SetNamespace(XRCNamespace)
	obj.SetLabels(map[string]string{
		LabelOwner:        LabelOwnerValue,
		LabelDeploymentID: spec.DeploymentID,
	})
	annotations := map[string]string{}
	if spec.Action != "" {
		annotations[AnnotationAction] = spec.Action
	}
	if spec.Diff != "" {
		annotations[AnnotationDiff] = spec.Diff
	}
	if len(annotations) > 0 {
		obj.SetAnnotations(annotations)
	}
	if spec.Spec != nil {
		// k8s.io/apimachinery's DeepCopyJSONValue only accepts the
		// JSON-typed scalars (string, bool, float64, int64, nil) and
		// recursively-typed slices/maps. Go `int` panics — so we
		// normalise the spec tree before SetNestedMap.
		if err := unstructured.SetNestedMap(obj.Object, normaliseJSONMap(spec.Spec), "spec"); err != nil {
			return nil, gvr, fmt.Errorf("infrastructure: set spec on %s/%s: %w", spec.Kind, spec.Name, err)
		}
	}

	created, err := client.Resource(gvr).Namespace(XRCNamespace).Create(ctx, obj, metav1.CreateOptions{})
	if err != nil {
		if apierrors.IsAlreadyExists(err) {
			return nil, gvr, fmt.Errorf("%w: %s/%s", ErrXRCNameConflict, spec.Kind, spec.Name)
		}
		return nil, gvr, fmt.Errorf("infrastructure: create %s/%s: %w", spec.Kind, spec.Name, err)
	}
	return created, gvr, nil
}

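For orientation, the object a SubmitXRC call produces looks roughly like this on the wire. The metadata comes from the helper; the spec subtree is whatever the XRD for the kind defines, so the spec fields here are purely illustrative:

```yaml
apiVersion: infra.openova.io/v1alpha1
kind: RegionClaim
metadata:
  name: ce476aaf-region-hel1
  namespace: catalyst-day2
  labels:
    catalyst.openova.io/owner: catalyst-api
    catalyst.openova.io/deployment-id: ce476aaf80731a46
  annotations:
    catalyst.openova.io/action: add-region
spec:
  providerRegion: hel1   # illustrative — the XRD owns this shape
  workerCount: 2
```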
// DeleteXRC marks the named XRC for deletion. Deleting the claim
// propagates to the composite, and Crossplane reaps the underlying
// cloud resources per each managed resource's deletionPolicy.
// Returns ErrXRCNameConflict when the claim doesn't exist (the
// operator should refresh the topology before retrying).
func DeleteXRC(ctx context.Context, client dynamic.Interface, kind, name string) (schema.GroupVersionResource, error) {
	if client == nil {
		return schema.GroupVersionResource{}, errors.New("infrastructure: dynamic client is required (sovereign cluster unreachable)")
	}
	gvr := gvrForKind(kind)
	policy := metav1.DeletePropagationForeground
	err := client.Resource(gvr).Namespace(XRCNamespace).Delete(ctx, name, metav1.DeleteOptions{
		PropagationPolicy: &policy,
	})
	if err != nil {
		if apierrors.IsNotFound(err) {
			return gvr, fmt.Errorf("%w: %s/%s not found", ErrXRCNameConflict, kind, name)
		}
		return gvr, fmt.Errorf("infrastructure: delete %s/%s: %w", kind, name, err)
	}
	return gvr, nil
}

// gvrForKind — derives the plural resource segment from the Kind:
// lowercase the full kind, then append "s", matching the convention
// of pluralising the whole claim kind. RegionClaim → regionclaims.
// The helper centralises the rule so a future kind that violates it
// surfaces here.
func gvrForKind(kind string) schema.GroupVersionResource {
	plural := strings.ToLower(kind)
	if !strings.HasSuffix(plural, "s") {
		plural += "s"
	}
	return schema.GroupVersionResource{
		Group:    XRCAPIGroup,
		Version:  XRCAPIVersion,
		Resource: plural,
	}
}

// XRCName composes the deterministic XRC name catalyst-api submits.
// The same shape (deployment-id-prefix + verb + slug) is used for
// every claim so the operator can grep across the cluster for a
// single deployment's claims.
//
// E.g. depID="ce476aaf80731a46", verb="region", slug="hel1" →
// "ce476aaf-region-hel1"
func XRCName(deploymentID, verb, slug string) string {
	dep := strings.TrimSpace(deploymentID)
	if len(dep) > 8 {
		dep = dep[:8]
	}
	verb = strings.ToLower(strings.TrimSpace(verb))
	slug = strings.ToLower(strings.TrimSpace(slug))
	parts := []string{}
	if dep != "" {
		parts = append(parts, dep)
	}
	if verb != "" {
		parts = append(parts, verb)
	}
	if slug != "" {
		parts = append(parts, slug)
	}
	out := strings.Join(parts, "-")
	// Crossplane claim names follow DNS-1123 subdomain rules — replace
	// any disallowed characters with '-' and clamp at 63 chars,
	// re-trimming so the clamp can't leave a trailing '-' or '.'.
	out = sanitizeDNS1123(out)
	if len(out) > 63 {
		out = strings.Trim(out[:63], "-.")
	}
	return out
}

func sanitizeDNS1123(in string) string {
	var b strings.Builder
	for _, r := range in {
		switch {
		case r >= 'a' && r <= 'z':
			b.WriteRune(r)
		case r >= '0' && r <= '9':
			b.WriteRune(r)
		case r == '-' || r == '.':
			b.WriteRune(r)
		case r >= 'A' && r <= 'Z':
			b.WriteRune(r + 32) // lowercase ASCII letters
		default:
			b.WriteRune('-')
		}
	}
	out := b.String()
	out = strings.Trim(out, "-.")
	if out == "" {
		out = "x"
	}
	return out
}

// SubmittedAt — UTC instant the CRUD handlers stamp into the 202
// response. A package-level func variable so tests can inject a fake
// clock.
var SubmittedAt = func() time.Time {
	return time.Now().UTC()
}

// normaliseJSONMap walks a map[string]any and converts every leaf
// value to a JSON-compatible scalar (string, bool, int64, float64,
// nil) plus recursively normalised maps/slices. The k8s.io
// apimachinery's DeepCopyJSONValue panics on Go `int` (no JSON
// equivalent) — we hit that path through unstructured.SetNestedMap.
// Centralising the conversion here keeps the per-handler spec
// blocks readable (`map[string]any{"workerCount": body.WorkerCount}`)
// while the wire-level shape stays JSON-strict.
func normaliseJSONMap(in map[string]any) map[string]any {
	out := make(map[string]any, len(in))
	for k, v := range in {
		out[k] = normaliseJSONValue(v)
	}
	return out
}

func normaliseJSONValue(v any) any {
	switch x := v.(type) {
	case nil, bool, string, int64, float64:
		return x
	case int:
		return int64(x)
	case int8:
		return int64(x)
	case int16:
		return int64(x)
	case int32:
		return int64(x)
	case uint:
		return int64(x)
	case uint8:
		return int64(x)
	case uint16:
		return int64(x)
	case uint32:
		return int64(x)
	case uint64:
		return int64(x)
	case float32:
		return float64(x)
	case map[string]any:
		return normaliseJSONMap(x)
	case []any:
		out := make([]any, len(x))
		for i := range x {
			out[i] = normaliseJSONValue(x[i])
		}
		return out
	case []string:
		out := make([]any, len(x))
		for i := range x {
			out[i] = x[i]
		}
		return out
	default:
		// Fallback — render unknown types as their fmt.Sprint string
		// so the XRC carries a valid JSON value and the operator can
		// see what was attempted in the resulting Pending claim. The
		// alternative (drop the field) would silently corrupt the
		// audit trail.
		return fmt.Sprintf("%v", x)
	}
}

products/catalyst/bootstrap/api/internal/jobs/mutation_bridge.go (new file, 267 lines)
@@ -0,0 +1,267 @@
// mutation_bridge.go — Day-2 mutation Job audit trail.
//
// Every infrastructure CRUD endpoint on the catalyst-api goes through
// this surface BEFORE writing the Crossplane XRC, so an operator
// browsing /api/v1/deployments/{id}/jobs sees a complete record of
// every Day-2 action with the diff that was applied + the XRC that
// implements it.
//
// # Audit trail shape
//
// Per mutation, the bridge writes:
//
//   - One Job (jobName="mutation-<verb>-<kind>", batchId="day-2-mutations")
//     with deterministic id JobID(deploymentID, jobName).
//   - One Execution scoped to the same Job, in `running` state.
//   - One INFO LogLine "[mutation-request] action=<action> diff=...".
//
// After the XRC is submitted by the caller, AppendXRCSubmittedLog
// stamps a follow-up INFO LogLine "[xrc-submitted] kind=<kind>
// name=<name>" — and Crossplane's Composition controller, when
// reconciling the claim, appends further LogLines via the existing
// helmwatch bridge path (for component-typed claims like Cluster /
// VCluster Claims) or via per-claim status watchers (TBD by the
// third-sibling chart's audit shape).
//
// # Why a separate batch
//
// The bootstrap-kit Phase-1 install Jobs go into batchId="bootstrap-kit".
// Day-2 mutations go into batchId="day-2-mutations" so the FE Jobs
// surface can render them as a separate column. The same Bridge
// instance handles both via UpsertJob's batch field.
package jobs

import (
	"errors"
	"fmt"
	"strings"
	"time"
)

// BatchDay2Mutations — every mutation Job lands in this batch so the
// FE can group them separately from the Phase-1 install Jobs.
const BatchDay2Mutations = "day-2-mutations"

// MutationJobNamePrefix — every mutation Job's name starts with this.
// JobName format: "mutation-<verb>-<kind>" (e.g.
// "mutation-add-region"). The verb is operator-supplied; the kind is
// the XRC kind (RegionClaim → "region").
const MutationJobNamePrefix = "mutation-"

// MutationRecord — the typed payload one CRUD endpoint passes to
// RegisterMutationJob. Free-form Diff is rendered verbatim into the
// first LogLine; a unified-diff format is the convention.
type MutationRecord struct {
	// Verb — short action label (add | remove | update | scale |
	// cordon | drain | replace). Used as the second segment of the
	// JobName.
	Verb string

	// Kind — XRC kind without the "Claim" suffix, lowercase. E.g.
	// RegionClaim → "region", NodePoolClaim → "node-pool".
	Kind string

	// Slug — short identifier of the target resource (region id,
	// pool name, node id). Used to disambiguate concurrent mutations
	// on the same kind.
	Slug string

	// Action — full operator-readable action string for the log line
	// (e.g. "add-region region=hel1 sku=cpx32 workers=2"). Persisted
	// verbatim into the LogLine message.
	Action string

	// Diff — the desired change, in unified-diff or compact ASCII
	// form. Rendered into the first LogLine and onto the XRC's
	// catalyst.openova.io/diff annotation by the caller.
	Diff string

	// XRCKind — the Crossplane Composite Resource Claim kind the
	// CRUD handler will write next (e.g. "RegionClaim"). Stamped
	// into a second LogLine via AppendXRCSubmittedLog after the
	// create() call.
	XRCKind string

	// At — wall-clock instant the mutation request landed. Defaults
	// to time.Now() inside the helper.
	At time.Time
}

// MutationResult — what RegisterMutationJob returns to the caller.
// The CRUD handler funnels these into the 202 response so the FE
// can deep-link to the GitLab-style log viewer.
type MutationResult struct {
	JobID       string
	JobName     string
	ExecutionID string
}

// ErrInvalidMutation — returned when the supplied MutationRecord is
// missing required fields. CRUD handlers map this onto HTTP 500
// (this would be a programmer error since the wizard shape is
// validated before the handler reaches RegisterMutationJob).
var ErrInvalidMutation = errors.New("jobs: invalid mutation record")

// RegisterMutationJob writes the audit-trail Job + Execution +
// initial LogLine for one Day-2 mutation. Caller MUST call this
// BEFORE submitting the XRC; the handler then calls
// AppendXRCSubmittedLog with the create() result so the Job's log
// trail captures both the request and the submission.
//
// The store-level writes are serialised under Store.mu; the bridge's
// own state (b.activeExecID + b.lastState maps) is taken under b.mu
// so concurrent mutation registrations on the same deployment can't
// tear the cursor.
func (b *Bridge) RegisterMutationJob(rec MutationRecord) (MutationResult, error) {
	if b == nil {
		return MutationResult{}, errors.New("jobs: bridge is nil")
	}
	if strings.TrimSpace(rec.Verb) == "" {
		return MutationResult{}, fmt.Errorf("%w: verb is required", ErrInvalidMutation)
	}
	if strings.TrimSpace(rec.Kind) == "" {
		return MutationResult{}, fmt.Errorf("%w: kind is required", ErrInvalidMutation)
	}

	at := rec.At
	if at.IsZero() {
		at = time.Now().UTC()
	} else {
		at = at.UTC()
	}

	jobName := mutationJobName(rec.Verb, rec.Kind, rec.Slug)
	jobID := JobID(b.deploymentID, jobName)

	// Upsert the Job in `running` so the FE table renders the row
	// with a spinner the moment the API returns 202. We don't enter
	// `pending` here because the catalyst-api side has already
	// committed to writing the XRC — pending would be misleading.
	job := Job{
		DeploymentID: b.deploymentID,
		JobName:      jobName,
		AppID:        rec.Kind,
		BatchID:      BatchDay2Mutations,
		DependsOn:    []string{},
		Status:       StatusRunning,
	}
	if err := b.store.UpsertJob(job); err != nil {
		return MutationResult{}, fmt.Errorf("jobs: upsert mutation job: %w", err)
	}

	// Allocate the Execution. StartExecution stamps the Job's
	// LatestExecutionID + StartedAt; we don't have to UpsertJob
	// again post-allocation.
	exec, err := b.store.StartExecution(b.deploymentID, jobName, at)
	if err != nil {
		return MutationResult{}, fmt.Errorf("jobs: start mutation execution: %w", err)
	}

	// Take b.mu to record the active execution cursor.
	b.mu.Lock()
	b.activeExecID[jobName] = exec.ID
	b.lastState[jobName] = StatusRunning
	b.mu.Unlock()

	// Initial LogLine — "[mutation-request] ...".
	msg := "[mutation-request] " + strings.TrimSpace(rec.Action)
	if rec.Diff != "" {
		msg += " diff=" + strings.ReplaceAll(rec.Diff, "\n", " ")
	}
	if err := b.store.AppendLogLines(b.deploymentID, exec.ID, []LogLine{{
		Timestamp: at,
		Level:     LevelInfo,
		Message:   strings.TrimSpace(msg),
	}}); err != nil {
		return MutationResult{}, fmt.Errorf("jobs: append mutation request log: %w", err)
	}

	return MutationResult{
		JobID:       jobID,
		JobName:     jobName,
		ExecutionID: exec.ID,
	}, nil
}

// AppendXRCSubmittedLog appends an INFO LogLine recording the XRC
|
||||
// the catalyst-api just submitted. Called after SubmitXRC succeeds.
|
||||
// The CRUD handler passes the same MutationResult RegisterMutationJob
|
||||
// returned so the helper finds the right Execution.
|
||||
//
|
||||
// When the XRC create errors (e.g. AlreadyExists), the handler
|
||||
// instead calls FinishMutationJob with status=failed; this helper
|
||||
// is for the success path.
|
||||
func (b *Bridge) AppendXRCSubmittedLog(res MutationResult, xrcKind, xrcName, note string) error {
|
||||
if b == nil {
|
||||
return errors.New("jobs: bridge is nil")
|
||||
}
|
||||
if strings.TrimSpace(res.ExecutionID) == "" {
|
||||
return errors.New("jobs: AppendXRCSubmittedLog: ExecutionID is required")
|
||||
}
|
||||
msg := "[xrc-submitted] kind=" + xrcKind + " name=" + xrcName
|
||||
if note != "" {
|
||||
msg += " note=" + note
|
||||
}
|
||||
return b.store.AppendLogLines(b.deploymentID, res.ExecutionID, []LogLine{{
|
||||
Timestamp: time.Now().UTC(),
|
||||
Level: LevelInfo,
|
||||
Message: msg,
|
||||
}})
|
||||
}
|
||||
|
||||
// FinishMutationJob flips the mutation Job into a terminal state
|
||||
// AFTER the XRC submission completes. status MUST be StatusSucceeded
|
||||
// or StatusFailed; the handler decides which based on the
|
||||
// SubmitXRC outcome.
|
||||
//
|
||||
// For a successful submission the Job is technically still "running"
|
||||
// from Crossplane's POV (the Composition is reconciling), but the
|
||||
// API-side audit job is done — Crossplane's own status feed is the
|
||||
// continuing log surface. Treating the API-side job as Succeeded on
|
||||
// "submission accepted" matches the FE expectation: the row turns
|
||||
// green when 202 lands, and Crossplane's downstream reconciliation
|
||||
// surfaces as additional LogLines on the same Execution OR via the
|
||||
// helmwatch bridge (for component claims).
|
||||
func (b *Bridge) FinishMutationJob(res MutationResult, status string, errMsg string) error {
|
||||
if b == nil {
|
||||
return errors.New("jobs: bridge is nil")
|
||||
}
|
||||
if strings.TrimSpace(res.ExecutionID) == "" {
|
||||
return errors.New("jobs: FinishMutationJob: ExecutionID is required")
|
||||
}
|
||||
if status == "" {
|
||||
status = StatusSucceeded
|
||||
}
|
||||
if !IsTerminal(status) {
|
||||
return fmt.Errorf("jobs: FinishMutationJob: status must be terminal, got %q", status)
|
||||
}
|
||||
if errMsg != "" {
|
||||
_ = b.store.AppendLogLines(b.deploymentID, res.ExecutionID, []LogLine{{
|
||||
Timestamp: time.Now().UTC(),
|
||||
Level: LevelError,
|
||||
Message: "[xrc-submission-failed] " + errMsg,
|
||||
}})
|
||||
}
|
||||
if err := b.store.FinishExecution(b.deploymentID, res.ExecutionID, status, time.Now().UTC()); err != nil {
|
||||
return err
|
||||
}
|
||||
b.mu.Lock()
|
||||
delete(b.activeExecID, res.JobName)
|
||||
delete(b.lastState, res.JobName)
|
||||
b.mu.Unlock()
|
||||
return nil
|
||||
}
|
||||
|
||||
// mutationJobName composes the JobName from the request fields. The
|
||||
// shape is "mutation-<verb>-<kind>[-<slug>]" so concurrent mutations
|
||||
// on the same kind don't collide on a single Job row.
|
||||
func mutationJobName(verb, kind, slug string) string {
|
||||
v := strings.ToLower(strings.TrimSpace(verb))
|
||||
k := strings.ToLower(strings.TrimSpace(kind))
|
||||
s := strings.ToLower(strings.TrimSpace(slug))
|
||||
parts := []string{MutationJobNamePrefix + v, k}
|
||||
if s != "" {
|
||||
parts = append(parts, s)
|
||||
}
|
||||
return strings.Join(parts, "-")
|
||||
}
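// The naming scheme above can be exercised in isolation. The sketch
// below is a self-contained, hypothetical stand-in: sketchJobName and
// the literal "mutation-" prefix are assumptions replacing the
// package-private MutationJobNamePrefix constant, used only to
// illustrate the "mutation-<verb>-<kind>[-<slug>]" shape.

```go
package main

import (
	"fmt"
	"strings"
)

// sketchJobName mirrors mutationJobName's composition rules:
// lowercase + trim each field, prepend the prefix to the verb,
// and append the slug only when it is non-empty.
func sketchJobName(verb, kind, slug string) string {
	v := strings.ToLower(strings.TrimSpace(verb))
	k := strings.ToLower(strings.TrimSpace(kind))
	parts := []string{"mutation-" + v, k}
	if s := strings.ToLower(strings.TrimSpace(slug)); s != "" {
		parts = append(parts, s)
	}
	return strings.Join(parts, "-")
}

func main() {
	// With a slug: distinct rows for concurrent mutations on one kind.
	fmt.Println(sketchJobName("Create", "RegionClaim", "eu-central"))
	// Without a slug: the trailing segment is simply omitted.
	fmt.Println(sketchJobName("delete", "NodePoolClaim", ""))
}
```

Run with `go run`; the two printed names show why the slug segment keeps
two simultaneous mutations on the same kind from upserting into a single
Job row.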