feat(openova-flow): server (HTTP+SSE event router) + flux adapter (K8s informer sidecar) (#1390)

Agent #2 of 3 for OpenovaFlow. Ships the Go backend independently of
Agent #1's TS packages (@openova/flow-core + @openova/flow-canvas);
the FlowMessage JSON contract is locked between agents.

Two Go modules (separate go.mod each so the dep graphs stay decoupled):

- products/openova-flow/server/ — stateless HTTP+SSE event router.
  Map<flowId, RingBuffer<FlowMessage>>, in-memory, no DB. Endpoints:
  POST /v1/flows/{flowId}/events, GET /v1/flows/{flowId}/snapshot,
  GET /v1/flows/{flowId}/stream (SSE with 15s heartbeats + Last-Event-ID
  seq stamping), DELETE /v1/flows/{flowId}, GET /healthz, /readyz.
  Zero external Go deps (stdlib net/http). Ring cap default 4096
  (env-overridable). Locked schema validation rejects unknown envelope
  variants with 400.

- products/openova-flow/adapter-flux/ — DaemonSet sidecar that watches
  helm.toolkit.fluxcd.io/v2.HelmRelease + HelmChart CRs via
  client-go's dynamicinformer.NewFilteredDynamicSharedInformerFactory
  (canonical seam: products/catalyst/bootstrap/api/internal/k8scache/factory.go),
  maps each event to FlowMessage via a pure-transform mapper, POSTs to
  the configured openova-flow-server with exponential-backoff retry.
  Status mapping: Ready=True → succeeded, InstallFailed/UpgradeFailed/
  RetriesExhausted → failed, Progressing/Unknown/other-False → running,
  no Ready yet → pending. FlowNode.id format "{REGION_KEY}/{hrName}"
  so multi-region renders correctly. Region-aware: synthetic region
  parent FlowNode emitted on bootstrap; dependsOn entries fan-out to
  finish-to-start relationships.
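
The server's in-memory Map<flowId, RingBuffer<FlowMessage>> with seq stamping can be pictured roughly as below. This is a minimal sketch, not the shipped code: the FlowMessage fields, the Store type, and the Append/Since method names are hypothetical, and the shift-on-evict approach is chosen for brevity rather than speed.

```go
package main

import "sync"

// FlowMessage is a hypothetical stand-in for the locked envelope; only the
// fields this sketch needs are shown.
type FlowMessage struct {
	Seq     uint64
	Payload string
}

// ring holds the most recent messages for one flow.
type ring struct {
	buf  []FlowMessage
	next uint64 // seq assigned to the next appended message
}

// Store maps flowId to its ring; all state lives in memory.
type Store struct {
	mu    sync.Mutex
	limit int
	flows map[string]*ring
}

// NewStore returns a store whose rings keep at most limit messages each.
func NewStore(limit int) *Store {
	if limit < 1 {
		limit = 1
	}
	return &Store{limit: limit, flows: make(map[string]*ring)}
}

// Append stamps the message with the flow's next seq and evicts the oldest
// entry once the ring is full.
func (s *Store) Append(flowID, payload string) uint64 {
	s.mu.Lock()
	defer s.mu.Unlock()
	r, ok := s.flows[flowID]
	if !ok {
		r = &ring{next: 1}
		s.flows[flowID] = r
	}
	m := FlowMessage{Seq: r.next, Payload: payload}
	r.next++
	if len(r.buf) == s.limit {
		copy(r.buf, r.buf[1:]) // shift left, dropping the oldest message
		r.buf[len(r.buf)-1] = m
	} else {
		r.buf = append(r.buf, m)
	}
	return m.Seq
}

// Since returns buffered messages with Seq greater than lastSeq, which is
// what an SSE replay keyed on Last-Event-ID would need.
func (s *Store) Since(flowID string, lastSeq uint64) []FlowMessage {
	s.mu.Lock()
	defer s.mu.Unlock()
	r, ok := s.flows[flowID]
	if !ok {
		return nil
	}
	var out []FlowMessage
	for _, m := range r.buf {
		if m.Seq > lastSeq {
			out = append(out, m)
		}
	}
	return out
}
```

With a limit of 2, appending three messages leaves seq 2 and 3 in the ring, so a client reconnecting with Last-Event-ID 2 would replay only seq 3.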

Two wrapper charts under platform/openova-flow-{server,emitter}/chart/
(canonical seam: platform/qa-app/chart/ for the simple
Deployment+Service+SA shape; platform/k8s-ws-proxy/chart/ for the
DaemonSet+ClusterRole+ClusterRoleBinding shape). MIRROR-EVERYTHING:
image refs go through harbor.openova.io/proxy-ghcr/openova-io/...
Image tag + required runtime config fail-fast at chart render via
_helpers.tpl so silent ImagePullBackOff / boot crash is impossible.

Two bootstrap-kit HRs added (slots 56 + 57):
- 56-bp-openova-flow-server (dependsOn: bp-cilium, bp-cert-manager) —
  installs on primary cluster only; Cilium Gateway HTTPRoute at
  openova-flow.<sovereignFQDN> for cross-cluster ingest.
- 57-bp-openova-flow-emitter (dependsOn: bp-flux) — DaemonSet, runs
  on every cluster (mother + Sovereign + every secondary region).

scripts/expected-bootstrap-deps.yaml updated; check-bootstrap-deps.sh
audit passes (drift=0, cycles=0).

Tests (all green):
- server contract_test.go — every FlowMessage variant round-trips JSON,
  unknown/malformed variants reject. Cross-flow Triggerer/ToFlowID
  preserved.
- server server_test.go — full HTTP surface, including SSE replay+tail
  with a real httptest.Server.
- adapter mapper_test.go — every HelmRelease.status.conditions[Ready]
  transition + multi-dependsOn fan-out + family-label/heuristic + region
  fallback.

Verification done locally:
- (cd products/openova-flow/server && go build ./... && go test ./...) — PASS
- (cd products/openova-flow/adapter-flux && go build ./... && go test ./...) — PASS
- helm template platform/openova-flow-server/chart/ — renders cleanly
- helm template platform/openova-flow-emitter/chart/ — renders cleanly
- bash scripts/check-bootstrap-deps.sh — PASS (drift=0)

Agent #3 follow-ups (called out in slot 57's HelmRelease comments):
- Thread SOVEREIGN_DEPLOYMENT_ID + REGION_KEY into the
  postBuild.substitute env in infra/hetzner/cloudinit-control-plane.tftpl
  so the emitter's flowId/regionKey become per-deployment + per-region
  automatically. Today the slot uses SOVEREIGN_FQDN as the flowId
  fallback and "primary" as the regionKey default; per-Sovereign overlays
  can override pre-Agent-#3.
- catalyst-api proxy at /sovereign/api/v1/flows/{id}/stream so the
  Sovereign Console canvas hits a single in-tree origin.

Co-authored-by: e3mrah <1234567+e3mrah@users.noreply.github.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

2026-05-11 15:36:54 +04:00

3.8 KiB


openova-flow-adapter-flux

Kubernetes informer sidecar that watches Flux HelmRelease + HelmChart CRs in the local cluster, maps each state change into a FlowMessage envelope, and POSTs to a configured openova-flow-server.

This is Agent #2's region-aware emitter — runs as a DaemonSet on every cluster (mother + every Sovereign + every secondary). The receiving server is on the primary cluster only; cross-cluster reachability comes from Cilium Gateway over public HTTPS (no NetBird required for v1).
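
The commit notes say the adapter POSTs with exponential-backoff retry. A minimal sketch of that pattern follows, under stated assumptions: the function names, the 500ms base, the 30s ceiling, and the rule that 4xx responses are not retried are all illustrative and not taken from the adapter source. The caller would typically set the client's Timeout from POST_TIMEOUT.

```go
package main

import (
	"bytes"
	"context"
	"fmt"
	"net/http"
	"time"
)

// backoffDelay doubles base once per completed attempt and clamps the
// result to ceiling.
func backoffDelay(base time.Duration, attempt int, ceiling time.Duration) time.Duration {
	d := base
	for i := 0; i < attempt; i++ {
		d *= 2
		if d >= ceiling {
			return ceiling
		}
	}
	if d > ceiling {
		return ceiling
	}
	return d
}

// postWithBackoff POSTs body to url, retrying transport errors and 5xx
// responses. Other non-2xx responses are treated as permanent rejections
// (an assumption of this sketch).
func postWithBackoff(ctx context.Context, c *http.Client, url string, body []byte, attempts int) error {
	var lastErr error
	for i := 0; i < attempts; i++ {
		if i > 0 {
			select {
			case <-ctx.Done():
				return ctx.Err()
			case <-time.After(backoffDelay(500*time.Millisecond, i-1, 30*time.Second)):
			}
		}
		req, err := http.NewRequestWithContext(ctx, http.MethodPost, url, bytes.NewReader(body))
		if err != nil {
			return err
		}
		req.Header.Set("Content-Type", "application/json")
		resp, err := c.Do(req)
		if err != nil {
			lastErr = err
			continue
		}
		resp.Body.Close()
		switch {
		case resp.StatusCode < 300:
			return nil
		case resp.StatusCode < 500:
			return fmt.Errorf("rejected: %s", resp.Status)
		}
		lastErr = fmt.Errorf("server error: %s", resp.Status)
	}
	return fmt.Errorf("giving up after %d attempts: %w", attempts, lastErr)
}
```

Keeping the delay calculation in its own pure function makes the retry schedule testable without a network.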

Status mapping

Flux's HelmRelease.status.conditions[type=Ready] folds into the FlowNode.status palette:

| Ready.status | Ready.reason | FlowNode.status |
|---|---|---|
| True | any | `succeeded` |
| False | InstallFailed | `failed` |
| False | UpgradeFailed | `failed` |
| False | RetriesExhausted | `failed` |
| False | Progressing | `running` |
| False | (other) | `running` |
| Unknown | any | `running` |
| (no Ready) | (n/a) | `pending` |
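
The table above reads naturally as a pure function. The sketch below is one way to write it; the Condition struct and the nodeStatus name are hypothetical and only mirror the three condition fields the table depends on.

```go
package main

// Condition mirrors the fields of a Kubernetes status condition that the
// mapping table uses (hypothetical type for this sketch).
type Condition struct {
	Type   string
	Status string
	Reason string
}

// nodeStatus folds the Ready condition into the FlowNode.status palette.
func nodeStatus(conds []Condition) string {
	for _, c := range conds {
		if c.Type != "Ready" {
			continue
		}
		switch c.Status {
		case "True":
			return "succeeded"
		case "False":
			switch c.Reason {
			case "InstallFailed", "UpgradeFailed", "RetriesExhausted":
				return "failed"
			}
			// Progressing and any other False reason count as running.
			return "running"
		default:
			// Unknown, whatever the reason, is still in flight.
			return "running"
		}
	}
	// No Ready condition has been reported yet.
	return "pending"
}
```

Only the three terminal failure reasons map to failed; every other non-True state stays running, so the canvas does not flag a release as broken while Flux is still retrying.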

Identity & topology

- FlowNode.id = {REGION_KEY}/{hr.metadata.name}: region-aware, so multi-region renders correctly when N adapter sidecars (one per cluster) all post to the same flowId.
- FlowNode.family: read from metadata.labels[catalyst.openova.io/family] when present; otherwise derived heuristically from the HelmRelease name with the bp- prefix stripped (so bp-cert-manager → cert-manager).
- FlowNode.region = REGION_KEY env.
- Synthetic region node: on startup the adapter emits a FlowNode whose id equals REGION_KEY (label "fsn1", "hel1", etc.) so the canvas has a stable container parent. Each HR then emits a contains relationship from the region node to the HR's own node.
- DependsOn relationships: one per entry in hr.spec.dependsOn[], type finish-to-start, condition on-success.
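
The identity rules above can be sketched as three small pure functions. The Relationship struct and its field names are hypothetical, and the sketch assumes that each dependsOn entry refers to a HelmRelease in the same region and that the edge runs from the dependency to the dependent; the source does not spell out either point.

```go
package main

import "strings"

const familyLabel = "catalyst.openova.io/family"

// Relationship is a hypothetical shape for an edge between two FlowNodes.
type Relationship struct {
	From      string
	To        string
	Type      string
	Condition string
}

// nodeID builds the region-aware FlowNode.id.
func nodeID(regionKey, hrName string) string {
	return regionKey + "/" + hrName
}

// family prefers the explicit label and falls back to stripping "bp-".
func family(hrName string, labels map[string]string) string {
	if f, ok := labels[familyLabel]; ok {
		return f
	}
	return strings.TrimPrefix(hrName, "bp-")
}

// relationships returns the contains edge from the region node plus one
// finish-to-start edge per dependsOn entry.
func relationships(regionKey, hrName string, dependsOn []string) []Relationship {
	self := nodeID(regionKey, hrName)
	rels := []Relationship{{From: regionKey, To: self, Type: "contains"}}
	for _, dep := range dependsOn {
		rels = append(rels, Relationship{
			From:      nodeID(regionKey, dep), // assumption: same-region dependency
			To:        self,
			Type:      "finish-to-start",
			Condition: "on-success",
		})
	}
	return rels
}
```

For a HelmRelease named bp-openova-flow-server in region fsn1 that depends on bp-cilium and bp-cert-manager, this yields one contains edge from fsn1 and two finish-to-start edges.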

Env

| Name | Default | Required | Purpose |
|---|---|---|---|
| `FLOW_SERVER_URL` | (none) | yes | Base URL of openova-flow-server. |
| `FLOW_ID` | (none) | yes | Runtime FlowInstance id this adapter binds to. |
| `REGION_KEY` | (none) | yes | Region id ("fsn1", "hel1", etc.). |
| `NAMESPACE_FILTER` | `flux-system` | no | Informer namespace scope. |
| `EMIT_INTERVAL` | `200ms` | no | Min delay between (id,status) duplicates. |
| `POST_TIMEOUT` | `10s` | no | Per-POST wall clock cap. |
| `HEALTH_LISTEN_ADDR` | `:8081` | no | Liveness/readiness listener. |

Per docs/INVIOLABLE-PRINCIPLES.md #4 every operational knob is env-driven; per #4a the container image is built by GitHub Actions and pulled through harbor.
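
As an illustration of env-driven configuration, the sketch below loads the variables from the Env table with their documented defaults and fails fast when a required one is missing. The Config struct and the function names are hypothetical; only the variable names and defaults come from the table.

```go
package main

import (
	"fmt"
	"os"
	"time"
)

// Config is a hypothetical holder for the adapter's env-driven settings.
type Config struct {
	FlowServerURL    string
	FlowID           string
	RegionKey        string
	NamespaceFilter  string
	EmitInterval     time.Duration
	PostTimeout      time.Duration
	HealthListenAddr string
}

func orDefault(v, def string) string {
	if v == "" {
		return def
	}
	return v
}

// loadConfig reads settings through getenv so it can be exercised with a
// fake environment.
func loadConfig(getenv func(string) string) (Config, error) {
	cfg := Config{
		NamespaceFilter:  orDefault(getenv("NAMESPACE_FILTER"), "flux-system"),
		HealthListenAddr: orDefault(getenv("HEALTH_LISTEN_ADDR"), ":8081"),
	}
	required := []struct {
		name string
		dst  *string
	}{
		{"FLOW_SERVER_URL", &cfg.FlowServerURL},
		{"FLOW_ID", &cfg.FlowID},
		{"REGION_KEY", &cfg.RegionKey},
	}
	for _, r := range required {
		if *r.dst = getenv(r.name); *r.dst == "" {
			return Config{}, fmt.Errorf("%s is required", r.name)
		}
	}
	var err error
	if cfg.EmitInterval, err = time.ParseDuration(orDefault(getenv("EMIT_INTERVAL"), "200ms")); err != nil {
		return Config{}, fmt.Errorf("EMIT_INTERVAL: %w", err)
	}
	if cfg.PostTimeout, err = time.ParseDuration(orDefault(getenv("POST_TIMEOUT"), "10s")); err != nil {
		return Config{}, fmt.Errorf("POST_TIMEOUT: %w", err)
	}
	return cfg, nil
}

// LoadFromEnv reads the real process environment.
func LoadFromEnv() (Config, error) {
	return loadConfig(os.Getenv)
}
```

Passing the lookup function in, rather than calling os.Getenv directly, keeps the defaulting and validation logic testable with a plain map.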

Build

cd products/openova-flow/adapter-flux
go build ./...
go test ./...

CI image: harbor.openova.io/proxy-ghcr/openova-io/openova/openova-flow-adapter-flux:<sha>.

Canonical patterns this code follows

- Informer factory: products/catalyst/bootstrap/api/internal/k8scache/factory.go is the seam for dynamicinformer.NewDynamicSharedInformerFactory + per-resource event handler dispatch.
- Chart layout: platform/qa-app/chart/ is the seam for a simple in-house Deployment+Service+ServiceAccount chart (the mirror chart for openova-flow-server). The DaemonSet+ClusterRole+ClusterRoleBinding shape (mirror chart for this adapter) follows platform/k8s-ws-proxy/chart/.

Tests

test/mapper_test.go — every Ready condition state + every dependsOn fan-out + every family-label / heuristic path is asserted.
