openova/platform/openclaw
e3mrah d6dedb1ecd
fix(bp-openclaw): use placeholder defaults so blueprint-release smoke render passes (#803) (#813)
The blueprint-release CI workflow runs `helm template <chart>` with
default values as a smoke gate (.github/workflows/blueprint-release.yaml
SMOKE step). The original chart shipped empty-string defaults for every
required value (keycloak.realmURL, tenant.namespace, etc.) and used
`required` / `fail` to abort render — which is correct fail-fast
behaviour for real installs but wrongly fails CI's default-values
smoke step. Result: bp-openclaw 0.1.0 never published to GHCR (run
25335221500 fail).

Match the bp-self-sovereign-cutover pattern (PR #791): provide
placeholder defaults that let smoke render produce valid YAML, gated
behind a new `assertNoPlaceholders` toggle that per-cluster Flux
overlays MUST set to `true`. With the toggle ON, _helpers.tpl ::
assertNoPlaceholders fails render with a clear message identifying any
placeholder still in place.

Changes:
- values.yaml: add placeholder defaults for keycloak.realmURL,
  keycloak.clientSecretName, newapi.baseURL, tenant.namespace,
  ingress.host, controller.image.tag, perUserPod.image.tag.
  Add `assertNoPlaceholders: false` flag (overlays set true).
- _helpers.tpl: replace assertRequired with assertNoPlaceholders —
  same intent, runs only when the toggle is on, so smoke render passes
  while real installs still get fail-fast on bad overlays.
- serviceaccount.yaml: invoke assertNoPlaceholders instead of assertRequired.
- controller-deployment.yaml + controller-ingress.yaml: drop the
  `required` calls (defaults are now valid bytes; the
  assertNoPlaceholders helper enforces real values at install time).
- tests/render-toggles.sh: rewrite Case 1 (now expects success) and
  Case 2 (asserts assertNoPlaceholders=true fails on placeholders) +
  Case 2b (assertNoPlaceholders=true with real values succeeds).
  All 7 gates pass locally.

Output (post-merge): chart published to
oci://ghcr.io/openova-io/bp-openclaw:0.1.0.

Co-authored-by: hatiyildiz <hatice.yildiz@openova.io>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 22:17:43 +04:00
chart fix(bp-openclaw): use placeholder defaults so blueprint-release smoke render passes (#803) (#813) 2026-05-04 22:17:43 +04:00
runtime feat(bp-openclaw): workspace controller + per-user pod chart (#803) (#810) 2026-05-04 22:10:24 +04:00
blueprint.yaml feat(bp-openclaw): workspace controller + per-user pod chart (#803) (#810) 2026-05-04 22:10:24 +04:00
README.md feat(bp-openclaw): workspace controller + per-user pod chart (#803) (#810) 2026-05-04 22:10:24 +04:00

bp-openclaw — workspace controller + per-user pod

Catalyst Blueprint for OpenClaw: a multi-tenant workspace controller deployment plus per-user runtime pods spawned on demand. Implements locked decision [A] of epic #795 and consumes the per-user newapi-key-{user-uuid} Secrets rendered by the unified-rbac user-create hook (ADR-0003).


Architecture

                       openclaw.<sme-domain>            (Traefik ingress + cert-manager TLS)
                              │
                              ▼
              ┌──────────────────────────────────┐
              │  Controller Deployment (HA-able) │
              │  - validates SME Keycloak JWT    │
              │  - spawns / proxies / reaps      │
              │    per-user pods                 │
              └─────┬───────────────────┬────────┘
                    │  K8s API           │  WebSocket / SSE proxy
                    ▼                    ▼
        ┌─────────────────────┐    ┌──────────────────────┐
        │  newapi-key-{uuid}  │    │  Per-user runtime    │
        │  Secret (per ADR-3) │    │  Pod (one per active │
        │  - api-key          │◀───│  session). Reads     │
        │  - base-url         │    │  NEWAPI_BASE_URL +   │
        └─────────────────────┘    │  NEWAPI_KEY env. NO  │
                                   │  Keycloak code, NO   │
                                   │  key-mgmt code.      │
                                   └──────────────────────┘

Why this shape (and not the rejected alternative). A "shared OpenClaw service forwarding the SME-vcluster JWT to NewAPI" was explicitly rejected in #795: it would force NewAPI to trust a Keycloak realm it has no ownership of (cross-realm OIDC trust = identity sprawl). The workspace-controller pattern keeps NewAPI talking only to its own keys (one per SME end-user), and the controller is the only component that ever sees a JWT from the SME's Keycloak realm.

This pattern generalises to bp-opencode / bp-aider / bp-cursor-server later: the chart's structure (controller + per-user pod + identity-blind runtime image) is intentionally reusable.


What this chart contains

| File | Purpose |
| --- | --- |
| Chart.yaml | Metadata; catalyst.openova.io/no-upstream: "true" (no upstream Helm chart) |
| values.yaml | All operator-tunable values, shipped with placeholder defaults; when assertNoPlaceholders is true, _helpers.tpl fails render with a message identifying any placeholder still in place |
| templates/_helpers.tpl | Naming / labels / placeholder assertions (assertNoPlaceholders) |
| templates/serviceaccount.yaml | Controller ServiceAccount (release ns) |
| templates/controller-rbac.yaml | Namespaced Role + RoleBinding in tenant ns. create verbs split into separate rules WITHOUT resourceNames per feedback_rbac_create_no_resourcenames.md |
| templates/controller-deployment.yaml | Multi-tenant controller pod |
| templates/controller-service.yaml | ClusterIP Service for the controller |
| templates/controller-ingress.yaml | Public hostname openclaw.<sme-domain> with cert-manager auto-issue |
| templates/per-user-pod-template.yaml | ConfigMap holding the pod-spec the controller renders per session |
| templates/networkpolicy.yaml | Controller NetworkPolicy. The per-user pod's NetworkPolicy is rendered by the controller at session-start (see "Per-user pod NetworkPolicy" below) |
| runtime/ | Source for the per-user runtime container image (separate OCI artifact, built by .github/workflows/openclaw-runtime.yaml) |
| tests/render-toggles.sh | Helm-template integration test exercised by the blueprint-release CI workflow |

Required overlay values

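With assertNoPlaceholders: true, which per-cluster Flux overlays must set, the chart fails to render if any of these still holds its placeholder default (see _helpers.tpl :: assertNoPlaceholders):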

| Value | Example |
| --- | --- |
| keycloak.realmURL | https://keycloak.acme.<otech-fqdn>/realms/acme |
| keycloak.clientSecretName | openclaw-oidc (ExternalSecret with key OIDC_CLIENT_SECRET) |
| tenant.namespace | sme-acme |
| newapi.baseURL | https://newapi.<otech-fqdn> |
| controller.image.tag | SHA-pinned tag (Inviolable Principle 4) |
| perUserPod.image.tag | SHA-pinned tag (Inviolable Principle 4) |
| ingress.host | openclaw.acme.<otech-fqdn> |
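
As an illustration, a per-cluster overlay supplying these values might look like the fragment below. This is a sketch, not a copy of the chart's values.yaml: the key nesting is inferred from the dotted value names, otech.example is a made-up domain standing in for <otech-fqdn>, and the sha-0123abc tags are hypothetical stand-ins for real SHA-pinned tags.

```yaml
# Hypothetical per-cluster overlay values (illustrative sketch only).
assertNoPlaceholders: true          # overlays must enable this so leftover placeholders fail render
keycloak:
  realmURL: https://keycloak.acme.otech.example/realms/acme
  clientSecretName: openclaw-oidc   # ExternalSecret with key OIDC_CLIENT_SECRET
tenant:
  namespace: sme-acme
newapi:
  baseURL: https://newapi.otech.example
controller:
  image:
    tag: sha-0123abc                # assumed tag format; substitute the real SHA-pinned tag
perUserPod:
  image:
    tag: sha-0123abc                # assumed tag format; substitute the real SHA-pinned tag
ingress:
  host: openclaw.acme.otech.example
```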

Runtime image contract

Per locked decision [A] of #795, the per-user runtime image reads only two env vars:

| Env | Source |
| --- | --- |
| NEWAPI_BASE_URL | secretKeyRef: name=newapi-key-{uuid}, key=base-url |
| NEWAPI_KEY | secretKeyRef: name=newapi-key-{uuid}, key=api-key |

It carries no Keycloak code, no key-management code, no knowledge of the SME tenant model. Identity-blind by construction.
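
In pod-spec terms, the two-variable contract above could be wired roughly as follows. This is a minimal sketch: the container name is hypothetical, and <uuid> is a stand-in for the user's UUID that the controller would substitute when it renders the pod.

```yaml
# Sketch of the env wiring implied by the runtime contract.
# "runtime" is a hypothetical container name; <uuid> must be replaced per user.
containers:
  - name: runtime
    env:
      - name: NEWAPI_BASE_URL
        valueFrom:
          secretKeyRef:
            name: newapi-key-<uuid>
            key: base-url
      - name: NEWAPI_KEY
        valueFrom:
          secretKeyRef:
            name: newapi-key-<uuid>
            key: api-key
```

Because both values come from the same per-user Secret, the image itself needs no tenant- or realm-specific configuration.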

The runtime/ directory in this Blueprint ships a minimal reference implementation that satisfies the contract: a Go binary exposing an HTTP /healthz and a /v1/chat/completions proxy that forwards to NewAPI with the injected key. Operators may override perUserPod.image.repository to point at any other image satisfying the same env-vars contract — e.g. a fork of upstream OpenClaw, an OpenAI-compatible coding-CLI image, etc.

Why a stub instead of the upstream OpenClaw

Upstream OpenClaw (https://openclaw.ai) targets messaging-platform integration (WhatsApp / Telegram / Slack) — a different shape from the "per-user agentic workspace" that #795 calls for. There's no upstream Helm chart, and the upstream container expects a wholly different configuration surface (no NEWAPI_BASE_URL env). Rather than fork the upstream and graft a NewAPI driver onto it (significant work, owned by a different ticket), this Blueprint ships a contract-minimal runtime: identity-blind, two env vars, OpenAI-compatible proxy. Future work can swap the runtime image without changing this chart.


RBAC posture

The controller's Role lives in the tenant namespace (where the per-user pods and Secrets live), not the release namespace. The RoleBinding subject is the controller's ServiceAccount in the release namespace.

create verbs are split into their own rules with no resourceNames. This is mandatory: Kubernetes RBAC cannot restrict create requests by resource name, because the name of the new object may not be known at authorization time, so a rule that pairs create with resourceNames does not grant the create. Label-based ownership (catalyst.openova.io/openclaw-user) is enforced at the controller, not in RBAC. See feedback_rbac_create_no_resourcenames.md for the full incident report.
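
The posture described above can be sketched as a Role and RoleBinding pair. The object names, the openclaw release namespace, and the exact resource and verb lists are assumptions for illustration; the chart's controller-rbac.yaml is the authoritative version.

```yaml
# Illustrative sketch only: names, namespaces, and verb lists are assumed.
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: openclaw-controller          # hypothetical name
  namespace: sme-acme                # tenant namespace (example value)
rules:
  # create lives in its own rule and carries no resourceNames.
  - apiGroups: [""]
    resources: ["pods"]
    verbs: ["create"]
  # Remaining pod verbs are kept in a separate rule.
  - apiGroups: [""]
    resources: ["pods"]
    verbs: ["get", "list", "watch", "delete"]
  # Read access to the per-user Secrets.
  - apiGroups: [""]
    resources: ["secrets"]
    verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: openclaw-controller
  namespace: sme-acme                # binding lives next to the Role, in the tenant ns
subjects:
  - kind: ServiceAccount
    name: openclaw-controller
    namespace: openclaw              # release namespace (hypothetical)
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: openclaw-controller
```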


Per-user pod NetworkPolicy

This chart's networkpolicy.yaml covers the controller pod only. Each per-user pod gets its own NetworkPolicy applied by the controller at session-start, restricting egress to:

  • NewAPI (operator's customer-facing hostname or in-cluster Service)
  • DNS (kube-system :53)

The per-user pod NetworkPolicy is rendered by the controller from a template baked into its container image; it cannot be a static chart template because the egress target (specifically the NewAPI hostname) is read from the per-user newapi-key-{uuid} Secret, which doesn't exist at chart-render time.
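
A rendered per-user policy might resemble the sketch below. It covers only the in-cluster Service case, since a standard NetworkPolicy selects peers by pod, namespace, or IP block rather than by hostname. The policy name, the label value, and the newapi namespace and port are assumptions; <uuid> is a stand-in the controller would replace.

```yaml
# Illustrative sketch; names, label value, NewAPI namespace, and port are assumed.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: openclaw-user-<uuid>         # hypothetical naming scheme
  namespace: sme-acme                # tenant namespace (example value)
spec:
  podSelector:
    matchLabels:
      catalyst.openova.io/openclaw-user: "<uuid>"
  policyTypes: ["Egress"]
  egress:
    # DNS in kube-system on port 53 (UDP and TCP).
    - to:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: kube-system
      ports:
        - protocol: UDP
          port: 53
        - protocol: TCP
          port: 53
    # NewAPI reached as an in-cluster Service (assumed namespace and port).
    - to:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: newapi
      ports:
        - protocol: TCP
          port: 443
```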


Build + publish

The chart is built and published by the existing event-driven CI workflow .github/workflows/blueprint-release.yaml whenever platform/openclaw/chart/** changes on main. Output:

oci://ghcr.io/openova-io/bp-openclaw:0.1.0

The runtime image is built by a sister workflow, .github/workflows/openclaw-runtime.yaml, on push to platform/openclaw/runtime/**. Output:

ghcr.io/openova-io/openova/openclaw-runtime:<sha>

Per Inviolable Principle 1, both workflows are event-driven; neither uses schedule: cron. workflow_dispatch is provided for re-runs only.
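
For reference, an event-driven trigger block of the kind described here could look like the following. It is a sketch based on the path and branch named above, not a copy of blueprint-release.yaml, whose actual path filters may be broader.

```yaml
# Sketch of event-driven triggers; no schedule/cron entry is present.
on:
  push:
    branches: [main]
    paths:
      - "platform/openclaw/chart/**"
  workflow_dispatch: {}              # manual re-runs only
```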


  • Epic #795 — SME-tenant turnkey experience
  • ADR-0003 — RBAC ↔ NewAPI user-create hook
  • bp-newapi (platform/newapi/) — Sovereign-level metered LLM gateway
  • bp-keycloak (platform/keycloak/) — SME-vcluster realm