openova/platform/openclaw
e3mrah 61c8d77b58
feat(bp-openclaw): per-tenant Keycloak SSO + NewAPI as OpenAI-compatible LLM gateway (#915) (#917)
Wire bp-openclaw to the per-tenant Keycloak realm (OIDC SSO) and the
per-tenant NewAPI (OpenAI-compatible LLM endpoint, NOT direct OpenAI),
delivering C3 of umbrella epic #915.

Chart changes (bp-openclaw 0.1.0 → 0.2.0):
- Add canonical `oidc.{issuerURL,clientId,clientSecret.{name,key}}` block.
- Add canonical `llm.{baseURL,apiKey.{name,key},defaultModel}` block.
- Controller Deployment now emits OIDC_*, LLM_*, OPENAI_API_{BASE,KEY},
  LLM_DEFAULT_MODEL envs (legacy KEYCLOAK_*/NEWAPI_BASE_URL_DEFAULT
  retained for back-compat with current controller image).
- Per-user pods carry OPENAI_API_BASE / OPENAI_API_KEY / LLM_DEFAULT_MODEL
  alongside the identity-blind NEWAPI_BASE_URL / NEWAPI_KEY (ADR-0003
  §3.3 unchanged).
- Legacy `keycloak.*` / `newapi.*` keys remain accepted as fallbacks;
  helpers prefer canonical blocks but fall back to the legacy alias when
  the canonical block is unset (or still at placeholder).
- assertNoPlaceholders guard updated to check resolved canonical values.
- render-toggles.sh smoke test extended: asserts both canonical and
  legacy code-paths render and that all expected envs reach the
  rendered Deployment.

Orchestrator changes (catalyst-api smeTenantBPOpenClaw template):
- Emit per-tenant `oidc.issuerURL` = https://keycloak.<sub>.<parent>/realms/sme-<sub>
- Emit per-tenant `oidc.clientId` = openclaw, secret from
  openclaw-oidc-client-secret/OIDC_CLIENT_SECRET (rendered by
  bp-keycloak's post-install hook).
- Emit per-tenant `llm.baseURL` = https://api.<sub>.<parent>/v1 (alice's
  own NewAPI ingress, NOT the otech-wide newapi.<otech-fqdn>); apiKey
  from openclaw-newapi-controller-token/NEWAPI_KEY.
- Emit `llm.defaultModel: qwen3.6` — NewAPI uses this to select the
  backing channel; C4 of #915 wires Qwen3.6@BankDhofar at tenant-create.
- Legacy keycloak/newapi blocks still emitted for back-compat with
  bp-openclaw < 0.2.0.

Tests:
- New TestRenderSMETenantOverlay_OpenClawOIDCAndLLMBlocks asserts the
  rendered HelmRelease contains the canonical oidc + llm blocks with
  per-tenant values, and that llm.baseURL is the per-tenant
  api.<sub>.<parent>/v1 (NOT the otech-wide newapi).
- bp-openclaw render-toggles.sh extended (Case 2b/2c).

Co-authored-by: alierenbaysal <alierenbaysal@gmail.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 13:26:59 +04:00
chart feat(bp-openclaw): per-tenant Keycloak SSO + NewAPI as OpenAI-compatible LLM gateway (#915) (#917) 2026-05-05 13:26:59 +04:00
runtime feat(bp-openclaw): workspace controller + per-user pod chart (#803) (#810) 2026-05-04 22:10:24 +04:00
blueprint.yaml feat(bp-openclaw): workspace controller + per-user pod chart (#803) (#810) 2026-05-04 22:10:24 +04:00
README.md feat(bp-openclaw): per-tenant Keycloak SSO + NewAPI as OpenAI-compatible LLM gateway (#915) (#917) 2026-05-05 13:26:59 +04:00

bp-openclaw — workspace controller + per-user pod

Catalyst Blueprint for OpenClaw: a multi-tenant workspace controller deployment plus per-user runtime pods spawned on demand. Implements locked decision [A] of epic #795 and consumes the per-user newapi-key-{user-uuid} Secrets rendered by the unified-rbac user-create hook (ADR-0003).


Architecture

                       openclaw.<sme-domain>            (Traefik ingress + cert-manager TLS)
                              │
                              ▼
              ┌──────────────────────────────────┐
              │  Controller Deployment (HA-able) │
              │  - validates SME Keycloak JWT    │
              │  - spawns / proxies / reaps      │
              │    per-user pods                 │
              └─────┬───────────────────┬────────┘
                    │  K8s API           │  WebSocket / SSE proxy
                    ▼                    ▼
        ┌─────────────────────┐    ┌──────────────────────┐
        │  newapi-key-{uuid}  │    │  Per-user runtime    │
        │  Secret (per ADR-3) │    │  Pod (one per active │
        │  - api-key          │◀───│  session). Reads     │
        │  - base-url         │    │  NEWAPI_BASE_URL +   │
        └─────────────────────┘    │  NEWAPI_KEY env. NO  │
                                   │  Keycloak code, NO   │
                                   │  key-mgmt code.      │
                                   └──────────────────────┘

Why this shape (and not the rejected alternative). A "shared OpenClaw service forwarding the SME-vcluster JWT to NewAPI" was explicitly rejected in #795: it would force NewAPI to trust a Keycloak realm it has no ownership of (cross-realm OIDC trust = identity sprawl). The workspace-controller pattern keeps NewAPI talking only to its own keys (one per SME end-user), and the controller is the only component that ever sees a JWT from the SME's Keycloak realm.

This pattern generalises to bp-opencode / bp-aider / bp-cursor-server later: the chart's structure (controller + per-user pod + identity-blind runtime image) is intentionally reusable.


What this chart contains

| File | Purpose |
| --- | --- |
| Chart.yaml | Metadata; catalyst.openova.io/no-upstream: "true" (no upstream Helm chart) |
| values.yaml | All operator-tunable values; assertions in _helpers.tpl fail render with helpful messages when required values are missing |
| templates/_helpers.tpl | Naming / labels / required-value assertions |
| templates/serviceaccount.yaml | Controller ServiceAccount (release ns) |
| templates/controller-rbac.yaml | Namespaced Role + RoleBinding in tenant ns. create verbs split into separate rules WITHOUT resourceNames per feedback_rbac_create_no_resourcenames.md |
| templates/controller-deployment.yaml | Multi-tenant controller pod |
| templates/controller-service.yaml | ClusterIP Service for the controller |
| templates/controller-ingress.yaml | Public hostname openclaw.<sme-domain> with cert-manager auto-issue |
| templates/per-user-pod-template.yaml | ConfigMap holding the pod-spec the controller renders per session |
| templates/networkpolicy.yaml | Controller NetworkPolicy. The per-user pod's NetworkPolicy is rendered by the controller at session-start (see "Per-user pod NetworkPolicy" below) |
| runtime/ | Source for the per-user runtime container image (separate OCI artifact, built by .github/workflows/openclaw-runtime.yaml) |
| tests/render-toggles.sh | Helm-template integration test exercised by the blueprint-release CI workflow |

Required overlay values

The chart fails to render if any of these are unset (see _helpers.tpl :: assertRequired):

| Value | Example |
| --- | --- |
| oidc.issuerURL | https://keycloak.acme.<parent-domain>/realms/sme-acme |
| oidc.clientId | openclaw |
| oidc.clientSecret.name | openclaw-oidc-client-secret (Secret with key OIDC_CLIENT_SECRET) |
| llm.baseURL | https://api.acme.<parent-domain>/v1 (per-tenant NewAPI OpenAI-compatible endpoint) |
| llm.apiKey.name | openclaw-newapi-controller-token (Secret with key NEWAPI_KEY) |
| llm.defaultModel | qwen3.6 (NewAPI maps this to a backing channel, e.g. Qwen3.6@BankDhofar) |
| tenant.namespace | sme-acme |
| controller.image.tag | SHA-pinned tag (Inviolable Principle 4) |
| perUserPod.image.tag | SHA-pinned tag (Inviolable Principle 4) |
| ingress.host | openclaw.acme.<parent-domain> |
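
The per-tenant examples in the table all follow from a tenant subdomain and a parent domain. The Go sketch below makes that derivation explicit. It is illustrative only: the TenantOverlay type, the overlayFor function and the example.test domain are hypothetical, and the "sme-" namespace prefix is inferred from the sme-acme example rather than from a documented rule.

```go
package main

import "fmt"

// TenantOverlay holds the per-tenant values from the table above.
// The type and its field names are hypothetical, not part of the chart.
type TenantOverlay struct {
	OIDCIssuerURL string // oidc.issuerURL
	LLMBaseURL    string // llm.baseURL
	Namespace     string // tenant.namespace
	IngressHost   string // ingress.host
}

// overlayFor derives the values for tenant subdomain sub under parentDomain,
// following the URL patterns shown in the table's examples.
func overlayFor(sub, parentDomain string) TenantOverlay {
	return TenantOverlay{
		OIDCIssuerURL: fmt.Sprintf("https://keycloak.%s.%s/realms/sme-%s", sub, parentDomain, sub),
		LLMBaseURL:    fmt.Sprintf("https://api.%s.%s/v1", sub, parentDomain),
		Namespace:     "sme-" + sub, // assumption: inferred from the sme-acme example
		IngressHost:   fmt.Sprintf("openclaw.%s.%s", sub, parentDomain),
	}
}

func main() {
	// example.test stands in for <parent-domain>.
	o := overlayFor("acme", "example.test")
	fmt.Println("oidc.issuerURL:  ", o.OIDCIssuerURL)
	fmt.Println("llm.baseURL:     ", o.LLMBaseURL)
	fmt.Println("tenant.namespace:", o.Namespace)
	fmt.Println("ingress.host:    ", o.IngressHost)
}
```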

Legacy keycloak.* / newapi.* keys remain accepted for back-compat (see umbrella epic #915).
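
The chart resolves these values in its Helm helpers: per the 0.2.0 change notes, the canonical block wins and the legacy alias is used only when the canonical value is unset or still a placeholder. The Go sketch below shows that precedence in isolation. It is not the template code, and the "PLACEHOLDER" prefix used to detect a placeholder is an assumption, since the chart's actual marker is not shown here.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// isUnset reports whether v is empty or still a placeholder.
// Assumption: placeholders start with "PLACEHOLDER"; the real chart may
// use a different marker.
func isUnset(v string) bool {
	v = strings.TrimSpace(v)
	return v == "" || strings.HasPrefix(v, "PLACEHOLDER")
}

// resolve prefers the canonical value (e.g. llm.baseURL) and falls back to
// the legacy alias (e.g. a newapi.* key) only when the canonical one is unset.
func resolve(canonical, legacy string) (string, error) {
	if !isUnset(canonical) {
		return canonical, nil
	}
	if !isUnset(legacy) {
		return legacy, nil
	}
	return "", errors.New("neither canonical nor legacy value is set")
}

func main() {
	// Canonical set: the legacy alias is ignored.
	v, _ := resolve("https://api.acme.example.test/v1", "https://legacy.example.test/v1")
	fmt.Println(v)

	// Canonical still a placeholder: fall back to the legacy alias.
	v, _ = resolve("PLACEHOLDER_LLM_BASE_URL", "https://legacy.example.test/v1")
	fmt.Println(v)

	// Nothing usable: this is where a render-time assertion would fail.
	_, err := resolve("", "")
	fmt.Println(err)
}
```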


Runtime image contract

Per locked decision [A] of #795, the per-user runtime image reads only two env vars:

| Env | Source |
| --- | --- |
| NEWAPI_BASE_URL | secretKeyRef: name=newapi-key-{uuid}, key=base-url |
| NEWAPI_KEY | secretKeyRef: name=newapi-key-{uuid}, key=api-key |

It carries no Keycloak code, no key-management code, no knowledge of the SME tenant model. Identity-blind by construction.
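
On the controller side, the contract amounts to two env entries that point at the user's newapi-key-{uuid} Secret. The Go sketch below builds those entries and prints them as JSON in the shape of a Kubernetes container env list. The struct types are local stand-ins defined only for this example (a real controller would more likely use the types from k8s.io/api/core/v1), and runtimeEnv is a hypothetical helper name.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Local stand-ins for the Kubernetes EnvVar shape, limited to the fields
// the contract needs.
type SecretKeyRef struct {
	Name string `json:"name"`
	Key  string `json:"key"`
}

type ValueFrom struct {
	SecretKeyRef SecretKeyRef `json:"secretKeyRef"`
}

type EnvVar struct {
	Name      string    `json:"name"`
	ValueFrom ValueFrom `json:"valueFrom"`
}

// runtimeEnv returns the two env entries of the runtime image contract for
// one user, both sourced from that user's newapi-key-{uuid} Secret.
func runtimeEnv(userUUID string) []EnvVar {
	secret := "newapi-key-" + userUUID
	return []EnvVar{
		{Name: "NEWAPI_BASE_URL", ValueFrom: ValueFrom{SecretKeyRef: SecretKeyRef{Name: secret, Key: "base-url"}}},
		{Name: "NEWAPI_KEY", ValueFrom: ValueFrom{SecretKeyRef: SecretKeyRef{Name: secret, Key: "api-key"}}},
	}
}

func main() {
	// The all-zero UUID is a placeholder user id for the example.
	out, err := json.MarshalIndent(runtimeEnv("00000000-0000-0000-0000-000000000000"), "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```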

The runtime/ directory in this Blueprint ships a minimal reference implementation that satisfies the contract: a Go binary exposing an HTTP /healthz and a /v1/chat/completions proxy that forwards to NewAPI with the injected key. Operators may override perUserPod.image.repository to point at any other image satisfying the same env-vars contract — e.g. a fork of upstream OpenClaw, an OpenAI-compatible coding-CLI image, etc.
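
For orientation, a handler that satisfies this contract can be quite small. The sketch below is not the code in runtime/; it is one possible shape using only the Go standard library (the Rewrite hook needs Go 1.20 or newer). It assumes the base URL already ends in /v1, as the llm.baseURL example does, and that NewAPI accepts the key as a Bearer token in the usual OpenAI-compatible way. newRuntimeHandler and demo are hypothetical names, and demo runs against a fake upstream rather than a real NewAPI.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"net/http/httputil"
	"net/url"
)

// newRuntimeHandler serves /healthz locally and proxies /v1/* to baseURL,
// replacing any caller-supplied Authorization header with the injected key.
func newRuntimeHandler(baseURL *url.URL, apiKey string) http.Handler {
	proxy := &httputil.ReverseProxy{
		Rewrite: func(r *httputil.ProxyRequest) {
			r.SetURL(baseURL) // joins baseURL's path with the request path
			r.Out.Header.Set("Authorization", "Bearer "+apiKey)
		},
	}
	mux := http.NewServeMux()
	mux.HandleFunc("/healthz", func(w http.ResponseWriter, _ *http.Request) {
		w.WriteHeader(http.StatusOK)
	})
	// Strip the local /v1 prefix because baseURL already carries /v1.
	mux.Handle("/v1/", http.StripPrefix("/v1", proxy))
	return mux
}

// demo sends one request through the handler to a fake upstream and returns
// the path and Authorization header the upstream observed.
func demo() (path, auth string, err error) {
	upstream := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		path, auth = r.URL.Path, r.Header.Get("Authorization")
		io.WriteString(w, "{}")
	}))
	base, err := url.Parse(upstream.URL + "/v1")
	if err != nil {
		upstream.Close()
		return "", "", err
	}
	front := httptest.NewServer(newRuntimeHandler(base, "test-key"))

	req, err := http.NewRequest(http.MethodPost, front.URL+"/v1/chat/completions", nil)
	if err == nil {
		req.Header.Set("Authorization", "Bearer caller-supplied")
		var resp *http.Response
		if resp, err = http.DefaultClient.Do(req); err == nil {
			resp.Body.Close()
		}
	}
	// Close waits for in-flight requests, so path and auth are settled here.
	front.Close()
	upstream.Close()
	return path, auth, err
}

func main() {
	path, auth, err := demo()
	if err != nil {
		panic(err)
	}
	fmt.Println("upstream path:", path)
	fmt.Println("upstream auth:", auth)
}
```

Overwriting the Authorization header rather than forwarding it keeps the runtime identity-blind: whatever credential the caller presents, only the injected NewAPI key leaves the pod.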

Why a stub instead of the upstream OpenClaw

Upstream OpenClaw (https://openclaw.ai) targets messaging-platform integration (WhatsApp / Telegram / Slack) — a different shape from the "per-user agentic workspace" that #795 calls for. There's no upstream Helm chart, and the upstream container expects a wholly different configuration surface (no NEWAPI_BASE_URL env). Rather than fork the upstream and graft a NewAPI driver onto it (significant work, owned by a different ticket), this Blueprint ships a contract-minimal runtime: identity-blind, two env vars, OpenAI-compatible proxy. Future work can swap the runtime image without changing this chart.


RBAC posture

The controller's Role lives in the tenant namespace (where the per-user pods and Secrets live), not the release namespace. The RoleBinding subject is the controller's ServiceAccount in the release namespace.

create verbs are split into their own rules with no resourceNames. This is mandatory: Kubernetes RBAC cannot restrict create requests by resourceNames, because the name of the new object may not be known at authorization time, so a rule that combines the two does not grant create. Label-based ownership (catalyst.openova.io/openclaw-user) is enforced at the controller, not in RBAC. See feedback_rbac_create_no_resourcenames.md for the full incident report.
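
The split-rule convention is easy to check mechanically. The Go sketch below flags any rule that pairs create (or the wildcard verb) with resourceNames. PolicyRule here is a local stand-in for the Kubernetes RBAC rule shape, checkCreateRules is a hypothetical helper, and the resource name in the example is made up; none of this is taken from the chart's templates or tests.

```go
package main

import "fmt"

// PolicyRule is a local stand-in with only the RBAC rule fields relevant here.
type PolicyRule struct {
	Resources     []string
	Verbs         []string
	ResourceNames []string
}

// grantsCreate reports whether the rule's verbs include create or "*".
func grantsCreate(r PolicyRule) bool {
	for _, v := range r.Verbs {
		if v == "create" || v == "*" {
			return true
		}
	}
	return false
}

// checkCreateRules returns an error for the first rule that combines a
// create-granting verb with resourceNames.
func checkCreateRules(rules []PolicyRule) error {
	for i, r := range rules {
		if grantsCreate(r) && len(r.ResourceNames) > 0 {
			return fmt.Errorf("rule %d: create combined with resourceNames %v", i, r.ResourceNames)
		}
	}
	return nil
}

func main() {
	// Split form: create stands alone, name-scoped verbs live in another rule.
	split := []PolicyRule{
		{Resources: []string{"pods"}, Verbs: []string{"create"}},
		{Resources: []string{"pods"}, Verbs: []string{"get", "delete"}, ResourceNames: []string{"example-session-pod"}},
	}
	fmt.Println("split rules:", checkCreateRules(split))

	// Combined form: the kind of rule the convention forbids.
	combined := []PolicyRule{
		{Resources: []string{"pods"}, Verbs: []string{"create", "get"}, ResourceNames: []string{"example-session-pod"}},
	}
	fmt.Println("combined rule:", checkCreateRules(combined))
}
```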


Per-user pod NetworkPolicy

This chart's networkpolicy.yaml covers the controller pod only. Each per-user pod gets its own NetworkPolicy applied by the controller at session-start, restricting egress to:

  • NewAPI (operator's customer-facing hostname or in-cluster Service)
  • DNS (kube-system :53)

The per-user pod NetworkPolicy is rendered by the controller from a template baked into its container image; it cannot be a static chart template because the egress target (specifically the NewAPI hostname) is read from the per-user newapi-key-{uuid} Secret, which doesn't exist at chart-render time.
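
Because the egress target comes from the Secret's base-url value, the controller has to extract a host and port from that URL before it can render anything. The Go sketch below shows that one step with net/url. egressTarget is a hypothetical helper and both URLs are made-up examples; how the controller then turns the target into NetworkPolicy peers is not covered here.

```go
package main

import (
	"fmt"
	"net/url"
)

// egressTarget extracts the host and port a per-user pod must reach from the
// base-url value of its newapi-key-{uuid} Secret. When the URL has no
// explicit port, the scheme's default port is used.
func egressTarget(baseURL string) (host, port string, err error) {
	u, err := url.Parse(baseURL)
	if err != nil {
		return "", "", fmt.Errorf("parse base-url: %w", err)
	}
	host = u.Hostname()
	if host == "" {
		return "", "", fmt.Errorf("base-url %q has no host", baseURL)
	}
	port = u.Port()
	if port == "" {
		switch u.Scheme {
		case "https":
			port = "443"
		case "http":
			port = "80"
		default:
			return "", "", fmt.Errorf("base-url %q: unsupported scheme %q", baseURL, u.Scheme)
		}
	}
	return host, port, nil
}

func main() {
	// Both URLs are illustrative: a customer-facing hostname and an
	// in-cluster Service address.
	for _, raw := range []string{
		"https://api.acme.example.test/v1",
		"http://newapi.newapi.svc.cluster.local:3000/v1",
	} {
		host, port, err := egressTarget(raw)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%s -> allow egress to %s on port %s\n", raw, host, port)
	}
}
```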


Build + publish

The chart is built and published by the existing event-driven CI workflow .github/workflows/blueprint-release.yaml whenever platform/openclaw/chart/** changes on main. Output:

oci://ghcr.io/openova-io/bp-openclaw:0.2.0

The runtime image is built by a sister workflow, .github/workflows/openclaw-runtime.yaml, on push to platform/openclaw/runtime/**. Output:

ghcr.io/openova-io/openova/openclaw-runtime:<sha>

Per Inviolable Principle 1, both workflows are event-driven; neither uses a schedule (cron) trigger. workflow_dispatch is provided for re-runs only.


  • Epic #795 — SME-tenant turnkey experience
  • ADR-0003 — RBAC ↔ NewAPI user-create hook
  • bp-newapi (platform/newapi/) — Sovereign-level metered LLM gateway
  • bp-keycloak (platform/keycloak/) — SME-vcluster realm