openova/platform/anthropic-adapter
e3mrah 87d9a4afa7
feat(charts): bp-temporal + bp-llm-gateway + bp-anthropic-adapter wrapper charts (closes #267 #268 #271) (#288)
W2.5.E batch — three Application-tier Blueprints completing the LLM
serving / workflow stack:

- bp-temporal/1.0.0 — wraps temporal/temporal 1.2.0 (the new chart
  rewrite that removed cassandra:/mysql:/postgresql:/elasticsearch:/
  prometheus:/grafana: top-level keys in favour of
  server.config.persistence.datastores). Postgres-only via CNPG-backed
  visibility store (skip Cassandra). Web UI ON. Keycloak OIDC
  integration via --auth-claim-mapper renders auth.yaml ConfigMap
  (operator wires via additionalVolumes once bp-keycloak is
  reconciled, default OFF). dependsOn: bp-cnpg + bp-cert-manager.
  Closes #271.
  Kinds: Cluster (CNPG) + ConfigMap + Deployment + Job + Pod +
  Service.

- bp-llm-gateway/1.0.0 — wraps berriai/litellm-helm 0.1.572 from OCI.
  Subscription-aware proxy for Claude Code: routes to Anthropic (via
  operator OAuth/Max subscription — NEVER an ANTHROPIC_API_KEY,
  per memory/feedback_no_api_key.md), Bedrock, Vertex,
  OpenAI-compatible (via bp-anthropic-adapter), and self-hosted
  vLLM. CNPG-backed audit log (every prompt + response persisted
  for compliance). Bundled bitnami postgresql + redis subcharts
  DISABLED (db.useExisting=true points at the CNPG cluster).
  Keycloak SSO via auth.yaml ConfigMap (default OFF).
  ExternalSecret-backed environmentSecrets brings tokens / IAM
  creds in without inlining plaintext. dependsOn: bp-cnpg +
  bp-keycloak + bp-external-secrets. Closes #267.
  Kinds: Cluster (CNPG audit) + ConfigMap + Deployment + Job +
  Pod + Secret + Service + ServiceAccount.

- bp-anthropic-adapter/1.0.0 — Catalyst-authored scratch chart for
  the OpenAI ↔ Anthropic translation Go service. SHA-pinned image
  ghcr.io/openova-io/openova/anthropic-adapter:<sha> (Inviolable
  Principle #4a — GitHub Actions is the only build path; empty
  default tag fails the render with a clear error instead of
  silently shipping :latest). OAuth/Max subscription token mounted
  from K8s Secret materialized by ESO from bp-openbao —
  ANTHROPIC_OAUTH_TOKEN env var, NEVER an ANTHROPIC_API_KEY.
  Includes OpenAI → Anthropic model-mapping ConfigMap (gpt-4 →
  claude-3-5-sonnet, gpt-4o-mini → claude-3-5-haiku, etc.).
  sigstore/common library subchart included to satisfy the
  hollow-chart gate (matches bp-vllm pattern from #283).
  dependsOn: bp-external-secrets. Closes #268.
  Kinds: ConfigMap + Deployment + Service + ServiceAccount.
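
A hedged sketch of the never-:latest gate (template excerpt; the
image.* value paths are illustrative, not necessarily the shipped
template):

  image: "{{ .Values.image.repository }}:{{ required "image.tag must be a CI-built SHA (never :latest)" .Values.image.tag }}"

helm template fails with that message whenever image.tag is left
empty.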

CRITICAL — bp-llm-gateway and bp-anthropic-adapter both consume the
operator's Claude OAuth/Max subscription. Per memory/
feedback_no_api_key.md and the user's standing instruction, neither
chart accepts or generates an ANTHROPIC_API_KEY. Tokens flow
exclusively through ExternalSecret-managed K8s Secrets that ESO
materializes from bp-openbao at install time.
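
A hedged sketch of that flow (ESO v1beta1 kinds; the store, path,
and secret names are illustrative):

  apiVersion: external-secrets.io/v1beta1
  kind: ExternalSecret
  metadata:
    name: claude-oauth-token
  spec:
    secretStoreRef:
      kind: ClusterSecretStore
      name: openbao
    target:
      name: claude-oauth-token
    data:
      - secretKey: ANTHROPIC_OAUTH_TOKEN
        remoteRef:
          key: llm/claude
          property: oauth_token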

Per docs/BLUEPRINT-AUTHORING.md §11.2 (issue #182): every
observability toggle defaults `false` (ServiceMonitor / metrics
sidecar / PodMonitor) and is operator-tunable via per-cluster
overlay once bp-kube-prometheus-stack reconciles. Each chart ships
tests/observability-toggle.sh covering default-off, opt-in (with
--api-versions monitoring.coreos.com/v1 to simulate the CRDs), and
explicit-off cases. bp-anthropic-adapter additionally tests the
never-:latest gate via Case 4 (empty image tag must fail render).
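
A minimal sketch of the shared test shape (the
metrics.serviceMonitor.enabled value path is illustrative; each
chart's real toggle may differ):

  #!/usr/bin/env bash
  # tests/observability-toggle.sh (sketch)
  set -euo pipefail
  CHART="${1:-.}"

  # Case 1: default off
  if helm template "$CHART" | grep -q 'kind: ServiceMonitor'; then
    echo "FAIL: ServiceMonitor rendered by default"; exit 1
  fi

  # Case 2: opt-in, simulating the Prometheus Operator CRDs
  helm template "$CHART" \
    --set metrics.serviceMonitor.enabled=true \
    --api-versions monitoring.coreos.com/v1 \
    | grep -q 'kind: ServiceMonitor'

  # Case 3: explicit off
  if helm template "$CHART" \
    --set metrics.serviceMonitor.enabled=false \
    | grep -q 'kind: ServiceMonitor'; then
    echo "FAIL: ServiceMonitor rendered when explicitly off"; exit 1
  fi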

Per docs/INVIOLABLE-PRINCIPLES.md #4 (never hardcode): every
upstream version, namespace, server URL, role, secret name, model
default, and toggle is exposed under values.yaml. Cluster overlays
in clusters/<sovereign>/ may override without rebuilding the
Blueprint OCI artifact.

Per docs/BLUEPRINT-AUTHORING.md §11.1 (umbrella shape — hard
contract): bp-temporal and bp-llm-gateway declare their upstream
charts under Chart.yaml dependencies: so helm dependency build
bundles the upstream payload into the OCI artifact. bp-anthropic-
adapter is a scratch chart (no upstream Helm chart exists) and
includes sigstore/common as the obligatory hollow-chart-gate
dependency, matching the bp-vllm precedent from W2.5.D (#283).
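
A hedged Chart.yaml excerpt showing that shape (the repository URL
is the upstream default, quoted here as an assumption):

  # bp-temporal/Chart.yaml (excerpt)
  apiVersion: v2
  name: bp-temporal
  version: 1.0.0
  dependencies:
    - name: temporal
      version: 1.2.0
      repository: https://go.temporal.io/helm-charts

helm dependency build then vendors the upstream .tgz under charts/
before the OCI push.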

Closes #267
Closes #268
Closes #271

helm lint: 1 chart(s) linted, 0 chart(s) failed (per chart; only the INFO-level icon-is-recommended notice)

Co-authored-by: hatiyildiz <hatice.yildiz@openova.io>
2026-04-30 19:37:19 +04:00

Anthropic Adapter

OpenAI-compatible proxy for the Anthropic Claude API. Application Blueprint (see docs/PLATFORM-TECH-STACK.md §4.6). Translates between the OpenAI and Anthropic API formats in both directions: Apps written against the OpenAI SDK can call Anthropic Claude with no code change, and Anthropic clients such as Claude Code can run against OpenAI-compatible backends. Pairs with the LLM Gateway in bp-cortex.

Status: Accepted | Updated: 2026-04-27


Overview

Anthropic Adapter provides a translation layer between the OpenAI and Anthropic Claude API formats. In the deployment shown below it exposes the Anthropic Messages API and forwards translated requests to an OpenAI-compatible backend, which is what enables tools like Claude Code to work with internal models.

flowchart LR
    subgraph Adapter["Anthropic Adapter"]
        Translate[Request Translator]
        Stream[Stream Handler]
    end

    ClaudeCode[Claude Code] -->|Anthropic Format| Adapter
    Adapter -->|Anthropic Format| Claude[Claude API]
    Adapter -->|OpenAI Format| Internal[Internal LLM]

    Claude --> Adapter
    Internal --> Adapter
    Adapter --> ClaudeCode

Why Anthropic Adapter?

Feature             | Benefit
--------------------|--------------------------------------
API translation     | OpenAI ↔ Anthropic format
Claude Code support | Use internal models with Claude Code
Streaming           | Real-time response translation
Model routing       | Route to Claude or internal LLM

Use Cases

Use Case                   | Description
---------------------------|-----------------------------------------
Claude Code + internal LLM | Use Claude Code with self-hosted models
API compatibility          | Anthropic clients on OpenAI backends
Model switching            | Seamless backend switching

Configuration

Deployment

apiVersion: apps/v1
kind: Deployment
metadata:
  name: anthropic-adapter
  namespace: ai-hub
spec:
  replicas: 2
  selector:
    matchLabels:
      app: anthropic-adapter
  template:
    metadata:
      labels:
        app: anthropic-adapter
    spec:
      containers:
        - name: adapter
          image: harbor.<location-code>.<sovereign-domain>/ai-hub/anthropic-adapter:<sha>  # SHA-pinned by CI; never :latest
          ports:
            - containerPort: 8000
          env:
            - name: BACKEND_TYPE
              value: "openai"  # or "anthropic"
            - name: BACKEND_URL
              value: "http://vllm.ai-hub.svc:8000/v1"
            - name: BACKEND_API_KEY
              valueFrom:
                secretKeyRef:
                  name: adapter-secrets
                  key: backend-api-key
            - name: DEFAULT_MODEL
              value: "qwen3-32b"
          resources:
            requests:
              cpu: 100m
              memory: 256Mi
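
The chart also ships the OpenAI → Anthropic model-mapping ConfigMap
described in the commit message above. A hedged sketch (the two
mappings are quoted from the commit; the resource name and key layout
are illustrative):

apiVersion: v1
kind: ConfigMap
metadata:
  name: anthropic-adapter-model-map
  namespace: ai-hub
data:
  gpt-4: claude-3-5-sonnet
  gpt-4o-mini: claude-3-5-haiku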

API Translation

Anthropic → OpenAI

Anthropic                 | OpenAI
--------------------------|------------------------------
messages[].content (list) | messages[].content (string)
max_tokens                | max_tokens
system (top-level)        | messages[0] with role: system
stream: true              | stream: true

Request Translation

# Anthropic format (input)
{
    "model": "claude-3-opus",
    "max_tokens": 4096,
    "system": "You are helpful.",
    "messages": [
        {"role": "user", "content": "Hello"}
    ]
}

# OpenAI format (translated)
{
    "model": "qwen3-32b",
    "max_tokens": 4096,
    "messages": [
        {"role": "system", "content": "You are helpful."},
        {"role": "user", "content": "Hello"}
    ]
}

Claude Code Configuration

# Point Claude Code at the adapter. Per the no-API-key policy above,
# authentication uses a bearer token, never an ANTHROPIC_API_KEY.
export ANTHROPIC_AUTH_TOKEN="your-adapter-token"
export ANTHROPIC_BASE_URL="http://anthropic-adapter.ai-hub.svc:8000"

# Claude Code will now use the internal LLM
claude "Explain this code..."

Implementation

# proxy.py
import json
import os

import httpx
from fastapi import FastAPI, Request
from fastapi.responses import StreamingResponse

app = FastAPI()

# Backend settings come from the Deployment env vars shown above
BACKEND_URL = os.environ["BACKEND_URL"]
DEFAULT_MODEL = os.environ.get("DEFAULT_MODEL", "qwen3-32b")

@app.post("/v1/messages")
async def messages(request: Request):
    body = await request.json()

    # Translate Anthropic → OpenAI format
    openai_body = translate_to_openai(body)

    if body.get("stream"):
        # stream_response opens its own client so the connection
        # outlives this handler (a context manager here would close
        # it before the response body is streamed)
        return StreamingResponse(
            stream_response(openai_body),
            media_type="text/event-stream"
        )

    # Non-streaming: forward once and translate the reply back
    async with httpx.AsyncClient() as client:
        response = await client.post(
            f"{BACKEND_URL}/chat/completions",
            json=openai_body
        )
        return translate_to_anthropic(response.json())


def translate_to_openai(anthropic_body: dict) -> dict:
    messages = []

    # Move system to first message
    if "system" in anthropic_body:
        messages.append({
            "role": "system",
            "content": anthropic_body["system"]
        })

    # Convert message content
    for msg in anthropic_body.get("messages", []):
        content = msg["content"]
        if isinstance(content, list):
            # Flatten content blocks
            content = " ".join(
                block.get("text", "")
                for block in content
                if block.get("type") == "text"
            )
        messages.append({"role": msg["role"], "content": content})

    return {
        "model": DEFAULT_MODEL,
        "messages": messages,
        "max_tokens": anthropic_body.get("max_tokens", 4096),
        "stream": anthropic_body.get("stream", False)
    }
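
translate_to_anthropic is referenced above but not shown. A minimal
sketch, assuming a standard OpenAI chat-completion response and
covering only text content, stop reason, and usage:

def translate_to_anthropic(openai_resp: dict) -> dict:
    # Map an OpenAI chat completion back onto the Anthropic
    # Messages response shape
    choice = openai_resp["choices"][0]
    stop_map = {"stop": "end_turn", "length": "max_tokens"}
    return {
        "id": openai_resp.get("id", ""),
        "type": "message",
        "role": "assistant",
        "model": openai_resp.get("model", DEFAULT_MODEL),
        "content": [
            {"type": "text", "text": choice["message"]["content"] or ""}
        ],
        "stop_reason": stop_map.get(choice.get("finish_reason"), "end_turn"),
        "usage": {
            "input_tokens": openai_resp.get("usage", {}).get("prompt_tokens", 0),
            "output_tokens": openai_resp.get("usage", {}).get("completion_tokens", 0)
        }
    }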

Streaming Translation

async def stream_response(openai_body):
    # The client is created here, not in the handler, so it stays
    # open for the lifetime of the stream
    async with httpx.AsyncClient(timeout=None) as client:
        async with client.stream(
            "POST",
            f"{BACKEND_URL}/chat/completions",
            json=openai_body
        ) as response:
            async for line in response.aiter_lines():
                if not line.startswith("data: "):
                    continue
                payload = line[len("data: "):]
                if payload.strip() == "[DONE]":
                    break  # OpenAI end-of-stream sentinel is not JSON
                data = json.loads(payload)
                # Translate to Anthropic SSE format
                anthropic_event = translate_sse(data)
                yield f"event: content_block_delta\ndata: {json.dumps(anthropic_event)}\n\n"
    yield "event: message_stop\ndata: {}\n\n"

Monitoring

Metric          | Query
----------------|----------------------------------
Request count   | adapter_requests_total
Latency         | adapter_request_duration_seconds
Backend errors  | adapter_backend_errors_total
Stream duration | adapter_stream_duration_seconds
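
A minimal instrumentation sketch for the first two metrics, assuming
prometheus_client and reusing the FastAPI app from proxy.py (the
middleware hook is illustrative):

from prometheus_client import Counter, Histogram, make_asgi_app

REQUESTS = Counter(
    "adapter_requests_total", "Requests handled by the adapter"
)
LATENCY = Histogram(
    "adapter_request_duration_seconds", "End-to-end request latency"
)

# Serve Prometheus metrics from the same process
app.mount("/metrics", make_asgi_app())

@app.middleware("http")
async def observe(request, call_next):
    REQUESTS.inc()
    with LATENCY.time():
        return await call_next(request)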

Consequences

Positive:

  • Claude Code with internal models
  • API format translation
  • Streaming support
  • Easy model switching

Negative:

  • Feature parity limitations
  • Translation overhead
  • Some Anthropic features unsupported

Part of OpenOva