# OpenOva Axon

OpenAI-compatible API gateway backed by Claude. Drop-in replacement for any OpenAI SDK client.

Base URL: `https://api.openova.io/axon/v1`
## Quick Start

```bash
curl https://api.openova.io/axon/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-sonnet-4-6",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
```
## SDK Integration

### Python

```bash
pip install openai
```

```python
from openai import OpenAI

client = OpenAI(
    base_url="https://api.openova.io/axon/v1",
    api_key="YOUR_API_KEY",
)

response = client.chat.completions.create(
    model="claude-sonnet-4-6",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)
```
### Node.js

```bash
npm install openai
```

```javascript
import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://api.openova.io/axon/v1",
  apiKey: "YOUR_API_KEY",
});

const response = await client.chat.completions.create({
  model: "claude-sonnet-4-6",
  messages: [{ role: "user", content: "Hello!" }],
});
console.log(response.choices[0].message.content);
```
## Environment Variables

Works with any tool that reads `OPENAI_BASE_URL` and `OPENAI_API_KEY`:

```bash
export OPENAI_BASE_URL="https://api.openova.io/axon/v1"
export OPENAI_API_KEY="YOUR_API_KEY"
```
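With the variables exported, an SDK client needs no explicit configuration. A minimal Python sketch (the `openai` package reads `OPENAI_BASE_URL` and `OPENAI_API_KEY` by default):

```python
from openai import OpenAI

# No arguments: the SDK picks up OPENAI_BASE_URL and OPENAI_API_KEY
# from the environment, so requests go to Axon automatically.
client = OpenAI()

response = client.chat.completions.create(
    model="claude-sonnet-4-6",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)
```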
## Models

| Model ID | Alias (OpenAI) | Description |
|---|---|---|
| `claude-opus-4-6` | `gpt-4` | Most capable |
| `claude-sonnet-4-6` | `gpt-4o`, `gpt-4-turbo` | Balanced speed and quality (default) |
| `claude-haiku-4-5` | `gpt-3.5-turbo`, `gpt-4o-mini` | Fastest |
OpenAI model names are automatically mapped to their Claude equivalents.

```bash
# List available models
curl https://api.openova.io/axon/v1/models \
  -H "Authorization: Bearer YOUR_API_KEY"
```
## API Reference

### POST /v1/chat/completions

OpenAI-compatible chat completion endpoint.

Request:

```json
{
  "model": "claude-sonnet-4-6",
  "messages": [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is Kubernetes?"}
  ],
  "stream": false,
  "temperature": 0.7,
  "max_tokens": 1024
}
```

Response:

```json
{
  "id": "chatcmpl-...",
  "object": "chat.completion",
  "model": "claude-sonnet-4-6",
  "choices": [{
    "index": 0,
    "message": {"role": "assistant", "content": "..."},
    "finish_reason": "stop"
  }],
  "usage": {"prompt_tokens": 0, "completion_tokens": 0, "total_tokens": 0},
  "conversation_id": "conv-..."
}
```
#### Streaming

Set `"stream": true` to receive Server-Sent Events:

```bash
curl -N https://api.openova.io/axon/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-sonnet-4-6",
    "stream": true,
    "messages": [{"role": "user", "content": "Count to 5."}]
  }'
```
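The same stream can be consumed through the OpenAI SDK. A minimal Python sketch:

```python
from openai import OpenAI

client = OpenAI(
    base_url="https://api.openova.io/axon/v1",
    api_key="YOUR_API_KEY",
)

# stream=True yields incremental chunks instead of one final message.
stream = client.chat.completions.create(
    model="claude-sonnet-4-6",
    messages=[{"role": "user", "content": "Count to 5."}],
    stream=True,
)

for chunk in stream:
    # Print each content delta as it arrives; some chunks carry no text.
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
print()
```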
#### Conversations

Every response includes a `conversation_id`. Pass it back to continue the conversation:

```bash
# Start a conversation
CONV_ID=$(curl -s https://api.openova.io/axon/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"claude-sonnet-4-6","messages":[{"role":"user","content":"Remember: the answer is 42."}]}' \
  | jq -r '.conversation_id')

# Continue it
curl https://api.openova.io/axon/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d "{\"model\":\"claude-sonnet-4-6\",\"conversation_id\":\"$CONV_ID\",\"messages\":[{\"role\":\"user\",\"content\":\"What is the answer?\"}]}"
```
### GET /v1/models

List available models. Requires authentication.

### GET /health

Health check. No authentication required. Returns `{"status": "ok"}`.

### GET /stats

Pool and conversation metrics. No authentication required.
## Supported Parameters

| Parameter | Type | Description |
|---|---|---|
| `model` | string | Model ID or OpenAI alias |
| `messages` | array | Chat messages (role + content) |
| `stream` | boolean | Enable SSE streaming (default: `false`) |
| `temperature` | number | Sampling temperature |
| `max_tokens` | number | Maximum response length |
| `top_p` | number | Nucleus sampling |
| `stop` | string/array | Stop sequences |
| `conversation_id` | string | Continue an existing conversation |
| `response_format` | object | `{"type": "json_object"}` for JSON output |
| `stream_options` | object | `{"include_usage": true}` for usage in stream |
## Authentication

All `/v1/*` endpoints require a Bearer token:

```
Authorization: Bearer YOUR_API_KEY
```

Unauthenticated requests return `401`.
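With the OpenAI SDKs, a 401 from the gateway surfaces as the client's standard authentication error. A minimal Python sketch:

```python
import openai
from openai import OpenAI

client = OpenAI(
    base_url="https://api.openova.io/axon/v1",
    api_key="WRONG_KEY",
)

try:
    client.chat.completions.create(
        model="claude-sonnet-4-6",
        messages=[{"role": "user", "content": "Hello!"}],
    )
except openai.AuthenticationError as err:
    # The gateway rejected the bearer token with 401; fix or rotate the key.
    print(f"Authentication failed: {err}")
```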
## Architecture

```mermaid
flowchart LR
    Client[OpenAI SDK Client] -->|HTTPS| Ingress[Traefik Ingress]
    Ingress -->|StripPrefix /axon| Axon[Axon Gateway :3000]
    Axon -->|Session Pool| Claude[Claude API]
    Axon -->|Conversations| Valkey[(Valkey :6379)]
```

Axon maintains a pre-warmed pool of Claude Agent SDK sessions for low-latency responses. Conversations are stored in Valkey with a 7-day TTL.
## Self-Hosting

Axon ships as a Helm chart at `products/axon/chart/`. See `chart/values.yaml` for configuration.

```bash
# Requires: Claude subscription credentials, K8s cluster with Traefik
helm install axon ./products/axon/chart \
  --namespace axon --create-namespace \
  --set image.tag=latest \
  --set ingress.host=your-domain.com
```

Required secrets:

- `axon-secrets`: `AXON_API_KEYS` (comma-separated bearer tokens)
- `axon-claude-auth`: `.credentials.json` from a Claude subscription
- `ghcr-pull-secret`: Docker registry credentials for GHCR
Part of OpenOva