e3mrah a2685dd158 feat: add request tracing spans to chat completion path Traces: convLookup, formatPrompt, acquire, send, firstMsg, stream, release, convStore — logged per request for profiling. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>		2026-03-04 11:20:54 +01:00
chart	chore: increase Axon resource limits for single-node overprovisioning	2026-03-04 11:08:10 +01:00
scripts	feat: OpenOva Axon — stateless SDK, Valkey state store, 100% OpenAI-compatible API	2026-02-28 18:36:26 +04:00
src	feat: add request tracing spans to chat completion path	2026-03-04 11:20:54 +01:00
.env.example	feat: OpenOva Axon — stateless SDK, Valkey state store, 100% OpenAI-compatible API	2026-02-28 18:36:26 +04:00
Containerfile	fix: use fixed UID 1001 for axon user in container	2026-03-04 09:39:09 +01:00
package-lock.json	feat: OpenOva Axon — stateless SDK, Valkey state store, 100% OpenAI-compatible API	2026-02-28 18:36:26 +04:00
package.json	feat: OpenOva Axon — stateless SDK, Valkey state store, 100% OpenAI-compatible API	2026-02-28 18:36:26 +04:00
README.md	feat: restructure platform to 52 components and 9 products	2026-02-26 21:00:19 +00:00
tsconfig.json	feat: OpenOva Axon — stateless SDK, Valkey state store, 100% OpenAI-compatible API	2026-02-28 18:36:26 +04:00

README.md

OpenOva Axon

SaaS LLM inference gateway connecting to OpenOva Cortex.

Status: Accepted | Updated: 2026-02-26

Overview

OpenOva Axon is a hosted inference gateway that provides managed access to LLM capabilities. It acts as the neural link between customer applications and OpenOva Cortex (the self-hosted AI hub), offering subscription-based LLM access so that customers do not need to deploy their own GPU infrastructure.

flowchart LR
    subgraph Customer["Customer Environment"]
        App[Applications]
        Claude[Claude Code]
    end

    subgraph Axon["OpenOva Axon (SaaS)"]
        Gateway[API Gateway]
        Auth[Authentication]
        Metering[Usage Metering]
    end

    subgraph Cortex["OpenOva Cortex (Self-Hosted)"]
        KServe[KServe]
        vLLM[vLLM]
    end

    Customer --> Axon
    Axon --> Cortex

Key Features

OpenAI-compatible API endpoint
Subscription-based access with usage metering
Automatic routing to optimal model instances
Rate limiting and quota management
Claude Code integration via ANTHROPIC_BASE_URL
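
The rate limiting and quota feature implies that clients should expect throttled requests. The sketch below shows one way a client might choose a retry delay. It assumes the gateway signals throttling with HTTP 429 and may send a Retry-After header in seconds; this README does not confirm either, and the helper name retryDelayMs is hypothetical.

```typescript
// Hypothetical client-side helper, not part of Axon itself.
// Assumption: throttled requests come back as HTTP 429, optionally with a
// Retry-After header whose value is a number of seconds.
function retryDelayMs(
  status: number,
  retryAfter: string | null,
  attempt: number,
): number | null {
  // Anything other than 429 is not treated as rate limiting here.
  if (status !== 429) return null;

  // Prefer the server-provided delay when it parses as a non-negative number.
  const hasHeader = retryAfter !== null && retryAfter.trim() !== "";
  const seconds = hasHeader ? Number(retryAfter) : NaN;
  if (Number.isFinite(seconds) && seconds >= 0) return seconds * 1000;

  // Otherwise fall back to exponential backoff: 500 ms, 1 s, 2 s, ...
  // capped at 30 s so a long outage does not produce unbounded waits.
  return Math.min(500 * 2 ** attempt, 30000);
}
```

A caller would sleep for the returned number of milliseconds before retrying, and stop retrying when the function returns null or a maximum attempt count is reached.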

Relationship to Cortex

Aspect	Cortex	Axon
Deployment	Self-hosted (customer cluster)	SaaS (OpenOva hosted)
GPU	Customer provides	OpenOva provides
Use case	Full AI platform	LLM inference only
Control	Full customization	Managed service

Usage

# Configure Claude Code with Axon
export ANTHROPIC_BASE_URL="https://axon.openova.io/v1"
export ANTHROPIC_API_KEY="your-subscription-token"

# Use Claude Code normally
claude "Explain this code..."
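
Applications other than Claude Code could call the gateway directly through its OpenAI-compatible API. The following is a minimal sketch of how such a request might be assembled. It assumes the OpenAI convention of a chat/completions path under the /v1 base URL and Bearer-token authentication with the subscription token; the helper buildChatRequest and the model name are hypothetical and are not taken from the Axon source.

```typescript
// Message shape used by OpenAI-style chat completion requests.
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

interface BuiltRequest {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

// Hypothetical helper: builds the URL and fetch options for one request.
// Assumption: the gateway accepts POST <base>/chat/completions with a
// Bearer token, as the OpenAI API does.
function buildChatRequest(
  baseUrl: string,
  token: string,
  model: string,
  messages: ChatMessage[],
): BuiltRequest {
  // Strip trailing slashes so the path is joined with exactly one "/".
  const url = baseUrl.replace(/\/+$/, "") + "/chat/completions";
  return {
    url,
    init: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${token}`,
      },
      body: JSON.stringify({ model, messages }),
    },
  };
}

// Example use (needs network access and a valid token):
//   const { url, init } = buildChatRequest(
//     "https://axon.openova.io/v1",
//     "your-subscription-token",
//     "example-model", // placeholder, not a real model name
//     [{ role: "user", content: "Explain this code..." }],
//   );
//   const response = await fetch(url, init);
```

Keeping request construction separate from the network call makes the payload easy to inspect and test without contacting the service.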

Deployment

Axon is a SaaS offering operated by OpenOva; no customer deployment is required.

For self-hosted inference, deploy OpenOva Cortex instead.

Part of OpenOva