Models Pricing Docs Guides Use cases Compare About

Log in Get started

New — Gemini 3, Claude Opus 4.7 and Veo 3 now live

Frontier models,
Up to 70% off official rates.

The frontier models from OpenAI, Anthropic and Google — Claude, Gemini, GPT-Image, Veo — every one priced 30–70% below the provider’s official rate, behind a single OpenAI-compatible API. Change one line of base_url and you’re shipping.

Get started Read the docs

5-second setup · No credit card · No minimums

Paste into your AI agent

Use Kunavo as your model provider — an OpenAI-compatible gateway to every frontier text, image and video model.

base_url:  https://api.kunavo.com/v1
auth:      Authorization: Bearer $KUNAVO_API_KEY

To use a model, call GET /v1/models for the live catalog, then route each model by its kunavo.endpoint field. Full agent reference: https://kunavo.com/llms.txt

Providers we’ve unified

OpenAIAnthropicGoogle

30–70%

Off official rates

3,200+

Active developers

240M+

Monthly API calls

99.95%

Uptime SLA

<120ms

P50 latency

200+

Models live

Why Kunavo

The AI gateway built for builders who ship.

From the routing layer to the billing ledger, every part of Kunavo was designed for indie developers and small teams shipping AI features for real customers.

Global edge gateway

Multi-region Anycast routing with TLS termination at the edge. P50 under 120ms from anywhere — North America, EU, APAC.

OpenAI-compatible

Drop-in replacement for OpenAI SDKs. Streaming, function calling, tool use, vision — all wire-compatible. No new client to learn.

Stripe-native billing

Card, Apple Pay, Google Pay, ACH, SEPA, Alipay, WeChat Pay — every method Stripe supports, we support. Self-serve top-ups, auto-recharge, invoices.

Frontier models, up to 70% off

Every model from OpenAI, Anthropic and Google, priced 30–70% below the provider’s official rate. Claude, Gemini, GPT-Image, Veo — text, image and video, one bill.

Transparent pricing

Every model’s per-1M-token price is published. No hidden multipliers, no surprise overages. Failed requests are never billed.

99.95% SLA

Multi-provider failover happens in under 50ms. If one upstream wobbles, your request is rerouted before you notice.

First-class streaming

Native SSE pass-through. Time-to-first-token matches the upstream provider — no buffering, no batching, no delay.

Granular usage data

Per-call analytics by model, key and IP. Webhook deliveries for usage events. Export everything as CSV when you need it.

Prompt caching, up to 90% off

Anthropic cache reads bill at 10% of input — pass cache_control on your system prompt and long context becomes a near-free re-read. Hit rate and savings are shown live in your dashboard.

−90%

Use cases

What to build with Kunavo.

Browse all use cases

Model catalog

Frontier models, up to 70% off official.

Browse the full catalog

Claude Fable 5

Anthropic's most capable model — frontier reasoning, long-horizon agents, 1M context.

visionfunctionstreamingthinking

$10 / $50per 1M tokens

Claude Opus 5

Near-flagship Opus reasoning at half the price of Fable 5 — vision, agentic coding, 200K context.

visionfunctionstreamingthinking

$5 / $25per 1M tokens

Claude Opus 5 Fast

Opus 5 tuned for latency — same frontier reasoning, faster output, 200K context.

visionfunctionstreamingthinking

$10 / $50per 1M tokens

Claude Opus 4.7

Anthropic's newest Opus — flagship reasoning, vision, 200K context.

visionfunctionstreamingthinking

$5 / $25per 1M tokens

Claude Opus 4.6

Anthropic Opus 4.6 — deep reasoning, exceptional agentic ability.

visionfunctionstreamingthinking

$5 / $25per 1M tokens

Claude Sonnet 5

Near-Opus coding and agentic quality at Sonnet cost — 1M context.

visionfunctionstreamingthinking

$3 / $15per 1M tokens

Claude Sonnet 4.6

Balanced speed/quality — the everyday production workhorse, elite coding.

visionfunctionstreamingthinking

$3 / $15per 1M tokens

Claude Haiku 4.5

Anthropic Haiku 4.5 — fast and cost-efficient.

visionfunctionstreaming

$1 / $5per 1M tokens

Gemini 3.6 Flash

Google's newest Flash — thinking-by-default at Flash latency, 1M context, native audio input.

visionfunctionstreamingthinking

$1.5 / $7.5per 1M tokens

Gemini 2.5 Pro

Previous-gen Gemini Pro — strong reasoning and vision.

visionfunctionstreamingthinking

$1.25 / $10per 1M tokens

Gemini 2.5 Flash

Previous-gen Gemini Flash — extreme value.

visionfunctionstreaming

$0.3 / $2.5per 1M tokens

GPT-5.6 Sol

OpenAI's newest flagship — top-tier reasoning and agentic coding, 1.05M context.

functionstreamingthinkinglong-context

$5 / $30per 1M tokens

GPT-5.4

OpenAI GPT-5.4 — strong general reasoning, 1M context.

functionstreamingthinkinglong-context

$2.5 / $15per 1M tokens

GPT-5.5

OpenAI GPT-5.5 — flagship reasoning, 1M context.

functionstreamingthinkinglong-context

$5 / $30per 1M tokens

GPT-5.4 Mini

OpenAI GPT-5.4 Mini — fast, cost-efficient reasoning.

functionstreamingthinking

$0.75 / $4.5per 1M tokens

GPT-5.3 Codex

OpenAI GPT-5.3 Codex — coding-specialized.

functionstreamingthinkinglong-context

$1.75 / $14per 1M tokens

GPT-5.5 Pro

OpenAI GPT-5.5 Pro — deep-horizon enterprise reasoning, 1.1M context.

functionstreamingthinkinglong-context

$30 / $180per 1M tokens

For AI agents

Point your agent at `llms.txt`
It uses every model itself.

Hand one instruction to Claude Code, Cursor, Cline — or any OpenAI-compatible agent. It reads the live model catalog from Kunavo and drives text, image and video models on its own. No SDK, no glue code.

OpenAI-wire compatible — agents need no custom integration
GET /v1/models is the live catalog — never hardcode model names
One key for every modality: text, image, video, audio

Paste into your AI agent

Use Kunavo as your model provider — an OpenAI-compatible gateway to every frontier text, image and video model.

base_url:  https://api.kunavo.com/v1
auth:      Authorization: Bearer $KUNAVO_API_KEY

To use a model, call GET /v1/models for the live catalog, then route each model by its kunavo.endpoint field. Full agent reference: https://kunavo.com/llms.txt

Top up & save

The more you pre-pay, the more you save.

Pre-paid wallet. $10 starts you up. No subscription, no minimum, balance never expires.

Starter

Just exploring

$10

Access to every model
Per-call usage analytics
Community & email support
No minimum, no credit card

Most popular

Builder

Limited · +$10

Shipping a product

$100

$100 deposit = $110 credit
10 isolated API keys
Auto-recharge · IP allowlist
Priority email support

Scale

Limited · +$250

Running production traffic

$1000

$1000 deposit = $1250 credit
Unlimited API keys
Webhooks · monthly invoices
Dedicated Slack/Discord support

Enterprise

Limited · +$2000

High-volume scale

$5000

$5000 deposit = $7000 credit
Everything in Scale
Custom rate limits & SLA
Dedicated account manager

See the full pricing table

Guides

Start with the popular guides.

Browse all guides

From the blog

Recent deep dives.

FAQ

Everything you’re
wondering about.

Didn’t answer your question? Email us at contact@kunavo.com — we reply within 24 hours.

Kunavo is purpose-built for indie developers and small teams shipping production AI features. Three real differences: (1) we cover text, image and video under one bill — many aggregators are text-only; (2) Stripe-native checkout, ACH, SEPA, Apple Pay, WeChat Pay all included — no off-platform invoices; (3) full transparency on routing — we never silently swap your model to a cheaper one.
Every model is priced at roughly 30–70% below the provider's official list price — and bigger top-ups add a further bonus on top. You also save operationally: one contract, one invoice, one SDK, no commitment minimums. The per-1M-token price for every model is published on /pricing — easy to compare against the upstream listing anytime.
Yes. We implement the full set of OpenAI endpoints: /v1/chat/completions, /v1/embeddings, /v1/images/generations, /v1/models and /v1/video/generations. Streaming, function calling, vision and tool use all behave identically. Projects using the OpenAI SDK migrate by changing base_url — that’s it.
No. Kunavo is a pre-paid wallet. Top-ups stay in your account forever — no subscriptions, no monthly minimums, no expiration. Account closure refunds remaining balance to your original payment method.
Never. 4xx and 5xx responses are not billed. Streaming responses that disconnect mid-flight are billed only for the tokens actually delivered. Every charge is visible per-call in the usage dashboard, exportable as CSV for accounting.
Everything Stripe supports: cards (Visa, Mastercard, Amex, JCB, UnionPay), Apple Pay, Google Pay, Link, ACH, SEPA, BACS, BECS, Alipay, WeChat Pay, Klarna, Afterpay and more. Auto-recharge is opt-in. Enterprise customers can pay by invoice with Net 30 terms.
Edge gateway nodes are deployed across North America, Europe and Asia-Pacific. Stateless routing logic runs at the edge for sub-120ms P50 latency. Billing data, accounts and audit logs are stored in a primary region with multi-region replication.

Three minutes to your first call.

One OpenAI-compatible API for Claude, Gemini, GPT-Image and 200+ more — $5 minimum top-up, pay only for what you call.

Get started Read the quickstart