Frontier models,
30% off official rates.
The frontier models from OpenAI, Anthropic and Google — Claude, Gemini, GPT-Image, Veo — every one priced 30% below the provider’s official rate, behind a single OpenAI-compatible API. Change one line of base_url and you’re shipping.
Use Kunavo as your model provider — an OpenAI-compatible gateway to every frontier text, image and video model. base_url: https://api.kunavo.com/v1 auth: Authorization: Bearer $KUNAVO_API_KEY To use a model, call GET /v1/models for the live catalog, then route each model by its kunavo.endpoint field. Full agent reference: https://kunavo.com/llms.txt
Providers we’ve unified
The AI gateway built for builders who ship.
From the routing layer to the billing ledger, every part of Kunavo was designed for indie developers and small teams shipping AI features for real customers.
Global edge gateway
Multi-region Anycast routing with TLS termination at the edge. P50 under 120ms from anywhere — North America, EU, APAC.
OpenAI-compatible
Drop-in replacement for OpenAI SDKs. Streaming, function calling, tool use, vision — all wire-compatible. No new client to learn.
Stripe-native billing
Card, Apple Pay, Google Pay, ACH, SEPA, Alipay, WeChat Pay — every method Stripe supports, we support. Self-serve top-ups, auto-recharge, invoices.
Frontier models, 30% off
Every model from OpenAI, Anthropic and Google, priced 30% below the provider’s official rate. Claude, Gemini, GPT-Image, Veo — text, image and video, one bill.
Transparent pricing
Every model’s per-1M-token price is published. No hidden multipliers, no surprise overages. Failed requests are never billed.
99.95% SLA
Multi-provider failover happens in under 50ms. If one upstream wobbles, your request is rerouted before you notice.
First-class streaming
Native SSE pass-through. Time-to-first-token matches the upstream provider — no buffering, no batching, no delay.
Granular usage data
Per-call analytics by model, key and IP. Webhook deliveries for usage events. Export everything as CSV when you need it.
Prompt caching, up to 90% off
Anthropic cache reads bill at 10% of input — pass cache_control on your system prompt and long context becomes a near-free re-read. Hit rate and savings are shown live in your dashboard.
What to build with Kunavo.
- Customer Support
AI customer support
The fastest-ROI AI deployment in any B2C SaaS — automate ticket triage, draft 80% of responses, and escalate the rest cleanly. Production code, real cost numbers, and the compliance pitfalls that catch teams off-guard.
Explore - Knowledge Base
RAG chatbot API
Most internal knowledge bases are dead documentation — nobody finds anything. A Claude-backed RAG chatbot turns them into a real assistant that cites sources and refuses when it doesn't know. Here's the production pattern.
Explore - Trust & Safety
AI content moderation
Modern moderation isn't just regex — it's nuance: sarcasm, dog whistles, brand-context misuse, image+text combinations. LLMs do this far better than rule-based systems, at a price that scales.
Explore - Developer Tools
AI code assistant
Cursor, Aider, Cline, Continue.dev — they're all powered by the same handful of frontier LLMs. If you're building a coding tool (or a co-pilot inside your own dev product), here's the architecture and the cost reality.
Explore - Data Processing
AI data extraction
The boring, valuable use case. Invoices, receipts, contracts, leads, resumes — anywhere you'd previously have built a parser, an LLM with JSON-mode does it in 30 lines, more accurately, and you can ship in a day instead of a quarter.
Explore
Frontier models, 30% off official.
Claude Opus 4.7
Anthropic's newest Opus — flagship reasoning, vision, 200K context.
Claude Opus 4.6
Anthropic Opus 4.6 — deep reasoning, exceptional agentic ability.
Claude Sonnet 4.6
Balanced speed/quality — the everyday production workhorse, elite coding.
Claude Sonnet 4.5
Anthropic Sonnet 4.5 — production workhorse.
Claude Haiku 4.5
Anthropic Haiku 4.5 — fast and cost-efficient.
Gemini 3 Pro
Google's flagship — native multimodal, 1M+ context, chain-of-thought.
Gemini 3.1 Pro
Latest Gemini 3.1 Pro — incremental quality bump.
Gemini 3 Flash
Cost-efficient Gemini — millisecond responses for high-frequency calls.
Gemini 2.5 Pro
Previous-gen Gemini Pro — strong reasoning and vision.
Gemini 2.5 Flash
Previous-gen Gemini Flash — extreme value.
Point your agent at llms.txt
It uses every model itself.
Hand one instruction to Claude Code, Cursor, Cline — or any OpenAI-compatible agent. It reads the live model catalog from Kunavo and drives text, image and video models on its own. No SDK, no glue code.
- OpenAI-wire compatible — agents need no custom integration
- GET /v1/models is the live catalog — never hardcode model names
- One key for every modality: text, image, video, audio
Use Kunavo as your model provider — an OpenAI-compatible gateway to every frontier text, image and video model. base_url: https://api.kunavo.com/v1 auth: Authorization: Bearer $KUNAVO_API_KEY To use a model, call GET /v1/models for the live catalog, then route each model by its kunavo.endpoint field. Full agent reference: https://kunavo.com/llms.txt
The more you pre-pay, the more you save.
Pre-paid wallet. $10 starts you up. No subscription, no minimum, balance never expires.
Starter
Just exploring
- Access to every model
- Per-call usage analytics
- Community & email support
- No minimum, no credit card
Builder
Limited · +$10Shipping a product
- $100 deposit = $110 credit
- 10 isolated API keys
- Auto-recharge · IP allowlist
- Priority email support
Scale
Limited · +$250Running production traffic
- $1000 deposit = $1250 credit
- Unlimited API keys
- Webhooks · monthly invoices
- Dedicated Slack/Discord support
Enterprise
Limited · +$2000High-volume scale
- $5000 deposit = $7000 credit
- Everything in Scale
- Custom rate limits & SLA
- Dedicated account manager
Recent deep dives.
- 实战·5 min
在中国稳定调用 Claude / GPT / Gemini — Kunavo 中国友好路由实测
实测从北京/上海/深圳调用 Kunavo 的延迟和成功率:无需代理,P50 80-150ms,成功率 99.9%+。支付宝/微信支付/双币卡都可用。健壮重试代码 + 何时真的需要代理。
- 教學·6 min
用 Veo 3 與 Sora 為台灣品牌做短影片廣告 — 5 分鐘完整教學
從文字 prompt 到 9:16 直式 Reels / TikTok / IG 短影片,全流程教學。圖生影片用既有產品照片動起來。5 種台灣品牌實際應用、繁中文字渲染注意事項、商業可用品質的進階技巧。
- 実装ガイド·8 min
日本語 RAG チャットボットを Claude で構築 — 5,000 文書のナレッジベースを 30 行で
社内ドキュメント 5,000 件を Claude Sonnet 4.6 で検索可能にする RAG 完全実装。埋め込みコスト $0.25、1 クエリ約 0.9 円(prompt caching 適用後)。日本語特有のトークン消費・ハルシネーション対策・本番投入チェックリスト含む。
Everything you’re
wondering about.
Didn’t answer your question? Email us at contact@kunavo.com — we reply within 24 hours.
Kunavo is purpose-built for indie developers and small teams shipping production AI features. Three real differences: (1) we cover text, image and video under one bill — many aggregators are text-only; (2) Stripe-native checkout, ACH, SEPA, Apple Pay, WeChat Pay all included — no off-platform invoices; (3) full transparency on routing — we never silently swap your model to a cheaper one.
Every model is priced at roughly 30% below the provider's official list price — and bigger top-ups add a further bonus on top. You also save operationally: one contract, one invoice, one SDK, $2 starting credit, no commitment minimums. The per-1M-token price for every model is published on /pricing — easy to compare against the upstream listing anytime.
Yes. We implement the full set of OpenAI endpoints: /v1/chat/completions, /v1/embeddings, /v1/images/generations, /v1/models and /v1/video/generations. Streaming, function calling, vision and tool use all behave identically. Projects using the OpenAI SDK migrate by changing base_url — that’s it.
No. Kunavo is a pre-paid wallet. Top-ups stay in your account forever — no subscriptions, no monthly minimums, no expiration. Account closure refunds remaining balance to your original payment method.
Never. 4xx and 5xx responses are not billed. Streaming responses that disconnect mid-flight are billed only for the tokens actually delivered. Every charge is visible per-call in the usage dashboard, exportable as CSV for accounting.
Everything Stripe supports: cards (Visa, Mastercard, Amex, JCB, UnionPay), Apple Pay, Google Pay, Link, ACH, SEPA, BACS, BECS, Alipay, WeChat Pay, Klarna, Afterpay and more. Auto-recharge is opt-in. Enterprise customers can pay by invoice with Net 30 terms.
Edge gateway nodes are deployed across North America, Europe and Asia-Pacific. Stateless routing logic runs at the edge for sub-120ms P50 latency. Billing data, accounts and audit logs are stored in a primary region with multi-region replication.
Three minutes to your first call.
Sign up gets you $2 in credits — enough to put Claude, Gemini and GPT-Image through their paces. No card required.