Anthropic's own SDK is great, but the entire AI ecosystem has standardized on the OpenAI client shape — every example, every framework adapter, every "hello world" tutorial assumes you have an OpenAI(...) instance handy. Migrating your code to call Anthropic's SDK directly is a meaningful refactor.

It doesn't have to be. This post shows how to call Claude Opus 4.7, Sonnet 4.6 and Haiku 4.5 through the OpenAI SDK unchanged — by routing your calls through Kunavo. Same SDK, same types, same streaming, same tool use. The only line that changes is base_url.

The minimum change

With Python's openai package already installed, the full migration looks like this.

main.py

from openai import OpenAI

client = OpenAI(
    api_key="sk-kn-...",
    base_url="https://api.kunavo.com/v1",   # the only line that changes
)

resp = client.chat.completions.create(
    model="claude-sonnet-4-6",              # a Claude slug, not gpt-4o
    messages=[
        {"role": "system", "content": "You are a senior platform engineer."},
        {"role": "user",   "content": "Critique this SQL migration..."},
    ],
)
print(resp.choices[0].message.content)

api_key becomes your Kunavo key (created in /app/keys). base_url points at our gateway. The model id switches from gpt-4o to a Claude slug — claude-opus-4-7, claude-sonnet-4-6, claude-haiku-4-5. The request body, the response shape, and every helper in the SDK behave exactly as they do against OpenAI.

Streaming

Streaming works identically. Kunavo forwards Anthropic's SSE chunks in OpenAI's chat.completion.chunk format, so the existing async iteration pattern works without modification.

stream.py

for chunk in client.chat.completions.create(
    model="claude-opus-4-7",
    messages=[{"role": "user", "content": "Explain Raft in 200 words."}],
    stream=True,
):
    delta = chunk.choices[0].delta.content
    if delta:
        print(delta, end="", flush=True)

Tool use / function calling

Anthropic's tool use protocol is semantically the same as OpenAI's function calling — they differ only at the wire level. Kunavo translates the tools array, the response tool_calls, and the follow-up tool messages in both directions. Use tool_choice="auto", "none", or a named tool — they all map.

tools.py

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get current weather in a city",
            "parameters": {
                "type": "object",
                "properties": {
                    "city": {"type": "string"},
                    "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
                },
                "required": ["city"],
            },
        },
    }
]

resp = client.chat.completions.create(
    model="claude-sonnet-4-6",
    messages=[{"role": "user", "content": "What's the weather in Tokyo?"}],
    tools=tools,
    tool_choice="auto",
)
print(resp.choices[0].message.tool_calls)

Vision

Claude has been multimodal since 3.5; passing an image is the standard OpenAI content: [{ type: 'text' }, { type: 'image_url' }] array. image_url.url can be an https URL or a data: base64 URI.

vision.py

resp = client.chat.completions.create(
    model="claude-sonnet-4-6",
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "What's in this image?"},
            {"type": "image_url",
             "image_url": {"url": "https://example.com/cat.jpg"}},
        ],
    }],
)

When you do want the native Anthropic SDK

Some Anthropic-only features can't be expressed in the OpenAI shape — most importantly the cache_control directive for prompt caching, and extended thinking tokens. If you need either, switch SDKs but keep the same key: Kunavo also exposes the native /v1/messages endpoint, so Anthropic's SDK works with a single base_url change too.

anthropic_native.py

from anthropic import Anthropic

client = Anthropic(
    api_key="sk-kn-...",
    base_url="https://api.kunavo.com",     # SDK appends /v1/messages
)

resp = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=1024,
    system="You are a senior platform engineer.",
    messages=[{"role": "user", "content": "What is a hot standby?"}],
)
print(resp.content[0].text)

See the native Messages API docs for the full list of pass-through parameters, and /docs/caching for how prompt caching saves up to 90% of input cost on repeat prompts.

What you give up — and what you gain

The OpenAI-shape route is a translation, not native protocol. Two small things don't cross the boundary:

cache_control for prompt caching — use the native Messages SDK for that, or pass it via the input escape hatch.
Extended thinking tokens — accessible in the response usage object but not as a direct request parameter; flip to native if you need fine control.

What you gain is significant: one SDK across Claude, Gemini 2.5, GPT-Image, Veo 3 and 200+ other models; 30–70% under upstream official pricing depending on the model; Stripe-native billing in your local currency; transparent routing — the dashboard always shows which upstream actually served each call.

Two minutes to confirm

Sign up at kunavo.com/app/signup — a $5 top-up covers several thousand Claude calls, pay-as-you-go, and your balance never expires. Drop your base_url in, swap your model id to a Claude slug, run your existing test suite against it. If something doesn't cross cleanly, email contact@kunavo.com — we read every message.

Calling Claude with the OpenAI SDK — change one line, keep your codebase

The minimum change

Streaming

Tool use / function calling

Vision

When you do want the native Anthropic SDK

What you give up — and what you gain

Two minutes to confirm

Related deep dives

Anthropic prompt caching — cut 90% off your input bill in 30 minutes

Migrating from OpenAI to Kunavo in 10 minutes — Python, Node, LangChain, Vercel AI SDK

Veo 3 and Sora API quickstart — text-to-video and image-to-video in five minutes

Claude API en France — tarif, paiement SEPA, latence depuis Paris, RGPD