We built the same agent in all three first-party SDKs, then scored each on the only question that matters in 18 months: when you switch models, which SDK survives?
Claude Agent SDK vs OpenAI Agents SDK vs Google ADK: the short answer
In the Claude Agent SDK vs OpenAI Agents SDK vs Google ADK decision, the question that survives 18 months isn’t “which model is smartest” — it’s “which SDK survives a model swap.” Google ADK survives a model change with a one-line edit; the Claude Agent SDK doesn’t survive one at all (it’s Claude-only); OpenAI’s Agents SDK survives the swap but makes you rebuild your memory layer. Every other ranking page tells you Claude wins reasoning, OpenAI wins developer experience, and ADK wins cost. That’s accurate and nearly useless, because frontier models trade the lead every quarter. Your harness is the part that compounds.
This comparison does the thing the incumbent pages skip: it treats each first-party agent SDK as an architectural commitment, not a model wrapper. We map the three on the four properties that actually create or prevent lock-in — model agnosticism, native session persistence, native sandbox execution, and host coupling — and reconcile the moves that reshaped the field in 2026: OpenAI’s April 15 Agents SDK overhaul, the Claude Agent SDK’s session fork/resume primitives, and Google ADK’s genuine multi-provider model layer.
If you only read one section, read the lock-in scorecard and the decision rule below. Everything else is the evidence behind the cells.

We specced one reference agent — read a repo, run a test in an isolated sandbox, hand off a subtask to a research subagent, then resume the same session a day later — and asked what each SDK forces you to build yourself. The scorecard records native / partial / none for each property, with the real mechanism behind every cell. No fabricated benchmark numbers: the architecture IS the asset.
The lock-in scorecard: which agent SDK survives a model switch?
Read this grid as a risk map. “Native” means the SDK ships it; “partial” means it’s possible but you wire it; “none” means it’s structurally off the table. The single most consequential row is the first one — model-agnostic — because it decides whether changing your model in 2027 is a config edit or a rewrite.
Google ADK is the only SDK that earns “native” on model agnosticism: it talks to Gemini and Anthropic Claude directly, reaches OpenAI, Ollama and 100+ providers through LiteLLM, and serves self-hosted open weights through vLLM, per the ADK model docs. The Claude Agent SDK is Claude-only by design — its host flexibility (Anthropic API, Amazon Bedrock, Google Vertex AI, Microsoft Azure AI Foundry) is about where Claude runs, not which model runs. OpenAI’s Agents SDK is provider-agnostic at the model layer (100+ LLMs) but the rest of its harness — sandbox, FS tools, the Responses-style loop — is tuned for OpenAI models first.
The second decisive row is native session persistence. The Claude Agent SDK is the only one of the three that ships resumable and forkable sessions out of the box: capture a session ID, then resume to continue with full context, or fork to branch a transcript and explore a different approach — state is written as JSONL on your filesystem. OpenAI’s state is ephemeral by default; each run starts fresh, so teams bolt on an external memory store. ADK offers checkpoint-style state restoration but it’s a feature you configure, not a one-call primitive.
“Pick the SDK whose lock-in you can afford to keep — not the model that happens to be #1 this month.”
The lock-in test
| SDK | Model-agnostic | Native session persistence | Native sandbox | Host coupling | MCP |
|---|---|---|---|---|---|
| Claude Agent SDK | none — Claude-only (Anthropic models) | native — resume + fork via session ID (JSONL on disk) | partial — Managed Agents gives a hosted sandbox; SDK itself runs on your infra | low–medium — Anthropic API, Bedrock, Vertex AI, Azure AI Foundry via env flags | native — first-class, deepest server ecosystem |
| OpenAI Agents SDK | native — provider-agnostic, 100+ LLMs | none — ephemeral by default; bring external memory | native (Apr 15 2026) — Blaxel, Cloudflare, Daytona, E2B, Modal, Runloop, Vercel | low — runs anywhere; OpenAI models are the happy path | native — first-class as of the 2026 harness |
| Google ADK | native — Gemini + Claude direct, LiteLLM (OpenAI/Ollama/100+), vLLM self-host | partial — checkpoint/state restore you configure | partial — code execution + Vertex AI Agent Engine, not a 7-provider sandbox menu | medium — open-source, but Vertex Agent Engine is the optimized deploy target | native — supported across the ecosystem |
Is Google ADK model-agnostic — and does that make it the safe pick?
Yes — Google ADK is genuinely model-agnostic, and that is its single strongest argument in the Claude Agent SDK vs OpenAI Agents SDK vs Google ADK race. ADK is “optimized for Gemini, not locked to it”: it integrates Gemini and Anthropic Claude as direct, registry-named models, routes OpenAI, Ollama and 100+ providers through LiteLLM, and runs self-hosted open weights via vLLM. Swapping the model is a one-line change to the model identifier, not a re-architecture.
That portability is the whole point if your worry is provider risk. With ADK you can start on Gemini for cost, A/B a task against Claude for reasoning, and fall back to a self-hosted Llama or Qwen on vLLM for data-residency or air-gapped deployments — without touching your agent graph. ADK 2.0, which shipped its graph engine in April 2026, adds deterministic node-to-node routing so you can branch with if-this-then-that logic instead of asking an LLM to pick the path. And ADK is the multi-language option: Python, Java, and Go are first-class (with TypeScript and, new in 2.0, Kotlin), which matters if your platform team lives on the JVM.
The catch — and the reason “model-agnostic” isn’t automatically “winner” — is that agnosticism has a tax. Non-Gemini models route through adapters, so you inherit LiteLLM’s quirks and a happy-path that’s still shaped like Gemini. The deepest deploy integration is Vertex AI Agent Engine, so “runs anywhere” quietly becomes “runs best on Google Cloud.” Agnosticism buys you exit rights; it doesn’t buy you the most polished single-model loop. If your bet is that one model’s reasoning is your moat, you may not want to pay the abstraction tax for a freedom you’ll never exercise.
ADK is the only one of the three where changing models is a config edit. If “we might switch providers in 18 months” is a real line item in your risk register, that property is worth more than any curWhat changed in the OpenAI Agents SDK on April 15, 2026?
On April 15, 2026, OpenAI shipped what it called “the next evolution of the Agents SDK” — its largest overhaul since launch — turning a lightweight handoff framework into a production harness. The headline additions: native sandbox execution, subagents, Codex-style filesystem tools (apply_patch and a shell tool), an AGENTS.md instruction convention, progressive-disclosure skills, a forthcoming code mode, and first-class MCP support. The catch every team needs to plan around: the new harness, sandbox, and subagents launched in Python first, with TypeScript support planned for a later release.
The sandbox change is the big one. Instead of stitching together your own isolated execution layer, the SDK now runs agent code in a controlled environment out of the box, with built-in support for seven providers — Blaxel, Cloudflare, Daytona, E2B, Modal, Runloop, and Vercel — plus a bring-your-own-sandbox interface. A run can use one sandbox or many, invoke them only when needed, route subagents into isolated environments, and parallelize work across containers. The runtime handles context scoping, memory isolation between subagents, and failure propagation, so you delegate research, refactoring, or testing without writing your own orchestration loop.
What didn’t change is the state model. OpenAI’s Agents SDK is still ephemeral by default — each run starts fresh — so for memory across runs you add an external layer (a vector DB or key-value store). The 2026 harness made OpenAI’s developer-experience lead real for serious workloads, but persistence remains a thing you own, not a primitive you call. That’s the precise mirror image of the Claude Agent SDK.
If your stack is TypeScript-first, the April 2026 sandbox + subagents arrived in Python before TS. Code mode and subagents are slated for both languages, but confirm current status before committing a TS production timeline.
# OpenAI Agents SDK (Apr 2026 harness): native sandbox + a subagent.
# Sandbox execution and subagents shipped Python-first; TS follows later.
from agents import Agent, Runner
from agents.sandbox import E2BSandbox # also: Modal, Daytona, Cloudflare, Vercel, Runloop, Blaxel
researcher = Agent(
name="researcher",
instructions="Investigate the failing test and summarize the root cause.",
)
fixer = Agent(
name="fixer",
instructions="Apply a minimal patch using apply_patch, then run the test in the sandbox.",
# Codex-style FS tools (apply_patch, shell) run inside the sandbox
sandbox=E2BSandbox(),
handoffs=[researcher], # delegate to an isolated subagent when needed
)
result = Runner.run_sync(fixer, "Fix the failing test in test_auth.py")
print(result.final_output)
# State is EPHEMERAL by default: the next run starts fresh.
# Persist context yourself (e.g. write result + transcript to your own store).
Claude Agent SDK review: session fork/resume is the killer feature
The Claude Agent SDK’s standout — and the reason to choose it despite being Claude-only — is first-class session persistence. You capture a session ID from the first query and either resume it later to continue with full context, or fork it to branch the transcript and try a different approach in parallel. State is written as JSONL on your own filesystem, which makes long-horizon tasks, human-in-the-loop pauses, and “explore two solutions side by side” trivial instead of bespoke. Forking copies a session’s transcript into a new file with freshly remapped message IDs while the original stays untouched — that’s the primitive behind every “try another way” feature you’d otherwise build by hand.
Beyond sessions, the Claude Agent SDK gives you the same agent loop, context management, and built-in tools (Read, Write, Edit, Bash, Glob, Grep, WebSearch, WebFetch) that power Claude Code, programmable in Python and TypeScript. Subagents are native — define a specialized agent and Claude invokes it via the Agent tool, with each subagent’s messages tagged by a parent_tool_use_id so you can trace which work belongs to which delegate. MCP is native and the ecosystem is the deepest of the three. Hooks (PreToolUse, PostToolUse, SessionStart, and more) let you log, block, or transform behavior — handy for audit trails and policy enforcement.
On host coupling, the Claude Agent SDK is more flexible than its model lock-in suggests: the same agent code runs against the Anthropic API, Amazon Bedrock (CLAUDE_CODE_USE_BEDROCK=1), Google Vertex AI (CLAUDE_CODE_USE_VERTEX=1), or Microsoft Azure AI Foundry (CLAUDE_CODE_USE_FOUNDRY=1) by flipping an environment variable. And if you don’t want to operate sandbox or session infrastructure at all, Anthropic’s Managed Agents is a hosted REST alternative where Anthropic runs the agent and a per-session sandbox for you. Two cautions for 2026: it is Claude-only at the model layer, full stop; and from June 15, 2026, Agent SDK usage on subscription plans draws from a new monthly Agent SDK credit separate from interactive limits — budget accordingly.
import asyncio
from claude_agent_sdk import query, ClaudeAgentOptions, SystemMessage, ResultMessage
async def main():
session_id = None
# First query: capture the session ID
async for message in query(
prompt="Read the authentication module",
options=ClaudeAgentOptions(allowed_tools=["Read", "Glob"]),
):
if isinstance(message, SystemMessage) and message.subtype == "init":
session_id = message.data["session_id"]
# Resume LATER with full context -- native, no external memory store
async for message in query(
prompt="Now find every place that calls it",
options=ClaudeAgentOptions(resume=session_id),
):
if isinstance(message, ResultMessage):
print(message.result)
# Host swap is an env flag, not a rewrite:
# export CLAUDE_CODE_USE_BEDROCK=1 # Amazon Bedrock
# export CLAUDE_CODE_USE_VERTEX=1 # Google Vertex AI
# export CLAUDE_CODE_USE_FOUNDRY=1 # Azure AI Foundry
asyncio.run(main())
Same agent, three SDKs: the lines-of-code and lock-in diff
Build the same reference agent — read a repo, run a test in a sandbox, hand off to a research subagent, resume the session tomorrow — and the SDKs diverge not on how much code you write today but on what you’re forced to own forever. The Claude Agent SDK writes the fewest lines for a persistent, stateful agent because sessions and subagents are primitives. OpenAI writes the fewest lines for sandboxed code execution because seven providers are built in, but you add code for memory. ADK writes the most glue but the least re-architecture risk because the model is swappable.
Concretely: on the Claude Agent SDK, “resume tomorrow” is one option (resume=session_id) and “try two approaches” is a fork — near-zero added code, total model lock-in. On OpenAI’s harness, sandboxed execution and a subagent handoff are a few lines thanks to the April 2026 primitives, but persistence is an external store you build and maintain. On ADK, you wire LiteLLM or a direct model string and lean on configured checkpoints, so there’s more setup — yet that same indirection is exactly what lets you change the model later without touching the agent.
Read the trade as three different bills. The Claude Agent SDK bills you in provider lock-in. OpenAI bills you in a memory layer you operate. ADK bills you in glue and a Gemini-shaped happy path. There is no free option — only the bill your team is best equipped to pay.
Pros
Cons
Which agent SDK should you use? The one-screen decision rule
Verdict: there is no single winner — there’s the lock-in you can afford
Use this rule. If a single model’s reasoning is your product moat and you’ll never switch providers, choose the Claude Agent SDK — you get the best agent loop and real session fork/resume, and you accept Claude lock-in on purpose. If you need provider flexibility, multi-language support, or air-gapped/self-hosted models, choose Google ADK — it’s the only one where a model swap is a config edit. If your top priority is sandboxed code execution, rapid prototyping, and developer experience and you’re fine owning the memory layer, choose the OpenAI Agents SDK and its April 2026 harness.
The cleanest way to internalize the choice is the lock-in question, asked literally: “If we switch models in 18 months, what breaks?” With the Claude Agent SDK, everything breaks — you’re rebuilding on a different SDK, because it’s Claude-only. With OpenAI’s Agents SDK, the model swaps cleanly but your bolted-on memory may need rework. With Google ADK, you change one model string and your agent graph keeps running. If you can’t answer that question for your stack, you haven’t chosen an architecture — you’ve chosen a vendor.
For most teams in 2026, the honest pattern is a small abstraction over whichever SDK you pick, so the model is a swappable dependency behind an interface. ADK gives you that abstraction for free; with Claude or OpenAI you build a thin one yourself. Either way, the SDK is the bet that compounds — choose the lock-in you can live with, then let the models race underneath it.
Claude Agent SDK
Best for: Deep OS-level / coding agents where Claude reasoning is the moat and long-horizon stateful tasks matter.
What works
Watch out for
OpenAI Agents SDK
Best for: Sandboxed code execution, rapid prototyping, and teams that want to swap LLMs freely at the model layer.
What works
Watch out for
Google ADK
Best for: Provider flexibility, JVM/Go teams, self-hosted/air-gapped models, and Google Cloud / Vertex deployments.
What works
Watch out for
Builder’s take
I’ve shipped agents on all three. The benchmark everyone quotes (“Claude reasons best, OpenAI has the nicest DX, ADK is cheapest”) is true and almost useless, because models leapfrog each other every quarter. The decision that actually compounds is architectural: what happens to your codebase the day you change models. Here’s how I’d choose as the founder of Cyntr and Loomfeed, where the agent layer has to outlive any single provider.
- Pick the SDK whose lock-in you can afford to keep, not the model that’s #1 this month — models churn, your agent harness shouldn’t.
- Claude Agent SDK buys you the best out-of-the-box agent loop and real session fork/resume, at the cost of being Claude-only forever. That’s a fine trade if Claude reasoning is your moat.
- Google ADK is the only one of the three where switching the underlying model is a one-line config change — it’s the insurance policy, and the price you pay is more glue and a Gemini-shaped happy path.
- OpenAI’s April 2026 overhaul finally made the Agents SDK a serious production harness (native sandbox, subagents, Codex FS tools), but ephemeral-by-default state means you still own the memory layer.
- At Cyntr I treat the model as a swappable dependency behind an interface — if your team can’t say which SDK survives a model swap, you don’t have an architecture, you have a vendor.
Frequently asked questions
There’s no universal winner. Choose the Claude Agent SDK if a single model’s reasoning is your moat and you want native session fork/resume; choose the OpenAI Agents SDK for the best sandbox and developer experience if you’ll own the memory layer; choose Google ADK if you need to keep the option of switching models or self-hosting. Decide on lock-in, not on this quarter’s benchmark.
Yes. Google ADK is genuinely model-agnostic: it integrates Gemini and Anthropic Claude directly, reaches OpenAI, Ollama and 100+ providers via LiteLLM, and runs self-hosted open weights through vLLM. It’s optimized for Gemini but not locked to it, so changing the model is a one-line config edit — the strongest anti-lock-in property of the three SDKs.
On April 15, 2026, OpenAI shipped ‘the next evolution of the Agents SDK’: native sandbox execution (Blaxel, Cloudflare, Daytona, E2B, Modal, Runloop, Vercel), subagents, Codex-style filesystem tools (apply_patch and a shell tool), an AGENTS.md instruction convention, progressive-disclosure skills, a forthcoming code mode, and first-class MCP. The harness, sandbox, and subagents launched Python-first, with TypeScript planned for later.
No. The Claude Agent SDK is Claude-only at the model layer. Its flexibility is in where Claude runs — Anthropic API, Amazon Bedrock, Google Vertex AI, or Microsoft Azure AI Foundry via environment-variable flags — not in which model you run. If you need to swap to a non-Anthropic model later, you’d be moving to a different SDK.
The Claude Agent SDK is the only one of the three with native session persistence: you resume a session by ID to continue with full context, or fork it to branch the transcript, with state stored as JSONL on your filesystem. OpenAI’s Agents SDK is ephemeral by default (you add external memory), and Google ADK offers checkpoint-style state you configure rather than a one-call primitive.
Google ADK survives a model switch with a one-line change because it’s model-agnostic. The OpenAI Agents SDK survives at the model layer but you may have to rework the external memory you built. The Claude Agent SDK does not survive a model switch — it’s Claude-only, so a different model means a different SDK. Ask this question literally before committing a stack.
Primary sources
- The next evolution of the Agents SDK — OpenAI
- OpenAI updates its Agents SDK to help enterprises build safer, more capable agents — TechCrunch
- Agent SDK overview (sessions, resume/fork, Bedrock/Vertex/Foundry, subagents, MCP) — Anthropic / Claude Docs
- AI Models for ADK agents (Gemini, Claude, LiteLLM, vLLM; Python/Java/Go) — Google ADK Docs
- Claude Agents SDK vs OpenAI Agents SDK vs Google ADK — Composio
- OpenAI Agents SDK 2026: Native Sandbox and Subagents — AI Automation Global
Last updated: June 3, 2026. Related: Agent Infrastructure.