A real Gemini Managed Agents API tutorial: working Python for single-turn, multi-turn resume, save-as-managed-agent, and the billing and persistence realities the docs skip.
What is the Gemini Managed Agents API?
This Gemini Managed Agents API tutorial starts where the official quickstart should have ended: with the production realities. The Managed Agents API, announced at Google I/O 2026, exposes a general-purpose agent through a single endpoint, client.interactions.create(). You send a task, Google provisions an isolated Linux sandbox, and the agent reasons, executes Python/Node/Bash, manages files, and browses the web until the job is done.
The agent you call is antigravity-preview-05-2026, built on Gemini 3.5 Flash and running the same harness as the Antigravity IDE. It ships with three tools enabled by default: Code Execution, Google Search, and URL Context. You do not orchestrate a planner-executor loop yourself, you do not stand up a container, and you do not write tool-calling glue. That is the whole pitch of a managed agent: the agentic loop lives on Google’s side.
Every incumbent walkthrough (Google’s own docs, DataCamp, philschmid, Eigent) reproduces the same five-line Fibonacci demo and stops. None of them tells you what a fresh sandbox costs, how environment_id persistence interacts with multi-turn billing, when to inline config versus save a reusable agent, or how the Antigravity SDK and agy CLI map back to the raw Interactions API. This guide covers exactly that, with code you can run today.
If you want the conceptual backdrop first, see our explainer on what Google Antigravity is and our I/O 2026 coverage of the Managed Agents launch. This piece is the hands-on companion.

antigravity-preview-05-2026 is in public preview as of June 2026. The agent ID, generation-config limits, and the preview waiver on sandbox compute billing can all change before GA. Pin the agent ID in your code and treat anything compute-related as provisional.
How do I install and run a single-turn agent?
Install the Python SDK with pip install -U google-genai (you need google-genai 2.0.0 or later), set GEMINI_API_KEY, then call client.interactions.create() with environment="remote" to provision a fresh sandbox. The response object hands you three things that matter: output_text (the final answer), id (the interaction handle), and environment_id (the sandbox handle you reuse for multi-turn work).
Here is a complete, runnable single-turn example. Note that I am reading both id and environment_id immediately, because you cannot resume a session without them and there is no way to look them up after the fact.
The interaction.steps list is your observability surface. Each entry is one reasoning, tool-call, or code-execution step, with stdout/stderr captured for the code steps. In production this is the only window you have into what the agent actually did inside the sandbox, so log it. When something goes wrong at turn 12, steps is the difference between a five-minute fix and a two-hour guess.
There is no list-interactions endpoint that returns a usable environment handle after the fact. If you do not persist interaction.id and interaction.environment_id from the create() response, the sandbox and all its files are effectively orphaned on your next call.
from google import genai
# Reads GEMINI_API_KEY from the environment
client = genai.Client()
interaction = client.interactions.create(
    agent="antigravity-preview-05-2026",
    input=(
        "Write a Python script that generates the first 20 Fibonacci "
        "numbers, saves them to fibonacci.txt, then reads the file back "
        "and prints its contents."
    ),
    environment="remote",  # provisions a fresh Google-hosted Linux sandbox
)
# Capture these NOW: you need both to resume in the same sandbox.
print("interaction_id :", interaction.id)
print("environment_id :", interaction.environment_id)
print("output :", interaction.output_text)
# steps is your only observability surface into the sandbox
for i, step in enumerate(interaction.steps):
    print(f"[step {i}] {step.type}")  # reasoning | tool_call | code_execution
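A minimal sketch of that step logging, assuming only what the example above shows: each step exposes a type attribute. The stdout and stderr attribute names are assumptions, read defensively with getattr. step_records is a hypothetical helper, not part of google-genai, and the stand-in objects exist only so the sketch runs without an API call.

```python
import json
from types import SimpleNamespace


def step_records(interaction_id, steps):
    """Flatten agent steps into JSON-serializable dicts, one per step."""
    records = []
    for index, step in enumerate(steps):
        records.append({
            "interaction_id": interaction_id,
            "index": index,
            "type": getattr(step, "type", "unknown"),
            # Assumed attribute names; anything missing becomes None.
            "stdout": getattr(step, "stdout", None),
            "stderr": getattr(step, "stderr", None),
        })
    return records


# Stand-in steps so the sketch runs offline; real ones come from interaction.steps.
fake_steps = [
    SimpleNamespace(type="reasoning"),
    SimpleNamespace(type="code_execution", stdout="0 1 1 2 3", stderr=""),
]
for record in step_records("demo-interaction", fake_steps):
    print(json.dumps(record))  # one JSON line per step, ready for a log pipeline
```

Emitting one JSON object per step keeps the records greppable by interaction ID, which is what you want when reconstructing a failed turn.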
How does multi-turn persistence and billing actually work?
To continue in the same sandbox, pass previous_interaction_id=interaction.id and environment=interaction.environment_id on the next create() call; files and state from turn one survive into turn two. This is the pattern for building a real agent on the Interactions API that the hello-world never shows, and it is where billing gets subtle.
Each turn re-bills the entire session context window, not just your new tokens. The session context window is the accumulated input from every prior turn plus the current one. So a 30-turn debugging session does not cost 30x a single turn; it costs the integral of a growing context, which climbs much faster than the turn count. Google notes that 50-70% of input tokens are typically cached, which softens the curve, but complex workflows with many tool calls can still accumulate 3-5 million tokens in one interaction. Per Google’s guidance, a single interaction lands roughly between $0.25 and $3.25, with heavy agentic runs reaching about $5.
The one piece of good news for preview users: environment compute (CPU, memory, sandbox execution) is not billed during preview. You pay only for the underlying Gemini model tokens and the tools the agent invokes. Plan your budget assuming that changes at GA, because the sandbox is the expensive part of any managed-agent platform.
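To make the growth concrete, here is a rough token model. It is a sketch under stated assumptions, not Google's billing formula: it assumes every turn adds the same number of tokens (the 4,000 figure is invented for illustration) and that each turn re-sends everything before it. The 60% cached share is simply the midpoint of the 50-70% range quoted above.

```python
def billed_input_tokens(turns, tokens_per_turn):
    """Total input tokens billed when every turn re-sends the whole context.

    Turn k carries k * tokens_per_turn input tokens, so the session total is
    tokens_per_turn * turns * (turns + 1) / 2.
    """
    return sum(k * tokens_per_turn for k in range(1, turns + 1))


def split_cached(total_tokens, cached_percent):
    """Split a token total into (cached, uncached) counts using integer math."""
    cached = total_tokens * cached_percent // 100
    return cached, total_tokens - cached


TOKENS_PER_TURN = 4_000  # assumed average growth per turn, not a Google figure

for turns in (1, 10, 30):
    total = billed_input_tokens(turns, TOKENS_PER_TURN)
    cached, uncached = split_cached(total, 60)  # midpoint of the 50-70% range
    print(f"{turns:>2} turns: {total:>9,} input tokens "
          f"({cached:,} cached, {uncached:,} uncached)")
```

Under these assumptions a 30-turn session bills 465 times the input tokens of a single turn (the sum 1 + 2 + ... + 30), not 30 times. Real sessions are lumpier because tool calls add tokens inside a turn, but the shape of the curve is the point.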
The code below resumes the Fibonacci sandbox to plot a chart from the file written in turn one. Because the file persists, the agent does not regenerate the sequence; it just reads fibonacci.txt and renders chart.png.

Cost scales with accumulated context, not turn count. Cap long sessions: when a thread crosses ~15-20 turns, summarize state into a fresh interaction rather than resuming, so you stop re-billing a bloated context window every turn.
# --- Turn 2: resume the SAME sandbox from the single-turn example ---
follow_up = client.interactions.create(
    agent="antigravity-preview-05-2026",
    previous_interaction_id=interaction.id,  # carries conversation context
    environment=interaction.environment_id,  # carries sandbox FILES + state
    input=(
        "Read fibonacci.txt, plot the sequence as a line chart, "
        "and save it as chart.png. Do NOT recompute the numbers."
    ),
)
print(follow_up.output_text)
# Persist the rolling handles so a later turn can resume again.
session = {
    "interaction_id": follow_up.id,
    "environment_id": follow_up.environment_id,  # same sandbox as turn 1
}
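The session dict above lives only in memory. One way to make the handles durable is sketched below with the standard-library sqlite3 module. The table name, the helper names, and the example IDs are all hypothetical, and any store you already run would serve the same purpose.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open (or create) a small table mapping conversations to agent handles."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS agent_sessions ("
        "conversation_id TEXT PRIMARY KEY, "
        "interaction_id TEXT NOT NULL, "
        "environment_id TEXT NOT NULL)"
    )
    return conn


def save_handles(conn, conversation_id, interaction_id, environment_id):
    """Store the latest handles for a conversation, replacing older ones."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR REPLACE INTO agent_sessions VALUES (?, ?, ?)",
            (conversation_id, interaction_id, environment_id),
        )


def load_handles(conn, conversation_id):
    """Return (interaction_id, environment_id), or None if unknown."""
    return conn.execute(
        "SELECT interaction_id, environment_id FROM agent_sessions "
        "WHERE conversation_id = ?",
        (conversation_id,),
    ).fetchone()


conn = open_store()  # pass a file path instead of :memory: for real durability
save_handles(conn, "chat-42", "int-turn-1", "env-abc")  # placeholder IDs
save_handles(conn, "chat-42", "int-turn-2", "env-abc")  # a later turn overwrites
print(load_handles(conn, "chat-42"))
```

Keying by your own conversation ID means a restarted worker can look up the handles and resume, instead of provisioning a fresh sandbox and losing the files.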
How do I override the default tools?
Pass a tools list to create() to restrict the agent to a subset of {code_execution, google_search, url_context}; omitting the parameter enables all three. This is a security and cost control, not just a tidiness feature. An agent that cannot execute code cannot exfiltrate data through a shell, and an agent without google_search cannot rack up search-tool charges on a task that only needs to read one known URL.
In this Python example, I lock a research agent down to search and URL fetching only, with no code execution. The agent can read the web and summarize, but it physically cannot run a shell inside the sandbox.
Two limits to internalize before you design around this agent. First, generation config is mostly off-limits: temperature, top_p, top_k, stop_sequences, and max_output_tokens are unsupported, so you cannot dial determinism the usual way. Second, structured outputs are not supported, and inputs are restricted to text and images, no audio, video, or document inputs. If you need a strict JSON schema back, parse it out of output_text yourself or post-process with a separate Gemini call.
# Research-only agent: web access, NO code execution
research = client.interactions.create(
    agent="antigravity-preview-05-2026",
    input=(
        "Find the three most-cited 2026 papers on agentic reasoning "
        "and summarize each in two sentences with a source URL."
    ),
    environment="remote",
    tools=[
        {"type": "google_search"},
        {"type": "url_context"},
        # {"type": "code_execution"},  # deliberately omitted: no shell
    ],
)
print(research.output_text)
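Because structured outputs are not supported, any JSON you need has to be pulled out of output_text by hand. The helper below is one possible approach using only the standard library. extract_json is a hypothetical name, and the sample string stands in for a real agent reply.

```python
import json


def extract_json(text):
    """Return the first valid JSON object embedded in free-form text, or None."""
    decoder = json.JSONDecoder()
    start = text.find("{")
    while start != -1:
        try:
            value, _end = decoder.raw_decode(text, start)
            return value
        except json.JSONDecodeError:
            # This brace did not open valid JSON; try the next one.
            start = text.find("{", start + 1)
    return None


# Stand-in for research.output_text: prose wrapped around a JSON payload.
sample = 'Notes {draft}: {"papers": 3, "ok": true} Let me know if this helps.'
print(extract_json(sample))
```

Asking the agent in the prompt to reply with a single JSON object makes this far more reliable, but validate the parsed result against your own schema before trusting it.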
How do I save a reusable managed agent with AGENTS.md and SKILL.md?
Use client.agents.create() to bake a base_agent, a system_instruction, and a base_environment that mounts AGENTS.md plus a skills repo, then invoke it later by ID like any built-in agent. This is the save-as-managed-agent workflow, and it is the dividing line between a demo and something you deploy. Inline config is fine for a script; a versioned agent ID is what an application actually calls.
The base_environment.sources array is the powerful part. You can mount inline content directly to a path like .agents/AGENTS.md (operating instructions the agent always reads) and pull an entire skills repository to .agents/skills so the agent inherits reusable SKILL.md-defined capabilities on every run. AGENTS.md is your standing policy; SKILL.md files are composable procedures. Together they replace the prompt-stuffing most people do by hand.
Once created, the agent is addressable by its id. Your application code no longer carries instructions or tool lists; it just names the agent and passes input. That separation matters: your platform team owns the agent definition and its skills, while product code stays a one-liner. It also means you can update the agent’s behavior centrally without redeploying every caller.
# 1) Define a reusable managed agent ONCE (ops / platform side)
client.agents.create(
    id="report-analyst",
    base_agent="antigravity-preview-05-2026",
    system_instruction=(
        "You are a data analysis agent. Generate sequences, visualize "
        "them, and export results as a PDF report."
    ),
    base_environment={
        "type": "remote",
        "sources": [
            {
                "type": "inline",
                "target": ".agents/AGENTS.md",
                "content": (
                    "Always include a chart and a summary table. "
                    "Cite every number. Never fabricate data."
                ),
            },
            {
                "type": "repository",
                "source": "https://github.com/your-org/skills",
                "target": ".agents/skills",  # SKILL.md files mounted here
            },
        ],
    },
)
# 2) Invoke it anywhere by ID (application side): one line
result = client.interactions.create(
    agent="report-analyst",
    input="Generate the first 50 primes, plot their distribution, export a PDF.",
    environment="remote",
)
print(result.output_text)
Treat AGENTS.md as policy and SKILL.md as procedure. Standing rules (cite sources, never fabricate, output format) belong in AGENTS.md; reusable how-to capabilities belong in versioned SKILL.md files.
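If AGENTS.md lives in version control rather than in a string literal, a small builder can assemble the base_environment from it. This is a sketch that only mirrors the dictionary shape used in the agents.create() example above. build_base_environment is a hypothetical helper, and the temporary file stands in for a checked-in AGENTS.md.

```python
import tempfile
from pathlib import Path


def build_base_environment(agents_md_path, skills_repo_url):
    """Build a base_environment dict in the shape shown in the example above."""
    policy = Path(agents_md_path).read_text(encoding="utf-8")
    return {
        "type": "remote",
        "sources": [
            # Standing policy, mounted inline where the agent reads it.
            {"type": "inline", "target": ".agents/AGENTS.md", "content": policy},
            # Skills repository holding the SKILL.md procedures.
            {
                "type": "repository",
                "source": skills_repo_url,
                "target": ".agents/skills",
            },
        ],
    }


# A temporary file stands in for an AGENTS.md tracked in your repo.
with tempfile.TemporaryDirectory() as tmp:
    policy_file = Path(tmp) / "AGENTS.md"
    policy_file.write_text("Cite every number. Never fabricate data.\n", encoding="utf-8")
    env = build_base_environment(policy_file, "https://github.com/your-org/skills")

print(env["sources"][0]["target"])
```

The returned dict would be passed as base_environment to client.agents.create(), so a policy change becomes a reviewed commit instead of an edit buried in application code.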
How do the Antigravity SDK and agy CLI map to the Interactions API?
The Antigravity SDK and the agy CLI are ergonomic wrappers over the same Interactions API you just used; agy agents init scaffolds an agent definition, agy agents test runs it against the managed harness, and agy agents create registers it, all resolving to client.agents.create() and client.interactions.create() under the hood. Announced at I/O 2026 alongside the standalone Antigravity 2.0 desktop app, the SDK ships as a Python library and agy (written in Go) replaces the old Gemini CLI.
This matters for a practical reason: Gemini CLI and the Gemini Code Assist IDE extensions stop serving requests on June 18, 2026. If your workflow depends on the old CLI, you migrate to agy. The good news is that everything you learned in this tutorial transfers directly, because the CLI is a thin layer over the raw create() calls.
My recommendation: learn the Interactions API first, then adopt the CLI for scaffolding and CI. The CLI is excellent for project setup and a quick iteration loop (init, test, create, iterate), but when you need custom billing guards, per-call tool overrides, or to inspect interaction.steps programmatically, you drop back to the SDK. The two are not competing; they are the same surface at different altitudes.
A typical agy flow looks like the block below. Conceptually, each command corresponds to an API call you have already seen in this guide.
If you are weighing Google’s stack against alternatives, our comparison of the Claude Agent SDK, OpenAI Agents SDK, and Google ADK puts the Managed Agents API in context: it is the most managed of the three, trading control for the fact that the entire agentic loop and sandbox live on Google’s infrastructure.
Gemini CLI and Gemini Code Assist IDE extensions stop serving requests on June 18, 2026. Move scripts and CI that shell out to gemini over to agy before then, or your pipelines break silently.
# Scaffold a new managed-agent project (writes AGENTS.md, SKILL.md, config)
agy agents init report-analyst
# Test the agent against the managed harness before registering it
agy agents test report-analyst --input "Summarize Q2 sales from data.csv"
# Register it as a reusable managed agent (≈ client.agents.create)
agy agents create report-analyst
# Note: the legacy Gemini CLI stops serving requests on 2026-06-18 — use agy.
Production checklist: persistence, cost, and observability
Worth it for managed sandboxes, but own your IDs and your budget
Before you ship a Gemini managed agent, lock down four things the docs gloss over: durable storage of interaction.id and environment_id, a per-session token budget, structured logging of interaction.steps, and a tool allowlist on every call. Skip any of these and the failure shows up in production, not in your demo.
Persistence: the environment_id is your sandbox handle and it is not recoverable after the create() response. Write both IDs to durable storage (Redis, Postgres, wherever your session state lives) the instant you get them, keyed by your own conversation ID. Treat them like a database connection string, not a log line.
Cost governance: because each turn re-bills the session context window, set a hard turn cap and a token ceiling per session. When you approach the cap, summarize the conversation into a single fresh interaction instead of resuming. This is the single biggest lever on your bill, and none of the hello-world tutorials mention it.
Observability and safety: log interaction.steps on every call so you can reconstruct what the agent ran inside the sandbox, and pass an explicit tools list rather than relying on the all-three default. An agent that does not need code_execution should not have it, both to cut the attack surface and to keep tool charges off tasks that only need a single URL fetch.
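A small guard can make the explicit allowlist hard to forget. The sketch below validates names against the three tool types named in this guide and returns the list shape used in the earlier create() example. tools_allowlist is a hypothetical helper, and rejecting an empty selection is a policy choice of this sketch, not documented API behavior.

```python
KNOWN_TOOLS = ("code_execution", "google_search", "url_context")


def tools_allowlist(*names):
    """Return a tools list for create(), rejecting unknown or empty selections."""
    if not names:
        # Sketch policy: force callers to choose rather than fall back to defaults.
        raise ValueError("name at least one tool explicitly")
    unknown = sorted(set(names) - set(KNOWN_TOOLS))
    if unknown:
        raise ValueError(f"unknown tool(s): {', '.join(unknown)}")
    # Stable order, duplicates dropped.
    return [{"type": name} for name in KNOWN_TOOLS if name in names]


print(tools_allowlist("url_context", "google_search"))
```

Routing every call through one helper means a typo in a tool name fails loudly in your own code rather than surfacing as an API error or an unintended default.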
| Concern | What the docs show | What production needs |
|---|---|---|
| Session resume | previous_interaction_id in one script | Durable store of id + environment_id keyed by your conversation ID |
| Billing | “based on tokens” | Per-session token ceiling + turn cap; summarize-and-restart past ~15-20 turns |
| Sandbox compute | Free in preview | Budget for metered CPU/memory at GA; it is the expensive part |
| Tools | All three enabled by default | Explicit allowlist per call to cut cost and attack surface |
| Observability | print(interaction.steps) | Structured logging of every step for post-hoc debugging |
| Output shape | output_text print | No structured outputs supported; parse JSON out of text yourself |
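The turn cap and token ceiling from the cost-governance item can be tracked with a few lines of state. The sketch below is one way to do it. The default limits are assumptions (the turn cap echoes the ~15-20 turn guidance, the token ceiling is arbitrary). It takes a plain integer per turn, because this guide does not show which response field reports token usage.

```python
from dataclasses import dataclass


@dataclass
class SessionBudget:
    """Tracks turns and tokens for one session against hard limits."""

    max_turns: int = 15          # assumed cap, from the ~15-20 turn guidance
    max_tokens: int = 1_000_000  # assumed ceiling; tune it to your own bill
    turns: int = 0
    tokens: int = 0

    def record(self, tokens_used):
        """Account for one completed turn."""
        self.turns += 1
        self.tokens += tokens_used

    def should_restart(self):
        """True once either limit is reached: summarize and start fresh."""
        return self.turns >= self.max_turns or self.tokens >= self.max_tokens


budget = SessionBudget(max_turns=3, max_tokens=50_000)
for used in (12_000, 18_000, 25_000):  # made-up per-turn usage figures
    budget.record(used)
    print(budget.turns, budget.tokens, budget.should_restart())
```

When should_restart() returns True, the caller would ask the agent for a state summary and open a new interaction with it instead of passing previous_interaction_id again.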
Builder’s take
I build agent orchestration for a living at Cyntr, so I read every managed-agent launch through one lens: what does it cost, and what breaks at turn 50? Here is what the tutorials won’t tell you about Gemini’s Managed Agents.
- The interaction is cheap; the environment is the moat. Whoever owns environment_id owns your state, your latency, and eventually your bill once preview compute starts metering.
- Token accumulation is the silent killer. A 30-turn session re-bills the whole session context window each turn, so a chatty agent can quietly hit 3-5M tokens before anyone notices.
- save-as-managed-agent is the only thing that scales past a demo. Inline config is fine for a script; a versioned agent ID with mounted AGENTS.md is what you actually deploy.
- The agy CLI and Antigravity SDK are sugar over the same Interactions API. Learn the raw create() call first and the CLI becomes obvious; learn the CLI first and you’ll be lost the moment you need custom billing controls.
Frequently asked questions
Which agent does the Gemini Managed Agents API run?
The general-purpose managed agent is antigravity-preview-05-2026, built on Gemini 3.5 Flash. You pass it as the agent argument to client.interactions.create(). It can reason, execute Python/Node/Bash, manage files, and browse the web inside a Google-hosted Linux sandbox, with Code Execution, Google Search, and URL Context enabled by default.
How much does an interaction cost?
Per Google’s guidance, a single interaction typically costs between $0.25 and $3.25 depending on complexity, with heavy agentic runs reaching about $5 when they accumulate 3-5 million tokens. You pay for Gemini model tokens and tool usage; 50-70% of input tokens are usually cached. Sandbox compute (CPU, memory, execution) is not billed during the preview period.
How do I resume a session in the same sandbox?
Store interaction.id and interaction.environment_id from the create() response, then pass previous_interaction_id and environment back on the next call. The previous_interaction_id carries conversation context and the environment_id carries the sandbox’s files and state, so work from earlier turns persists. There is no way to recover these IDs later, so save them immediately.
How do I restrict the default tools?
Pass a tools list to create(), for example tools=[{"type": "google_search"}, {"type": "url_context"}]. Omitting the parameter enables all three defaults: code_execution, google_search, and url_context. Restricting tools is both a cost control and a security control, since an agent without code_execution cannot run a shell in the sandbox.
When should I use inline config versus a saved managed agent?
Inline config passes tools and input on each create() call and suits one-off scripts. A saved managed agent, created with client.agents.create(), bakes in a base_agent, system_instruction, and a base_environment that can mount AGENTS.md and a skills repo. You then invoke it by ID, so application code stays a one-liner while ops owns the agent definition and versioning.
How do the Antigravity SDK and agy CLI relate to the Interactions API?
The Antigravity SDK and the agy CLI are wrappers over the same Interactions API. Commands like agy agents init, agy agents test, and agy agents create resolve to client.agents.create() and client.interactions.create() under the hood. Learn the raw API first; the CLI is best for scaffolding and CI. Note that the legacy Gemini CLI stops serving requests on June 18, 2026.
Primary sources
- Managed Agents Quickstart — Google AI for Developers
- Antigravity Agent reference — Google AI for Developers
- Introducing Managed Agents in the Gemini API — Google
- Transitioning Gemini CLI to Antigravity CLI — Google Developers Blog
- I/O ’26 news for agent developers on Google Cloud — Google Cloud Blog
- Gemini Developer API pricing — Google AI for Developers
Last updated: June 6, 2026.