Gemini Managed Agents API Tutorial: Production Python Guide

A real Gemini Managed Agents API tutorial: working Python for single-turn, multi-turn resume, save-as-managed-agent, and the billing and persistence realities the docs skip.


What is the Gemini Managed Agents API?

This Gemini Managed Agents API tutorial starts where the official quickstart should have ended: with the production realities. The Managed Agents API, announced at Google I/O 2026, exposes a general-purpose agent through a single endpoint, client.interactions.create(). You send a task, Google provisions an isolated Linux sandbox, and the agent reasons, executes Python/Node/Bash, manages files, and browses the web until the job is done.

The agent you call is antigravity-preview-05-2026, built on Gemini 3.5 Flash and running the same harness as the Antigravity IDE. It ships with three tools enabled by default: Code Execution, Google Search, and URL Context. You do not orchestrate a planner-executor loop yourself, you do not stand up a container, and you do not write tool-calling glue. That is the whole pitch of a managed agent: the agentic loop lives on Google’s side.

Every incumbent walkthrough (Google’s own docs, DataCamp, philschmid, Eigent) reproduces the same five-line Fibonacci demo and stops. None of them tells you what a fresh sandbox costs, how environment_id persistence interacts with multi-turn billing, when to inline config versus save a reusable agent, or how the Antigravity SDK and agy CLI map back to the raw Interactions API. This guide covers exactly that, with code you can run today.

If you want the conceptual backdrop first, see our explainer on what Google Antigravity is and our I/O 2026 coverage of the Managed Agents launch. This piece is the hands-on companion.

Image: Python code calling the Gemini Managed Agents API to provision a remote Antigravity sandbox.

antigravity-preview-05-2026 is in public preview as of June 2026. The agent ID, generation-config limits, and the preview waiver on sandbox compute billing can all change before GA. Pin the agent ID in your code and treat anything compute-related as provisional.

How do I install and run a single-turn agent?

Install the Python SDK with pip install -U google-genai (you need google-genai 2.0.0 or later), set GEMINI_API_KEY, then call client.interactions.create() with environment="remote" to provision a fresh sandbox. The response object hands you three things that matter: output_text (the final answer), id (the interaction handle), and environment_id (the sandbox handle you reuse for multi-turn work).

Here is a complete, runnable single-turn example. Note that I am reading both id and environment_id immediately, because you cannot resume a session without them and there is no way to look them up after the fact.

The interaction.steps list is your observability surface. Each entry is one reasoning, tool-call, or code-execution step, with stdout/stderr captured for the code steps. In production this is the only window you have into what the agent actually did inside the sandbox, so log it. When something goes wrong at turn 12, steps is the difference between a five-minute fix and a two-hour guess.

There is no list-interactions endpoint that returns a usable environment handle after the fact. If you do not persist interaction.id and interaction.environment_id from the create() response, the sandbox and all its files are effectively orphaned on your next call.

from google import genai

# Reads GEMINI_API_KEY from the environment
client = genai.Client()

interaction = client.interactions.create(
    agent="antigravity-preview-05-2026",
    input=(
        "Write a Python script that generates the first 20 Fibonacci "
        "numbers, saves them to fibonacci.txt, then reads the file back "
        "and prints its contents."
    ),
    environment="remote",  # provisions a fresh Google-hosted Linux sandbox
)

# Capture these NOW — you need both to resume in the same sandbox.
print("interaction_id :", interaction.id)
print("environment_id :", interaction.environment_id)
print("output         :", interaction.output_text)

# steps is your only observability surface into the sandbox
for i, step in enumerate(interaction.steps):
    print(f"[step {i}] {step.type}")  # reasoning | tool_call | code_execution

How does multi-turn persistence and billing actually work?

To continue in the same sandbox, pass previous_interaction_id=interaction.id and environment=interaction.environment_id on the next create() call; files and state from turn one survive into turn two. This is the resume pattern that the hello-world demos never show, and it is where billing gets subtle.

Each turn re-bills the entire session context window, not just your new tokens. The session context window is the accumulated input from every prior turn plus the current one. So a 30-turn debugging session does not cost 30x a single turn; it costs the sum of a context that grows with every turn. Google notes that 50-70% of input tokens are typically cached, which softens the curve, but complex workflows with many tool calls can still accumulate 3-5 million tokens in one interaction. Per Google’s guidance, a single interaction lands roughly between $0.25 and $3.25, with heavy agentic runs reaching about $5.
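To make the growth concrete, here is a minimal arithmetic sketch. It assumes a simplified model that is not from Google's pricing docs: every turn adds the same number of new input tokens, and output and tool tokens are ignored. The function names and the 2,000-token figure are hypothetical; only the 30-turn example and the 50-70% caching range come from the text above.

```python
def billed_input_tokens(turns: int, new_tokens_per_turn: int) -> int:
    """Total input tokens billed when every turn re-sends all prior turns.

    Turn k carries k * new_tokens_per_turn tokens of context, so the total
    is the triangular sum n * T * (T + 1) / 2.
    """
    return sum(k * new_tokens_per_turn for k in range(1, turns + 1))


def split_cached(total_tokens: int, cached_pct: int) -> tuple[int, int]:
    """Split a token total into (cached, uncached) using an integer percent."""
    cached = total_tokens * cached_pct // 100
    return cached, total_tokens - cached


# Hypothetical session: 30 turns, 2,000 new input tokens per turn.
total = billed_input_tokens(turns=30, new_tokens_per_turn=2_000)
naive = 30 * 2_000
cached, uncached = split_cached(total, cached_pct=60)

print(total)          # 930000 input tokens billed across the session
print(total / naive)  # 15.5 times the naive "30 turns x 2,000 tokens" estimate
print(uncached)       # 372000 tokens left uncached if 60% of input is cached
```

Under this model the multiplier over the naive estimate is (T + 1) / 2, which is why halving the number of turns in a session cuts the bill by much more than half.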

The one piece of good news for preview users: environment compute (CPU, memory, sandbox execution) is not billed during preview. You pay only for the underlying Gemini model tokens and the tools the agent invokes. Plan your budget assuming that changes at GA, because the sandbox is the expensive part of any managed-agent platform.

The code below resumes the Fibonacci sandbox to plot a chart from the file written in turn one. Because the file persists, the agent does not regenerate the sequence; it just reads fibonacci.txt and renders chart.png.

Figure: Estimated cost per Managed Agents interaction. Sandbox compute is waived during preview; figures reflect Gemini 3.5 Flash token costs and tool usage only. Expect the curve to shift up once environment compute is metered at GA.

Cost scales with accumulated context, not turn count. Cap long sessions: when a thread crosses ~15-20 turns, summarize state into a fresh interaction rather than resuming, so you stop re-billing a bloated context window every turn.

# --- Turn 2: resume the SAME sandbox from the single-turn example ---
follow_up = client.interactions.create(
    agent="antigravity-preview-05-2026",
    previous_interaction_id=interaction.id,        # carries conversation context
    environment=interaction.environment_id,        # carries sandbox FILES + state
    input=(
        "Read fibonacci.txt, plot the sequence as a line chart, "
        "and save it as chart.png. Do NOT recompute the numbers."
    ),
)

print(follow_up.output_text)

# Persist the rolling handles so a later turn can resume again.
session = {
    "interaction_id": follow_up.id,
    "environment_id": follow_up.environment_id,  # same sandbox as turn 1
}
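One way to keep first-turn and resume calls consistent is to build the create() arguments in a single place. The sketch below is an assumption about how you might structure that. build_create_kwargs is a hypothetical helper, and it uses only the parameter names shown in this guide (agent, input, environment, previous_interaction_id).

```python
from typing import Any, Optional

AGENT_ID = "antigravity-preview-05-2026"


def build_create_kwargs(prompt: str, session: Optional[dict] = None) -> dict[str, Any]:
    """Return keyword arguments for client.interactions.create().

    With no saved session, request a fresh sandbox. With one, pass both
    handles so conversation context and sandbox files carry over.
    """
    kwargs: dict[str, Any] = {"agent": AGENT_ID, "input": prompt}
    if session is None:
        kwargs["environment"] = "remote"
    else:
        kwargs["previous_interaction_id"] = session["interaction_id"]
        kwargs["environment"] = session["environment_id"]
    return kwargs
```

A caller would then write client.interactions.create(**build_create_kwargs(prompt, session)) and never hand-assemble the two handles, which removes the easy mistake of passing one without the other.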

How do I override the default tools?

Pass a tools list to create() to restrict the agent to a subset of {code_execution, google_search, url_context}; omitting the parameter enables all three. This is a security and cost control, not just a tidiness feature. An agent that cannot execute code cannot exfiltrate data through a shell, and an agent without google_search cannot rack up search-tool charges on a task that only needs to read one known URL.

In this example, I lock a research agent down to search and URL fetching only, with no code execution. The agent can read the web and summarize, but it cannot run a shell inside the sandbox.

Two limits to internalize before you design around this agent. First, generation config is mostly off-limits: temperature, top_p, top_k, stop_sequences, and max_output_tokens are unsupported, so you cannot dial determinism the usual way. Second, structured outputs are not supported, and inputs are restricted to text and images, with no audio, video, or document inputs. If you need a strict JSON schema back, parse it out of output_text yourself or post-process with a separate Gemini call.
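Parsing JSON out of free-form text can be sketched with the standard library alone. The helper below is hypothetical (extract_json is not part of any SDK) and assumes you have prompted the agent to include a JSON object or array somewhere in its answer. It returns the first span that parses and does no schema validation.

```python
import json
from typing import Any


def extract_json(text: str) -> Any:
    """Return the first JSON object or array embedded in free-form text.

    Tries json.JSONDecoder.raw_decode at each '{' or '[' so surrounding
    prose is ignored. Raises ValueError if nothing parses.
    """
    decoder = json.JSONDecoder()
    for index, char in enumerate(text):
        if char in "{[":
            try:
                value, _end = decoder.raw_decode(text, index)
            except json.JSONDecodeError:
                continue  # not valid JSON from here; try the next bracket
            return value
    raise ValueError("no JSON object or array found in text")


# Stand-in for interaction.output_text
sample = 'Sure! Result: {"papers": 3, "ok": true} Let me know if you need more.'
print(extract_json(sample))  # {'papers': 3, 'ok': True}
```

Validate the parsed value against your own schema before trusting it; the agent gives no guarantee about shape.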

Pros

Inline config (tools=, input=) is perfect for one-off scripts and experiments
Saved agents give you a stable ID, versioned system_instruction, and mounted skills
Saved agents let ops own the agent definition while app code just calls the ID
Inline keeps everything in one file with zero setup ceremony

Cons

Inline config drifts: every caller can pass different tools and instructions
Inline cannot mount AGENTS.md or a skills repo as a base environment
Saved agents add a create/update lifecycle you must manage and version
Saved agents can hide cost-relevant tool choices from the calling code

# Research-only agent: web access, NO code execution
research = client.interactions.create(
    agent="antigravity-preview-05-2026",
    input=(
        "Find the three most-cited 2026 papers on agentic reasoning "
        "and summarize each in two sentences with a source URL."
    ),
    environment="remote",
    tools=[
        {"type": "google_search"},
        {"type": "url_context"},
        # {"type": "code_execution"},  # deliberately omitted: no shell
    ],
)

print(research.output_text)
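If several call sites need restricted tools, a small validator keeps typos from reaching the API. This is a sketch under assumptions: build_tools is a hypothetical helper, the three tool names and the {"type": ...} shape are the ones used in this guide, and rejecting an empty allowlist is a design choice made here because the guide does not say how the API treats an empty list.

```python
KNOWN_TOOLS = frozenset({"code_execution", "google_search", "url_context"})


def build_tools(allowed: set[str]) -> list[dict[str, str]]:
    """Turn a set of tool names into the tools list passed to create().

    Raises ValueError on unknown names or an empty allowlist, so a typo
    fails locally instead of at request time.
    """
    if not allowed:
        raise ValueError("allowlist is empty; name at least one tool")
    unknown = set(allowed) - KNOWN_TOOLS
    if unknown:
        raise ValueError(f"unknown tool(s): {sorted(unknown)}")
    # Sorted for a stable, diff-friendly order in logs and tests.
    return [{"type": name} for name in sorted(allowed)]


print(build_tools({"url_context", "google_search"}))
# [{'type': 'google_search'}, {'type': 'url_context'}]
```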

How do I save a reusable managed agent with AGENTS.md and SKILL.md?

Use client.agents.create() to bake a base_agent, a system_instruction, and a base_environment that mounts AGENTS.md plus a skills repo, then invoke it later by ID like any built-in agent. This is the save-as-managed-agent workflow, and it is the dividing line between a demo and something you deploy. Inline config is fine for a script; a versioned agent ID is what an application actually calls.

The base_environment.sources array is the powerful part. You can mount inline content directly to a path like .agents/AGENTS.md (operating instructions the agent always reads) and pull an entire skills repository to .agents/skills so the agent inherits reusable SKILL.md-defined capabilities on every run. AGENTS.md is your standing policy; SKILL.md files are composable procedures. Together they replace the prompt-stuffing most people do by hand.

Once created, the agent is addressable by its id. Your application code no longer carries instructions or tool lists; it just names the agent and passes input. That separation matters: your platform team owns the agent definition and its skills, while product code stays a one-liner. It also means you can update the agent’s behavior centrally without redeploying every caller.

# 1) Define a reusable managed agent ONCE (ops / platform side)
client.agents.create(
    id="report-analyst",
    base_agent="antigravity-preview-05-2026",
    system_instruction=(
        "You are a data analysis agent. Generate sequences, visualize "
        "them, and export results as a PDF report."
    ),
    base_environment={
        "type": "remote",
        "sources": [
            {
                "type": "inline",
                "target": ".agents/AGENTS.md",
                "content": (
                    "Always include a chart and a summary table. "
                    "Cite every number. Never fabricate data."
                ),
            },
            {
                "type": "repository",
                "source": "https://github.com/your-org/skills",
                "target": ".agents/skills",  # SKILL.md files mounted here
            },
        ],
    },
)

# 2) Invoke it anywhere by ID (application side) — one line
result = client.interactions.create(
    agent="report-analyst",
    input="Generate the first 50 primes, plot their distribution, export a PDF.",
    environment="remote",
)
print(result.output_text)

Treat AGENTS.md as policy and SKILL.md as procedure. Standing rules (cite sources, never fabricate, output format) belong in AGENTS.md; reusable how-to capabilities belong in versioned SKILL.md files.

How do the Antigravity SDK and agy CLI map to the Interactions API?

The Antigravity SDK and the agy CLI are ergonomic wrappers over the same Interactions API you just used; agy agents init scaffolds an agent definition, agy agents test runs it against the managed harness, and agy agents create registers it, all resolving to client.agents.create() and client.interactions.create() under the hood. Announced at I/O 2026 alongside the standalone Antigravity 2.0 desktop app, the SDK ships as a Python library and agy (written in Go) replaces the old Gemini CLI.

This matters for a practical reason: Gemini CLI and the Gemini Code Assist IDE extensions stop serving requests on June 18, 2026. If your workflow depends on the old CLI, you migrate to agy. The good news is that everything you learned in this tutorial transfers directly, because the CLI is a thin layer over the raw create() calls.

My recommendation: learn the Interactions API first, then adopt the CLI for scaffolding and CI. The CLI is excellent for project setup and a quickstart loop (init, test, create, iterate), but when you need custom billing guards, per-call tool overrides, or to inspect interaction.steps programmatically, you drop back to the SDK. The two are not competing; they are the same surface at different altitudes.

A typical agy flow looks like the block below. Conceptually, each command corresponds to an API call you have already seen in this guide.

If you are weighing Google’s stack against alternatives, our comparison of the Claude Agent SDK, OpenAI Agents SDK, and Google ADK puts the Managed Agents API in context: it is the most managed of the three, trading control for the fact that the entire agentic loop and sandbox live on Google’s infrastructure.

Gemini CLI and Gemini Code Assist IDE extensions stop serving requests on June 18, 2026. Move scripts and CI that shell out to gemini over to agy before then, or your pipelines break silently.

# Scaffold a new managed-agent project (writes AGENTS.md, SKILL.md, config)
agy agents init report-analyst

# Test the agent against the managed harness before registering it
agy agents test report-analyst --input "Summarize Q2 sales from data.csv"

# Register it as a reusable managed agent (≈ client.agents.create)
agy agents create report-analyst

# Note: the legacy Gemini CLI stops serving requests on 2026-06-18 — use agy.

Production checklist: persistence, cost, and observability

Worth it for managed sandboxes, but own your IDs and your budget

The Gemini Managed Agents API removes the hardest parts of running an agent: the sandbox, the agentic loop, and the tool glue. For most teams that is a real win over standing up containers and a planner-executor by hand. The catches are entirely operational, not technical: persist environment_id or lose your session, cap accumulated tokens or watch the bill climb, and assume sandbox compute gets metered at GA. Nail those three production realities and antigravity-preview-05-2026 is the fastest path from idea to a running agent in Python today.

Before you ship a Gemini managed agent, lock down four things the docs gloss over: durable storage of interaction.id and environment_id, a per-session token budget, structured logging of interaction.steps, and a tool allowlist on every call. Skip any of these and the failure shows up in production, not in your demo.

Persistence: the environment_id is your sandbox handle and it is not recoverable after the create() response. Write both IDs to durable storage (Redis, Postgres, wherever your session state lives) the instant you get them, keyed by your own conversation ID. Treat them like a database connection string, not a log line.
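As a minimal sketch of that durable store, the class below uses SQLite from the standard library as a stand-in for Redis or Postgres. SessionStore, its table layout, and the file name are all hypothetical; the only facts taken from this guide are that you need both IDs and that they should be keyed by your own conversation ID.

```python
import sqlite3
from typing import Optional


class SessionStore:
    """Maps your own conversation ID to the latest interaction and sandbox IDs."""

    def __init__(self, path: str = "sessions.db") -> None:
        self._db = sqlite3.connect(path)
        self._db.execute(
            "CREATE TABLE IF NOT EXISTS sessions ("
            "conversation_id TEXT PRIMARY KEY, "
            "interaction_id TEXT NOT NULL, "
            "environment_id TEXT NOT NULL)"
        )
        self._db.commit()

    def save(self, conversation_id: str, interaction_id: str, environment_id: str) -> None:
        """Insert or overwrite the handles for one conversation."""
        self._db.execute(
            "INSERT OR REPLACE INTO sessions VALUES (?, ?, ?)",
            (conversation_id, interaction_id, environment_id),
        )
        self._db.commit()

    def load(self, conversation_id: str) -> Optional[dict]:
        """Return the saved handles, or None if the conversation is unknown."""
        row = self._db.execute(
            "SELECT interaction_id, environment_id FROM sessions "
            "WHERE conversation_id = ?",
            (conversation_id,),
        ).fetchone()
        if row is None:
            return None
        return {"interaction_id": row[0], "environment_id": row[1]}
```

Call save() immediately after every create() response, before doing anything else with the result, so a crash in your own post-processing cannot orphan the sandbox.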

Cost governance: because each turn re-bills the session context window, set a hard turn cap and a token ceiling per session. When you approach the cap, summarize the conversation into a single fresh interaction instead of resuming. This is the single biggest lever on your bill, and none of the hello-world tutorials mention it.
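A turn cap and token ceiling can be sketched as a small counter that your calling code consults before each resume. SessionBudget and its default limits are hypothetical, and the sketch assumes you can obtain a per-turn token count from the response; this guide does not name the field that carries it, so the caller passes the number in.

```python
from dataclasses import dataclass


@dataclass
class SessionBudget:
    """Tracks turns and tokens for one session against hard limits."""

    max_turns: int = 15          # hypothetical default, tune per workload
    max_tokens: int = 1_000_000  # hypothetical default, tune per workload
    turns: int = 0
    tokens: int = 0

    def record(self, tokens_used: int) -> None:
        """Account for one completed turn."""
        self.turns += 1
        self.tokens += tokens_used

    def should_restart(self) -> bool:
        """True once either limit is reached: summarize and start fresh."""
        return self.turns >= self.max_turns or self.tokens >= self.max_tokens


budget = SessionBudget(max_turns=3, max_tokens=10_000)
budget.record(4_000)
print(budget.should_restart())  # False
budget.record(7_000)
print(budget.should_restart())  # True
```

When should_restart() returns True, the calling code would ask the agent for a state summary and open a new interaction with that summary as input rather than resuming.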

Observability and safety: log interaction.steps on every call so you can reconstruct what the agent ran inside the sandbox, and pass an explicit tools list rather than relying on the all-three default. An agent that does not need code_execution should not have it, both to cut the attack surface and to keep tool charges off tasks that only need a single URL fetch.
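Structured step logging can be as simple as one JSON line per step. The sketch below is an assumption about shape, not an SDK feature: step_records and log_steps are hypothetical helpers, and the only step attribute relied on is type, which is the one this guide's examples read. Extend the record with other fields once you have confirmed what the step objects expose.

```python
import json
import logging
from typing import Any, Iterable

logger = logging.getLogger("agent.steps")


def step_records(interaction_id: str, steps: Iterable[Any]) -> list[dict]:
    """Build one JSON-serializable record per step of an interaction."""
    return [
        {
            "interaction_id": interaction_id,
            "index": index,
            # Fall back to "unknown" rather than crash on an unexpected shape.
            "type": getattr(step, "type", "unknown"),
        }
        for index, step in enumerate(steps)
    ]


def log_steps(interaction_id: str, steps: Iterable[Any]) -> None:
    """Emit each step as a single JSON line at INFO level."""
    for record in step_records(interaction_id, steps):
        logger.info(json.dumps(record))
```

In application code this would be called as log_steps(interaction.id, interaction.steps) right after each create() call.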

Concern	What the docs show	What production needs
Session resume	previous_interaction_id in one script	Durable store of id + environment_id keyed by your conversation ID
Billing	“based on tokens”	Per-session token ceiling + turn cap; summarize-and-restart past ~15-20 turns
Sandbox compute	Free in preview	Budget for metered CPU/memory at GA; it is the expensive part
Tools	All three enabled by default	Explicit allowlist per call to cut cost and attack surface
Observability	print(interaction.steps)	Structured logging of every step for post-hoc debugging
Output shape	output_text print	No structured outputs supported; parse JSON out of text yourself

Production realities the incumbent tutorials skip

Builder’s take

I build agent orchestration for a living at Cyntr, so I read every managed-agent launch through one lens: what does it cost, and what breaks at turn 50? Here is what the tutorials won’t tell you about Gemini’s Managed Agents.

The interaction is cheap; the environment is the moat. Whoever owns environment_id owns your state, your latency, and eventually your bill once preview compute starts metering.
Token accumulation is the silent killer. A 30-turn session re-bills the whole session context window each turn, so a chatty agent can quietly hit 3-5M tokens before anyone notices.
save-as-managed-agent is the only thing that scales past a demo. Inline config is fine for a script; a versioned agent ID with mounted AGENTS.md is what you actually deploy.
The agy CLI and Antigravity SDK are sugar over the same Interactions API. Learn the raw create() call first and the CLI becomes obvious; learn the CLI first and you’ll be lost the moment you need custom billing controls.

Frequently asked questions

What is the agent ID for the Gemini Managed Agents API?

The general-purpose managed agent is antigravity-preview-05-2026, built on Gemini 3.5 Flash. You pass it as the agent argument to client.interactions.create(). It can reason, execute Python/Node/Bash, manage files, and browse the web inside a Google-hosted Linux sandbox, with Code Execution, Google Search, and URL Context enabled by default.

How much does a Gemini Managed Agents API interaction cost?

Per Google’s guidance, a single interaction typically costs between $0.25 and $3.25 depending on complexity, with heavy agentic runs reaching about $5 when they accumulate 3-5 million tokens. You pay for Gemini model tokens and tool usage; 50-70% of input tokens are usually cached. Sandbox compute (CPU, memory, execution) is not billed during the preview period.

How do I keep a multi-turn session in the same sandbox?

Store interaction.id and interaction.environment_id from the create() response, then pass previous_interaction_id and environment back on the next call. The previous_interaction_id carries conversation context and the environment_id carries the sandbox’s files and state, so work from earlier turns persists. There is no way to recover these IDs later, so save them immediately.

How do I override the default tools in a managed agent?

Pass a tools list to create(), for example tools=[{"type": "google_search"}, {"type": "url_context"}]. Omitting the parameter enables all three defaults: code_execution, google_search, and url_context. Restricting tools is both a cost control and a security control, since an agent without code_execution cannot run a shell in the sandbox.

What is the difference between inline config and a saved managed agent?

Inline config passes tools and input on each create() call and suits one-off scripts. A saved managed agent, created with client.agents.create(), bakes in a base_agent, system_instruction, and a base_environment that can mount AGENTS.md and a skills repo. You then invoke it by ID, so application code stays a one-liner while ops owns the agent definition and versioning.

Is the Antigravity SDK the same as the Interactions API?

The Antigravity SDK and the agy CLI are wrappers over the same Interactions API. Commands like agy agents init, agy agents test, and agy agents create resolve to client.agents.create() and client.interactions.create() under the hood. Learn the raw API first; the CLI is best for scaffolding and CI. Note that the legacy Gemini CLI stops serving requests on June 18, 2026.

Primary sources

Managed Agents Quickstart — Google AI for Developers
Antigravity Agent reference — Google AI for Developers
Introducing Managed Agents in the Gemini API — Google
Transitioning Gemini CLI to Antigravity CLI — Google Developers Blog
I/O ’26 news for agent developers on Google Cloud — Google Cloud Blog
Gemini Developer API pricing — Google AI for Developers

Last updated: June 6, 2026. Related: Agent Infrastructure.
