MCP Security in 2026: Locking Down Tool Poisoning

A hands-on playbook for defending against tool poisoning, rug pulls, and registry attacks, mapped to the OWASP MCP Top 10.

Contents

What is MCP security and why tool poisoning changed the threat model

MCP security is the practice of protecting the connection between an AI agent and the Model Context Protocol servers that supply its tools, and in 2026 its defining threat is tool poisoning, where a malicious server hijacks the agent through a tool description it never even has to call. That last clause is the whole game. Traditional appsec assumes code runs before it does damage; MCP breaks that assumption because a tool description is loaded straight into the model’s context the moment a server connects.

Invariant Labs documented the mechanic precisely: AI models see the full tool description while users see a simplified UI label, so attackers hide directives, often wrapped in tags like , that instruct the agent to read SSH keys or reroute data. The worst variant is shadowing, where a benign-looking tool from one server rewrites how the agent treats a trusted tool from another, for example forcing every email through the email tool to the attacker’s address. Strong MCP security has to assume the payload is already inside the context window.

The scale is not theoretical. The MCPTox benchmark, built on 45 live MCP servers and 353 real tools, generated 1,312 malicious test cases and found a 36.5% average attack success rate across 20 leading agents, with o1-mini hitting 72.8%. Refusal rates were dismal, the best performer (Claude-3.7-Sonnet) declined under 3% of attacks. Capable, instruction-following models were more susceptible, not less.

Developer reviewing AI agent tool connections on a terminal and dashboard screen — Image.

The agent reads the entire tool description; the human reads a one-line label. Any MCP security control that only inspects what the user sees is inspecting the wrong artifact. Hash and scan what the model actually ingests.

The OWASP MCP Top 10 mapped to real attacks

The OWASP MCP Top 10 is the first dedicated security framework for the Model Context Protocol, cataloguing the ten risk categories every agent operator should design against, from tool poisoning (MCP03) to shadow MCP servers (MCP09). Led by Vandana Verma Sehgal and still in beta, it gives MCP security a shared vocabulary the way the original OWASP Top 10 did for web apps.

It maps cleanly onto the incidents of the past year. MCP03 (tool poisoning) and MCP06 (prompt injection via contextual payloads) describe the Invariant shadowing attacks. MCP05 (command injection) was 2026’s largest CVE pattern, OX Security tied a single STDIO design flaw to 10+ critical CVEs. MCP09 (shadow MCP servers) and MCP04 (supply chain) describe the registry poisoning below. Use the table as a triage checklist, not bedtime reading.

“The dangerous moment in MCP is connection, not invocation. A poisoned tool wins the instant its description enters the context window.”
Surya Koritala, founder of Cyntr

Code	Risk	Primary defense
MCP01	Token mismanagement and secret exposure	Short-lived scoped tokens; never store secrets in agent context
MCP02	Privilege escalation via scope creep	Least-privilege scopes, automated expiry, periodic access reviews
MCP03	Tool poisoning	Pin and hash tool descriptions; re-prompt on any change
MCP04	Supply chain and dependency tampering	Signed components, SBOM tracking, block servers failing CI scans
MCP05	Command injection and execution	Parameterized calls, strict validation, sandboxed execution
MCP06	Prompt injection via contextual payloads	Treat all tool output as data; human approval for destructive actions
MCP07	Insufficient authentication and authorization	OAuth 2.1 with PKCE per server; reject token passthrough
MCP08	Lack of audit and telemetry	Log every invocation with parameters into a correlated SIEM
MCP09	Shadow MCP servers	Approved-server allowlist at the gateway; verify server identity
MCP10	Context injection and over-sharing	Field-level access control and DLP at the proxy

OWASP MCP Top 10 (2025, beta) with the primary defense for each

The registry supply chain is poisoned: what OX Security found

9 of 11

public MCP registries poisoned in OX Security testing

April 2026 disclosure

~200,000

MCP instances estimated vulnerable to the STDIO flaw

ecosystem-wide estimate

36.5%

average tool-poisoning attack success rate across 20 agents

MCPTox benchmark

In an April 15, 2026 disclosure, OX Security successfully poisoned 9 of 11 public MCP registries with a malicious proof-of-concept server and traced a single architectural flaw to as many as 200,000 vulnerable MCP instances. The registries accepted the submissions with essentially no vetting, which means “I got it from the official registry” is not a security statement in 2026.

The root cause is the STDIO execution model: any process command passed to an MCP server’s STDIO interface runs on the host whether or not it spins up a valid server. OX reported a supply chain with 150M+ downloads and 7,000+ publicly accessible servers, and issued 10+ critical and high CVEs from that one root cause, with live RCE demonstrated against frameworks including LiteLLM, LangChain, and IBM’s Langflow. Anthropic, per OX’s report, confirmed the STDIO behavior is by design, called it a secure default, and placed sanitization on developers, so the burden of MCP security sits with you, the operator.

Treat this as the supply-chain wake-up call it is. The registry is a discovery convenience, not a trust boundary. Every server you adopt needs independent verification before it touches a production agent, which is exactly what the hardening steps below enforce. Note that these counts are point-in-time, figures move as registries add vetting and CVEs are patched.

How to harden MCP servers: a step-by-step MCP security tutorial

To lock down MCP security, scan every server before adoption, pin its tool definitions by hash, run it behind an allowlisting gateway, scope its OAuth tokens to that one server, and require human approval for sensitive actions. These five controls map directly to MCP03, MCP04, MCP09, MCP07, and MCP06, and they compose, no single control is sufficient on its own.

Start with scanning. Invariant Labs’ open-source mcp-scan inspects installed servers and their tool descriptions for tool poisoning, rug pulls, cross-origin escalation (shadowing), and prompt injection, and its tool-pinning feature hashes each definition so a later change is detected instead of silently accepted, the exact failure behind the Cursor rug pull. Wire it into CI so a server cannot reach your allowlist without a clean scan.

Scanning catches known-bad descriptions, pinning catches drift, allowlisting catches rogue servers, scoped tokens contain blast radius, and human approval catches what slips through. Ship all five, the OWASP MCP Top 10 assumes you will.

# 1. Scan a server BEFORE you trust it (Invariant Labs mcp-scan)
pipx install mcp-scan
mcp-scan scan ~/.config/your-agent/mcp.json

# 2. Pin tool definitions so a rug pull is detected, not auto-accepted
mcp-scan scan --pin   # records a hash of every tool description
mcp-scan scan         # re-run later; any drift flags a CVE-2025-54136-style swap

# 3. Keep mcp-scan resident as a proxy that enforces guardrails at runtime
mcp-scan proxy

# 4. Gate it in CI: fail the build if any server changed or fails the scan
mcp-scan scan --json | jq -e '.[].issues | length == 0'

Step 1 — Scan and pin every tool definition (MCP03)

Run mcp-scan against your client config before connecting any new server, then pin. Pinning hashes each tool description; if the server later swaps in a malicious version, the hash mismatch surfaces it instead of letting it load silently. This is the direct antidote to the rug-pull pattern in CVE-2025-54136, where Cursor never re-validated an approved config.

Step 2 — Allowlist registries and servers at a gateway (MCP09, MCP04)

Route all MCP traffic through a gateway that checks each connection against a private registry of vetted servers. If a server is not on the approved list, the connection is refused. This neutralizes shadow servers and the poisoned-public-registry problem, your trust comes from your own review, not the upstream marketplace.

Step 3 — Scope tokens per server with OAuth 2.1 and RFC 8707 (MCP07, MCP01)

The MCP authorization spec (2026-03-15) makes RFC 8707 resource indicators mandatory: clients MUST send the canonical URI of the target server, and servers MUST validate they are the intended audience. That binds a token to one server so a malicious server cannot replay a token issued for another. Combine with PKCE and short-lived, narrowly scoped grants (orders:read, not a god token).

Step 4 — Require human approval for write, shell, and send actions (MCP06, MCP05)

Put a human-in-the-loop gate in front of any destructive or exfiltration-capable tool, file writes, shell execution, outbound email or payments. MCPTox showed alignment alone refuses poisoned instructions under 3% of the time, so approval gates and sandboxed, parameterized execution are what actually stop a poisoned description from acting.

Step 5 — Log every invocation and treat tool output as data (MCP08, MCP10)

JSON-RPC traffic does not fit standard SIEM rules, so log each invocation with its parameters and the agent’s decision, with correlation IDs, and pipe it to your SIEM. Treat all tool return values as untrusted data, never as instructions, and apply field-level access control plus DLP at the proxy to prevent over-sharing across sessions.

OAuth 2.1 and RFC 8707: the identity layer MCP security now requires

Under the 2026-03-15 MCP authorization specification, OAuth 2.1 plus RFC 8707 resource indicators are mandatory, every client must declare the exact server a token is for, and every server must verify it is the intended audience. This is the protocol-level fix for token confusion, and it is where MCP security stops being advice and becomes a spec requirement.

RFC 8707 works by carrying a resource parameter, the canonical URI of the target MCP server, in both the authorization and token requests. The authorization server mints a token audience-bound to that one server, so a malicious or compromised server cannot take a token it received and replay it against a different, trusted service. Pair it with PKCE on every flow and a firm rule to reject token passthrough, which the OWASP MCP Top 10 calls out under MCP07.

Identity and provenance are the connective tissue here: scoped, audience-bound tokens are what make least privilege (MCP02) and secret hygiene (MCP01) enforceable rather than aspirational. If your agent platform predates the 2026-03-15 spec, auditing your authorization flow for RFC 8707 compliance is the single highest-leverage MCP security upgrade you can make this quarter.

Audience-bound tokens are the difference between a stolen token being useless and being a skeleton key. RFC 8707 resource indicators are non-negotiable in 2026 MCP security.

A 30-day MCP security rollout plan

Treat connection as the attack, not invocation

MCP security in 2026 is decided before a tool ever runs. Scan and pin every tool definition, allowlist servers at a gateway instead of trusting registries, bind tokens to one server with OAuth 2.1 and RFC 8707, and gate sensitive actions behind a human, because benchmarks show models comply with poisoned instructions far more often than they refuse. Layer all five controls against the OWASP MCP Top 10 and you turn a 36% attack surface into a contained, auditable one. Figures and CVE counts move, so re-scan on a schedule.

The fastest path to defensible MCP security is to inventory and scan in week one, allowlist and pin in week two, enforce scoped OAuth 2.1 tokens in week three, and add approval gates plus logging in week four. Sequencing matters, you cannot allowlist servers you have not yet inventoried, and approval gates are noise until least-privilege scopes cut down what needs approving.

None of this requires ripping out your agent stack. mcp-scan is open source and runs against your existing client config, gateways like IBM ContextForge and the commercial options sit in front of servers you already use, and RFC 8707 support ships in the current spec. The work is operational discipline, not a rewrite. The teams that get breached in 2026 are the ones still trusting registries by default and re-approving changed configs without a second look.

Builder’s take

I run Cyntr, an agent orchestration runtime, so MCP servers are a first-class attack surface for me, not a thought experiment. The thing that took me a while to internalize is that the dangerous moment is connection, not invocation, a poisoned tool wins as soon as its description enters the context window.

Treat every tool description as untrusted input the moment it loads, pin it by hash, and refuse to silently re-approve a changed definition the way Cursor’s CVE-2025-54136 rug-pull did.
Run mcp-scan in CI against every server before it reaches an allowlist, in Cyntr I gate registry entries on a passing scan so a 9-of-11-poisoned-registries situation cannot reach production.
Scope tokens per server with RFC 8707 resource indicators, an orders:read agent should never hold a token a malicious server can replay against your email tool.
Put a human approval gate in front of any write, shell, or send action, MCPTox showed top models comply with poisoned instructions over 36% of the time and refuse under 3%, so do not trust alignment to save you.

Frequently asked questions

What is tool poisoning in MCP?

Tool poisoning is an attack where a malicious MCP server embeds hidden instructions in a tool’s description, which the AI model reads in full while the user sees only a simple label. Because the description enters the agent’s context the moment the server connects, the attack can hijack the agent without the poisoned tool ever being called.

Does a poisoned MCP tool have to be invoked to cause harm?

No. The most dangerous variant, shadowing, rewrites how the agent treats other trusted tools, so a malicious tool can reroute emails or exfiltrate data without being called itself. CVE-2025-54136 (MCPoison) showed a related rug pull, where an approved Cursor config was swapped for a malicious one and never re-validated.

What is the OWASP MCP Top 10?

The OWASP MCP Top 10 is the first dedicated security framework for the Model Context Protocol, listing ten risk categories from token mismanagement (MCP01) to context over-sharing (MCP10). It is currently in beta and gives teams a shared checklist for MCP security design and review.

How do I scan an MCP server for vulnerabilities?

Use Invariant Labs’ open-source mcp-scan, which inspects installed servers and their tool descriptions for tool poisoning, rug pulls, cross-origin escalation, and prompt injection. Its tool-pinning feature hashes each definition so any later change is detected rather than silently accepted, and you can run it in CI as a gate.

Why does MCP require OAuth 2.1 and RFC 8707?

The 2026-03-15 MCP authorization spec mandates OAuth 2.1 with RFC 8707 resource indicators so every token is bound to a specific server’s canonical URI. This prevents a malicious server from replaying a token issued for a different, trusted service, closing the token-confusion gap in earlier MCP deployments.

Are MCP registries safe to trust by default?

No. OX Security poisoned 9 of 11 public MCP registries with a malicious server in April 2026, showing most registries do little vetting. Treat registries as a discovery convenience only, and adopt servers solely through your own allowlist after an independent scan.

Primary sources

OWASP MCP Top 10 project — OWASP Foundation
Tool Poisoning Attacks security notification — Invariant Labs
The Mother of All AI Supply Chains: systemic MCP vulnerability — OX Security
Critical MCP flaw exposes ~200,000 AI servers — Tom’s Hardware
MCP Authorization specification (OAuth 2.1, RFC 8707) — Model Context Protocol
MCPTox: A Benchmark for Tool Poisoning Attack on Real-World MCP Servers — arXiv

Last updated: May 30, 2026. Related: Identity Provenance.

MCP Security in 2026: Locking Down Tool Poisoning

What is MCP security and why tool poisoning changed the threat model

The OWASP MCP Top 10 mapped to real attacks

The registry supply chain is poisoned: what OX Security found

How to harden MCP servers: a step-by-step MCP security tutorial

OAuth 2.1 and RFC 8707: the identity layer MCP security now requires

A 30-day MCP security rollout plan

Treat connection as the attack, not invocation

Builder’s take

Frequently asked questions

What is tool poisoning in MCP?

Does a poisoned MCP tool have to be invoked to cause harm?

What is the OWASP MCP Top 10?

How do I scan an MCP server for vulnerabilities?

Why does MCP require OAuth 2.1 and RFC 8707?

Are MCP registries safe to trust by default?

Primary sources

Leave a Reply Cancel reply

More Popular from Alatirok

Tokens Per Agentic Coding Task: The 2026 Variance Data

What Is Cognition Devin? The Enterprise Guide for 2026

What Is Circle Agent Stack? USDC Wallets for AI Agents

AI Agent Identity: Entra Agent ID vs Okta vs SailPoint

Why Does My AI Agent Context Window Fill Up So Fast?

Migrate OpenAI Agent Builder to Agents SDK Before Nov 30

Best Voice AI Agent Framework 2026: Vapi vs LiveKit vs Pipecat

Purpose-Built Legal AI vs General LLM: 2026 Verdict

Categories

Quick Links