Attack Prevention Guides
Grouped by attack class. Each entry names the exact surface an attack exploits in a given framework, shows vulnerable code, and hands you the fix. Not sure where to start? Begin with prompt injection: it appears in 14% of production sessions and is the gateway to most other attacks.
LangChain
How to Prevent Prompt Injection in LangChain Agents
LangChain agents are uniquely vulnerable to prompt injection because they combine LLM reasoning with tool access and document retrieval. A single injected instruction in a retrieved document can cause the agent to execute arbitrary tool calls, leak data, or ignore its system prompt entirely.
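One mitigation is to scan retrieved documents before they ever reach the agent's context window. The sketch below is pattern-matching only; a production scanner would layer semantic and LLM-based checks on top. The function name and regex list are illustrative, not a LangChain or Rune API:

```python
import re

# Illustrative injection phrasings; real scanners use far richer pattern sets.
INJECTION_PATTERNS = [
    re.compile(r"ignore (all )?(previous|prior) instructions", re.I),
    re.compile(r"disregard (the |your )?(system prompt|instructions)", re.I),
]

def scan_retrieved_docs(docs: list[str]) -> list[str]:
    """Drop retrieved documents containing injection phrasing before
    they are stuffed into the agent's prompt."""
    clean = []
    for doc in docs:
        if any(p.search(doc) for p in INJECTION_PATTERNS):
            continue  # in production, quarantine and log instead of silently dropping
        clean.append(doc)
    return clean
```

Hook a filter like this into your retriever's post-processing step so the agent only ever sees vetted text.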
OpenAI
How to Prevent Prompt Injection in OpenAI Agents
OpenAI function calling agents are vulnerable to prompt injection attacks that manipulate the model into generating malicious function calls. Unlike simple chatbots, function calling agents execute real actions — making injection attacks operational, not just informational.
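The core defense is to validate every model-generated call before anything executes. A minimal sketch, assuming a hypothetical allowlist of function names and permitted argument keys (the function names here are invented for illustration):

```python
import json

# Hypothetical allowlist: function name -> permitted argument keys.
ALLOWED_FUNCTIONS = {
    "lookup_order": {"order_id"},
    "send_receipt": {"order_id", "email"},
}

def validate_function_call(name: str, arguments_json: str) -> dict:
    """Reject calls the model shouldn't make, before execution.
    Raises ValueError on unknown functions or unexpected arguments."""
    if name not in ALLOWED_FUNCTIONS:
        raise ValueError(f"function not allowlisted: {name}")
    args = json.loads(arguments_json)
    extra = set(args) - ALLOWED_FUNCTIONS[name]
    if extra:
        raise ValueError(f"unexpected arguments: {extra}")
    return args
```

Run this on each entry in the model's tool-call output; anything that fails validation never reaches your execution layer.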
Anthropic Claude
How to Prevent Prompt Injection in Claude Agents
Claude's 200K context window and sophisticated tool use make it a powerful agent backbone, but the large context window is a double-edged sword for security. Attackers have 200K tokens of space to hide injection attempts — far more than any human reviewer can check manually.
CrewAI
How to Prevent Prompt Injection in CrewAI Workflows
CrewAI's multi-agent architecture compounds prompt injection risk. When an injection compromises one agent, its manipulated output flows to the next agent as trusted input, cascading the attack through the entire crew.
MCP
How to Prevent Prompt Injection via MCP Servers
MCP servers are a novel injection vector. When your agent calls an MCP tool, the server controls the response — and a malicious or compromised server can return tool results containing injected instructions that hijack your agent's behavior.
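A practical mitigation is to filter tool results before appending them to the conversation. The sketch below drops instruction-like lines from a result; the pattern set is illustrative, and real scanning should combine pattern, semantic, and LLM-based checks:

```python
import re

# Illustrative phrasings an injected tool result might contain.
INSTRUCTION_LIKE = re.compile(
    r"(ignore (all )?(previous|prior) instructions"
    r"|disregard the system prompt"
    r"|you must now)",
    re.I,
)

def filter_mcp_result(result_text: str) -> str:
    """Strip instruction-like lines from an MCP tool result before
    the agent ever sees it."""
    kept = [line for line in result_text.splitlines()
            if not INSTRUCTION_LIKE.search(line)]
    return "\n".join(kept)
```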
LangChain
How to Prevent Data Exfiltration in LangChain Agents
LangChain agents with tool access can read databases, files, and APIs — and then send that data anywhere. Data exfiltration through AI agents is harder to detect than traditional exfiltration because it happens through natural language interactions rather than network traffic patterns.
OpenAI
How to Prevent Data Exfiltration in OpenAI Agents
OpenAI function calling agents can be manipulated into extracting and transmitting sensitive data through function parameters. The model generates both the function name and arguments, giving an attacker control over where data goes.
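A middleware-level check can inspect the arguments of every outbound call for secret-shaped values before the call executes. The patterns below (an OpenAI-style key prefix, a US SSN shape, an email address) are illustrative; a real scanner would use a much broader detector set:

```python
import json
import re

# Illustrative secret/PII shapes; extend for your environment.
SECRET_PATTERNS = [
    re.compile(r"sk-[A-Za-z0-9]{20,}"),               # API-key-like token
    re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),             # US SSN shape
    re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+"),  # email address
]

def check_outbound_arguments(arguments_json: str) -> None:
    """Block a function call whose arguments carry sensitive data.
    Raises RuntimeError when a secret-shaped value is found."""
    payload = json.dumps(json.loads(arguments_json))  # normalize formatting
    for pattern in SECRET_PATTERNS:
        if pattern.search(payload):
            raise RuntimeError("sensitive data in outbound function call")
```

Note that some functions legitimately take fields like email addresses; in practice you scope checks per function rather than globally.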
Anthropic Claude
How to Prevent Data Exfiltration in Claude Agents
Claude's large context window means agents process massive amounts of data in a single turn. When that data includes sensitive information, the risk of exfiltration through tool calls or agent responses increases proportionally.
CrewAI
How to Prevent Data Exfiltration in CrewAI Workflows
CrewAI multi-agent workflows create unique data exfiltration risks. When agents pass data to each other, sensitive information can flow from a high-access agent to one with external communication tools — enabling data theft through agent-to-agent handoffs.
MCP
How to Prevent Data Exfiltration via MCP Servers
MCP servers receive tool call parameters from your agent — which may contain sensitive data. A malicious MCP server can silently log, store, or transmit any data your agent sends through its tools.
LangChain
How to Prevent Tool Manipulation in LangChain Agents
LangChain agents choose which tools to call and generate the parameters for each call. Tool manipulation attacks exploit this by causing the agent to call the wrong tools, pass malicious parameters, or chain tools in unauthorized sequences.
OpenAI
How to Prevent Tool Manipulation in OpenAI Agents
OpenAI function calling gives GPT-4 control over which functions to call and what parameters to pass. Tool manipulation attacks exploit this by causing the model to generate malicious function calls — with injected parameters, unauthorized function selections, or abusive parallel call patterns.
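Beyond allowlisting function names, you can constrain the values the model is allowed to pass. A sketch with invented tools and limits (a refund cap and a URL domain allowlist), checked before execution:

```python
from urllib.parse import urlparse

# Per-tool value constraints (illustrative): each returns True if args are safe.
CONSTRAINTS = {
    "issue_refund": lambda args: 0 < args["amount"] <= 100,
    "fetch_url": lambda args: urlparse(args["url"]).hostname == "api.example.com",
}

def enforce_constraints(tool: str, args: dict) -> dict:
    """Apply per-tool value checks before a call executes; anything
    unknown or out of bounds is blocked."""
    check = CONSTRAINTS.get(tool)
    if check is None or not check(args):
        raise PermissionError(f"blocked tool call: {tool} {args}")
    return args
```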
CrewAI
How to Prevent Tool Manipulation in CrewAI Workflows
CrewAI crews combine multiple agents, each with its own tools. Tool manipulation attacks exploit the interaction between agents' tool sets: one agent's legitimate tool use can set up an attack executed by another agent's tools.
MCP
How to Prevent Tool Manipulation via MCP Servers
MCP servers define which tools are available to your agent. A malicious or compromised MCP server can advertise dangerous tools with innocent-sounding names, alter tool behavior between sessions, or execute unauthorized actions when tools are called.
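Pinning tool definitions defends against servers that change tool behavior between sessions (the "rug pull"). The sketch hashes the advertised definitions at first connect and verifies them on every subsequent session; the helper names are invented for illustration:

```python
import hashlib
import json

def fingerprint_tools(tool_defs: list[dict]) -> str:
    """Hash the server's advertised tool definitions so any change to a
    name, description, or schema is detectable later."""
    canonical = json.dumps(tool_defs, sort_keys=True)
    return hashlib.sha256(canonical.encode()).hexdigest()

def verify_tools(tool_defs: list[dict], pinned: str) -> None:
    """Refuse to proceed if the definitions no longer match the pin."""
    if fingerprint_tools(tool_defs) != pinned:
        raise RuntimeError("MCP tool definitions changed since they were pinned")
```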
LangChain
How to Prevent Privilege Escalation in LangChain Agents
LangChain agents can be manipulated into accessing tools, data, or capabilities beyond their intended permissions. Privilege escalation in agents is different from traditional systems — the 'privilege' is which tools the agent can use, which data it can access, and what actions it can take.
OpenAI
How to Prevent Privilege Escalation in OpenAI Agents
OpenAI function calling agents can be manipulated into calling functions they shouldn't have access to. If your function list includes both user-level and admin-level functions, an injection can cause the model to call admin functions on behalf of a regular user.
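The fix is to enforce the caller's role outside the model: even if an injection makes the model emit an admin call, it never executes for a regular user. A minimal sketch with an invented role-to-function map:

```python
# Hypothetical role -> functions that may execute on that role's behalf.
ROLE_FUNCTIONS = {
    "user":  {"get_profile", "list_orders"},
    "admin": {"get_profile", "list_orders", "delete_user", "export_all"},
}

def authorize(role: str, function_name: str) -> None:
    """Gate each function call on the authenticated caller's role,
    independent of what the model asked for."""
    if function_name not in ROLE_FUNCTIONS.get(role, set()):
        raise PermissionError(f"{role} may not call {function_name}")
```

The key design choice: the role comes from your authentication layer, never from the conversation, so the model cannot talk its way into it.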
LangChain
How to Enforce Security Policies in LangChain Agents
LangChain agents without policy enforcement operate on the honor system — trusting the LLM to follow its system prompt. Security policies need to be enforced at the infrastructure level, not in the prompt. Rune's YAML policy engine lets you define rules that are enforced at the middleware layer, independent of the model's behavior.
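Rune's actual policy schema isn't reproduced here; the sketch below shows the general shape of middleware-level, first-match-wins rule evaluation, with an invented rule format mirroring what a YAML policy might express:

```python
# Invented policy shape (a stand-in for a YAML policy file): rules are
# evaluated top to bottom, first match wins, default deny.
POLICY = {
    "rules": [
        {"match": {"tool": "send_email"}, "action": "deny"},
        {"match": {"tool": "*"},          "action": "allow"},
    ]
}

def evaluate(tool: str) -> str:
    """Decide allow/deny at the middleware layer, before the tool runs,
    regardless of what the model intended."""
    for rule in POLICY["rules"]:
        target = rule["match"]["tool"]
        if target == "*" or target == tool:
            return rule["action"]
    return "deny"  # default-deny when no rule matches
```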
OpenAI
How to Enforce Security Policies in OpenAI Agents
OpenAI's system prompt gives you soft guidance over model behavior, but it can't enforce security policies. Rune's policy engine adds infrastructure-level enforcement — rules that are applied consistently regardless of what the model decides to do.
LangChain
How to Detect Behavioral Anomalies in LangChain Agents
Not all attacks trigger pattern-based detection. Sophisticated attacks cause agents to behave subtly differently — using tools slightly more than usual, accessing data in unusual patterns, or producing responses with different characteristics. Behavioral anomaly detection catches what pattern matching misses.
OpenAI
How to Detect Behavioral Anomalies in OpenAI Agents
OpenAI function calling agents develop predictable patterns — which functions they call, how often, and with what parameters. Behavioral anomaly detection monitors these patterns and alerts when something changes, catching sophisticated attacks that bypass pattern-based detection.
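The simplest version of this idea is a statistical baseline over per-session call counts. The sketch below computes a z-score for one function's usage against recent history; names and the threshold choice are illustrative:

```python
from collections import Counter
from statistics import mean, pstdev

def anomaly_score(history: list[Counter], current: Counter, fn: str) -> float:
    """Z-score of the current session's call count for one function
    against the per-session baseline; higher means more anomalous."""
    counts = [c.get(fn, 0) for c in history]
    mu, sigma = mean(counts), pstdev(counts)
    if sigma == 0:
        # No historical variance: any deviation at all is anomalous.
        return 0.0 if current.get(fn, 0) == mu else float("inf")
    return abs(current.get(fn, 0) - mu) / sigma
```

In practice you would alert above some threshold (say, a score of 3) and track parameters and call ordering as well as raw counts.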
Which guide should I start with?
Start with the prompt injection guide for your framework. Prompt injection is the most common attack (14% of sessions) and the gateway to most other attacks. Once you've secured against injection, move on to the data exfiltration and tool manipulation guides.
Do I need framework-specific guides, or is generic security enough?
Framework-specific protection is significantly more effective. Each framework has unique attack surfaces: LangChain's callback system, CrewAI's inter-agent communication, MCP's tool protocol. Generic security misses framework-specific vulnerabilities that attackers actively exploit.
Can I use these guides without Rune?
Yes: each guide includes general prevention strategies that work independently. However, the detection code examples use Rune's SDK, because manually implementing multi-layer scanning (pattern matching, semantic analysis, and LLM judgment) takes significant engineering effort to build and maintain.
How often are new guides added?
We add guides whenever a new attack technique or framework integration is identified. The threat landscape for AI agents evolves rapidly; new injection techniques emerge monthly. Follow our blog for announcements of new guides and threat reports.
Same engine, free tier.
Install, wrap, scan. 10,000 events/month.