A cascading failure in a multi-agent system occurs when a problem in one agent propagates to other agents, causing a chain reaction of failures. What starts as a single compromised or malfunctioning agent can bring down an entire system if agents pass data, instructions, or results to each other without safety checks.
In a multi-agent system, agents collaborate by exchanging messages, passing tool results, or delegating tasks. When Agent A sends output to Agent B, Agent B trusts that output. If Agent A is compromised (by prompt injection, for example), its output may contain:
Agent B then passes its corrupted output to Agent C, and the failure propagates.
Each agent did what it was told. The failure cascaded because no agent verified what the previous agent gave it.
Inter-agent content scanning: Scan messages between agents just like you scan user input. If Agent A sends data to Agent B, Agent B's safety layer scans it for injection and content threats.
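As a rough illustration of scanning at the receiving side, the sketch below checks a message before Agent B acts on it. The `AgentMessage` shape, the `scanInterAgentMessage` and `receive` helpers, and the regex list are all hypothetical; a real deployment would call a proper injection detector instead of a few patterns.

```typescript
// Hypothetical message envelope; field names are illustrative only.
interface AgentMessage {
  from: string;
  to: string;
  content: string;
}

interface ScanResult {
  allowed: boolean;
  reasons: string[];
}

// Toy patterns standing in for a real injection detector (assumption).
const SUSPICIOUS_PATTERNS: RegExp[] = [
  /ignore (all )?(previous|prior) instructions/i,
  /you are now/i,
  /reveal (your )?(system prompt|secrets?)/i,
];

// Returns every pattern the message content matched.
function scanInterAgentMessage(message: AgentMessage): ScanResult {
  const reasons = SUSPICIOUS_PATTERNS
    .filter((pattern) => pattern.test(message.content))
    .map((pattern) => `matched ${pattern.source}`);
  return { allowed: reasons.length === 0, reasons };
}

// Agent B would call this before acting on anything another agent sent.
function receive(message: AgentMessage): string {
  const result = scanInterAgentMessage(message);
  if (!result.allowed) {
    throw new Error(
      `Rejected message from ${message.from}: ${result.reasons.join("; ")}`,
    );
  }
  return message.content;
}
```

The point of the design is that the check lives with the receiver, so it still runs when the sender is the compromised party.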
Per-agent policies: Each agent should have its own policy with its own tool restrictions. A compromised agent cannot escalate privileges through another agent if that agent has a restrictive policy.
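One way to picture per-agent policies is a deny-by-default tool allowlist keyed on the agent that executes the tool. Everything in this sketch (the `ToolName` values, the `AgentPolicy` shape, the `canUseTool` helper) is a hypothetical illustration, not the policy format of any particular library.

```typescript
// Hypothetical tool names for illustration.
type ToolName = "search" | "read_file" | "send_email" | "delete_record";

interface AgentPolicy {
  allowedTools: ReadonlySet<ToolName>;
}

// Each agent gets its own policy; no policy is shared or inherited.
const policies = new Map<string, AgentPolicy>([
  ["agent-a", { allowedTools: new Set<ToolName>(["search", "read_file"]) }],
  ["agent-b", { allowedTools: new Set<ToolName>(["send_email"]) }],
]);

// The check is keyed on the agent that runs the tool, not on whoever asked.
// Agents without a policy get no tools at all (deny by default).
function canUseTool(agentId: string, tool: ToolName): boolean {
  const policy = policies.get(agentId);
  return policy !== undefined && policy.allowedTools.has(tool);
}
```

Under this arrangement, a compromised `agent-a` that asks `agent-b` to delete a record gets nowhere, because `agent-b` has no such tool in its own policy.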
Circuit breakers: If an agent starts producing errors or triggering policy violations at an unusual rate, stop accepting its output. This prevents a malfunctioning agent from corrupting others.
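A minimal sketch of such a breaker, assuming a simple sliding-window failure count and a fixed cooldown. The `AgentCircuitBreaker` class and its thresholds are hypothetical; timestamps are passed in explicitly so the behavior is easy to follow.

```typescript
// Hypothetical per-agent circuit breaker: opens after too many failures
// inside a time window, then closes again after a cooldown.
class AgentCircuitBreaker {
  private readonly maxFailures: number;
  private readonly windowMs: number;
  private readonly cooldownMs: number;
  private failureTimestamps: number[] = [];
  private openedAt: number | null = null;

  constructor(maxFailures: number, windowMs: number, cooldownMs: number) {
    this.maxFailures = maxFailures;
    this.windowMs = windowMs;
    this.cooldownMs = cooldownMs;
  }

  // Call whenever the agent errors or triggers a policy violation.
  recordFailure(now: number = Date.now()): void {
    // Keep only failures that are still inside the window.
    this.failureTimestamps = this.failureTimestamps.filter(
      (t) => now - t < this.windowMs,
    );
    this.failureTimestamps.push(now);
    if (this.failureTimestamps.length >= this.maxFailures) {
      this.openedAt = now; // open the breaker
    }
  }

  // Downstream agents check this before accepting the agent's output.
  acceptsOutput(now: number = Date.now()): boolean {
    if (this.openedAt === null) return true;
    if (now - this.openedAt >= this.cooldownMs) {
      // Cooldown elapsed: close the breaker and start counting afresh.
      this.openedAt = null;
      this.failureTimestamps = [];
      return true;
    }
    return false;
  }
}
```

For example, `new AgentCircuitBreaker(3, 1000, 5000)` would stop accepting an agent's output after three failures within one second and resume five seconds after the breaker opened.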
Cross-agent tracing: Track actions across agent boundaries. When Agent A delegates to Agent B, link their receipt chains. This lets you trace the full path of a cascading failure during investigation.
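// `createGuard` and `policy` are assumed to be imported or defined earlier.
const guard = createGuard({
  policy,
  // Context that lets actions be linked across agent boundaries.
  context: {
    sourceAgent: 'agent-a', // the agent that delegated this work
    requestId: 'req_123',
    traceId: 'trace_abc',   // shared by every agent handling this request
  }
});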
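To show what linking receipt chains could look like in practice, here is a small self-contained sketch. It does not use the guard library above; the `TraceContext` and `Receipt` shapes and the `recordAction` and `tracePath` helpers are hypothetical, and receipts are kept in an in-memory array purely for illustration.

```typescript
// Hypothetical context handed from one agent to the next.
interface TraceContext {
  traceId: string;                // shared by every agent in one request
  sourceAgent: string;            // agent that produced the previous hop
  parentReceiptId: string | null; // receipt of the previous hop, if any
}

interface Receipt {
  receiptId: string;
  traceId: string;
  agent: string;
  action: string;
  parentReceiptId: string | null;
}

// In-memory store for illustration; a real system would persist receipts.
const receipts: Receipt[] = [];
let nextReceiptNumber = 1;

// Records one action and returns the context to pass to the next agent.
function recordAction(ctx: TraceContext, agent: string, action: string): TraceContext {
  const receipt: Receipt = {
    receiptId: `rcpt_${nextReceiptNumber++}`,
    traceId: ctx.traceId,
    agent,
    action,
    parentReceiptId: ctx.parentReceiptId,
  };
  receipts.push(receipt);
  return { traceId: ctx.traceId, sourceAgent: agent, parentReceiptId: receipt.receiptId };
}

// Reconstructs the ordered path of a request across agents.
function tracePath(traceId: string): string[] {
  return receipts
    .filter((r) => r.traceId === traceId)
    .map((r) => `${r.agent}:${r.action}`);
}
```

Because each receipt points at its parent, an investigator can walk backward from the agent where a failure surfaced to the agent where it began.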
Blast radius limits: Design your system so that a failure in one agent has bounded impact. Do not let a single agent's output trigger unrestricted actions in others. Apply rate limits and scope limits at every agent boundary.
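A bounded budget at each agent boundary is one way to express this. The sketch below combines a scope allowlist with a cap on the number of actions; the `BoundaryBudget` class and the scope strings are hypothetical, and the cap is per budget instance rather than time-based.

```typescript
// Hypothetical budget enforced at one agent boundary, for example
// "what agent-a's output may trigger in agent-b during one request".
class BoundaryBudget {
  private readonly maxActions: number;
  private readonly allowedScopes: ReadonlySet<string>;
  private used = 0;

  constructor(maxActions: number, allowedScopes: string[]) {
    this.maxActions = maxActions;
    this.allowedScopes = new Set(allowedScopes);
  }

  // Returns true and spends one unit only if the action is in scope
  // and the budget is not exhausted.
  tryConsume(scope: string): boolean {
    if (!this.allowedScopes.has(scope)) return false; // scope limit
    if (this.used >= this.maxActions) return false;   // rate limit
    this.used += 1;
    return true;
  }

  remaining(): number {
    return this.maxActions - this.used;
  }
}
```

With `new BoundaryBudget(2, ["tickets:read"])`, the upstream agent can trigger at most two ticket reads and nothing outside that scope, however its output is crafted.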
Safety in multi-agent systems costs more than in single-agent systems. Every inter-agent boundary is an attack surface. Every message is a potential injection vector. The number of attack paths grows with the number of agents and their connections: n agents that can all message one another have n(n-1) directed channels, so 3 agents have 6 and 10 agents have 90. Account for this cost when deciding whether to use a multi-agent architecture.