Approval workflow deadlocks occur when an agent cannot proceed because it is waiting for approvals that never arrive. The agent stalls, the user waits, and no one realizes the bottleneck exists until someone investigates. This guide covers the common deadlock patterns and their solutions.
The approval is routed to a team, but no one on the team is available. This happens during off-hours, holidays, or when the designated approver is on leave.
Fix: Configure fallback approvers. Define a primary approver group and a secondary group that receives the request if the primary does not respond within a specified window:
approval:
  timeout: 300
  approvers: ["primary-team"]
  fallback:
    timeout: 600
    approvers: ["backup-team"]
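The escalation logic behind a configuration like this can be sketched in a few lines of Python. This is an illustration rather than any particular product's implementation: `ApprovalStage` and `current_stage` are hypothetical names, and the sketch assumes the fallback timeout starts counting when the primary window expires, which the configuration alone does not confirm.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ApprovalStage:
    approvers: List[str]
    timeout: int  # seconds this stage may hold the request before escalating


def current_stage(stages: List[ApprovalStage], elapsed: float) -> Optional[ApprovalStage]:
    """Return the stage that holds the request after `elapsed` seconds.

    Returns None once every stage has timed out.
    Assumption: each stage's timeout starts when the previous stage expires.
    """
    deadline = 0
    for stage in stages:
        deadline += stage.timeout
        if elapsed < deadline:
            return stage
    return None


# Mirrors the configuration above: primary for 300 s, then backup for 600 s.
STAGES = [
    ApprovalStage(approvers=["primary-team"], timeout=300),
    ApprovalStage(approvers=["backup-team"], timeout=600),
]
```

Under these assumptions a request sits with `primary-team` for the first 300 seconds, moves to `backup-team` for the next 600, and has no owner after 900 seconds, at which point the timeout policy takes over.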
When an approval request times out, the default behavior is to deny the action, but the agent may not handle the denial gracefully. It might retry indefinitely, creating a loop of approval requests and timeouts.
Fix: Configure the agent to treat an approval timeout as a terminal failure for the current task once its retry budget is exhausted. Set a maximum retry count in the policy:
approval:
  timeout: 300
  max_retries: 1
  on_timeout: deny
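On the agent side, the same limit might be enforced with a bounded loop. The sketch below is a minimal example under stated assumptions: `request_approval` is a hypothetical callable that returns "approved", "denied", or "timeout", and `max_retries` is read as the number of extra attempts after the first one.

```python
from typing import Callable


class ApprovalTimedOut(Exception):
    """Raised when no decision arrived within the retry budget."""


def run_with_approval(request_approval: Callable[[], str], max_retries: int = 1) -> str:
    """Ask for approval at most 1 + max_retries times.

    `request_approval` is a hypothetical callable returning
    "approved", "denied", or "timeout".
    """
    attempts = 1 + max_retries
    for _ in range(attempts):
        outcome = request_approval()
        if outcome != "timeout":
            # An explicit decision (approved or denied) is final.
            return outcome
    # Budget exhausted: fail the task instead of looping forever.
    raise ApprovalTimedOut(f"no decision after {attempts} attempts")
```

Raising an exception rather than returning a status makes the timeout hard to ignore: the calling task has to catch it and stop, which is the terminal behavior the policy is meant to guarantee.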
Agent A requires approval from Agent B, but Agent B requires approval from Agent A to proceed with its own task. Neither can complete.
Fix: Avoid configuring agents as approvers for other agents. Approvers should be human users or automated systems with independent authority. If automated approval is needed, use a dedicated approval service that does not have its own approval requirements.
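Circular dependencies like this can be caught before deployment by checking the approver configuration for cycles. The following sketch assumes the configuration can be reduced to a mapping from each actor to the actors whose approval it needs; the function name and the mapping shape are illustrative, not part of any specific tool.

```python
from typing import Dict, List, Optional


def find_approval_cycle(requires: Dict[str, List[str]]) -> Optional[List[str]]:
    """Return one cycle in the "needs approval from" graph, or None.

    `requires` maps an actor to the actors whose approval it needs.
    Uses a recursive depth-first search, which is fine for small configs.
    """
    IN_PROGRESS, DONE = 1, 2
    state: Dict[str, int] = {}
    path: List[str] = []

    def visit(node: str) -> Optional[List[str]]:
        state[node] = IN_PROGRESS
        path.append(node)
        for approver in requires.get(node, []):
            if state.get(approver) == IN_PROGRESS:
                # Reached an actor already on the current path: a cycle.
                return path[path.index(approver):] + [approver]
            if approver not in state:
                cycle = visit(approver)
                if cycle:
                    return cycle
        path.pop()
        state[node] = DONE
        return None

    for node in requires:
        if node not in state:
            cycle = visit(node)
            if cycle:
                return cycle
    return None
```

For the two-agent case described above, `find_approval_cycle({"agent-a": ["agent-b"], "agent-b": ["agent-a"]})` returns `["agent-a", "agent-b", "agent-a"]`, while a configuration in which both agents depend only on a human reviewer returns `None`.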
Approvers never see the request because the notification system is misconfigured. Email goes to spam. Webhook delivery fails. The Slack channel is muted.
Fix: Monitor notification delivery. Track whether approval requests are acknowledged (the approver viewed the request) versus just sent. Alert on requests that are sent but not viewed within a reasonable window.
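A check for sent-but-unviewed requests might look like the sketch below. The `ApprovalNotification` record and its fields are assumptions made for illustration; in practice the timestamps would come from whatever delivery and read receipts the notification channel exposes.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class ApprovalNotification:
    request_id: str
    sent_at: datetime
    viewed_at: Optional[datetime] = None  # None until an approver opens it


def unviewed_requests(
    notifications: List[ApprovalNotification],
    now: datetime,
    window: timedelta,
) -> List[str]:
    """Return IDs of requests sent more than `window` ago and never viewed."""
    return [
        n.request_id
        for n in notifications
        if n.viewed_at is None and now - n.sent_at > window
    ]
```

Running this on a schedule and alerting on a non-empty result surfaces silent delivery failures, which would otherwise look identical to a slow approver.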
The approver receives a request but does not have enough context to approve or deny it. They delay, hoping for more information.
Fix: Enrich approval requests with context. Include the agent's reasoning, the specific tool call parameters, and the user request that triggered the action. Make it easy for the approver to make a quick, informed decision.
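One way to make that context mandatory is to model the request as a structure that can report what it is missing. This is a sketch only: the `ApprovalRequest` class and its field names are hypothetical and simply mirror the three items listed above.

```python
import json
from dataclasses import dataclass, field
from typing import List


@dataclass
class ApprovalRequest:
    user_request: str      # what the user originally asked for
    agent_reasoning: str   # why the agent wants to take this action
    tool_name: str         # the tool the agent intends to call
    tool_parameters: dict = field(default_factory=dict)

    def missing_context(self) -> List[str]:
        """Names of required text fields that are empty or whitespace."""
        required = ("user_request", "agent_reasoning", "tool_name")
        return [name for name in required if not getattr(self, name).strip()]

    def render(self) -> str:
        """Plain-text summary suitable for an email or chat message."""
        params = json.dumps(self.tool_parameters, indent=2, sort_keys=True)
        return (
            f"User request: {self.user_request}\n"
            f"Agent reasoning: {self.agent_reasoning}\n"
            f"Tool call: {self.tool_name}\n"
            f"Parameters:\n{params}"
        )
```

Refusing to send any request for which `missing_context()` is non-empty keeps under-specified requests from reaching approvers in the first place.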
Monitor your approval pipeline metrics: median time to approval, timeout rate, and retry rate. Rising timeout rates indicate a systemic issue that needs structural intervention.
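These three metrics can be computed from a simple log of completed requests. The record shape below is an assumption, as is the definition of retry rate (the share of requests that were retried at least once); adjust both to match what your pipeline actually records.

```python
from statistics import median
from typing import Dict, List, Optional


def pipeline_metrics(requests: List[dict]) -> Dict[str, Optional[float]]:
    """Summarize an approval log.

    Each record is assumed to be a dict with:
      "outcome": "approved", "denied", or "timeout"
      "seconds_to_decision": float, or None when the request timed out
      "retries": number of retries the agent made for this request
    """
    total = len(requests)
    if total == 0:
        return {"median_time_to_approval": None, "timeout_rate": None, "retry_rate": None}

    approval_times = [
        r["seconds_to_decision"] for r in requests if r["outcome"] == "approved"
    ]
    return {
        "median_time_to_approval": median(approval_times) if approval_times else None,
        "timeout_rate": sum(r["outcome"] == "timeout" for r in requests) / total,
        # Assumption: retry rate = share of requests retried at least once.
        "retry_rate": sum(r["retries"] > 0 for r in requests) / total,
    }
```

Comparing these values across time windows, rather than looking at a single snapshot, is what reveals a rising timeout rate.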