
What is session risk scoring for AI agents?

Authensor

Session risk scoring is a technique that assigns a running risk score to an AI agent session based on the actions taken during that session. The score increases when the agent takes risky actions and decreases over time. Policies can reference the risk score to become more or less restrictive based on how the session is going.

How it works

Each action contributes to the session risk score:

  • A blocked action adds risk (the agent tried something it should not)
  • An escalated action adds moderate risk
  • An allowed action may add a small amount or none
  • Certain tool types carry inherent risk weights

The score decays over time, so a single risky action does not permanently elevate the session risk.

Session starts: risk = 0
Agent reads files: risk = 0
Agent writes file: risk = 0.1
Agent tries blocked command: risk = 0.4
Time passes: risk decays to 0.3
Agent tries another blocked command: risk = 0.6
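The running score in the trace above can be sketched as a small tracker. This is an illustrative sketch, not the real implementation: the class name, `record` method, and linear-decay model are assumptions, and the weights match the table under Risk factors below.

```typescript
// Minimal session risk tracker (illustrative; not the real API).
type Outcome = "allow" | "escalate" | "block";

const RISK_WEIGHTS: Record<Outcome, number> = {
  allow: 0.0,
  escalate: 0.1,
  block: 0.3,
};

class SessionRiskTracker {
  private score = 0;
  private lastUpdate = 0; // seconds since session start

  constructor(
    private decayRate = 0.01, // score lost per second of elapsed time
    private maxScore = 1.0,
  ) {}

  // Apply linear decay for the time elapsed since the last event,
  // then add the event's risk weight, clamped to [0, maxScore].
  record(outcome: Outcome, atSeconds: number): number {
    const elapsed = atSeconds - this.lastUpdate;
    this.score = Math.max(0, this.score - this.decayRate * elapsed);
    this.score = Math.min(this.maxScore, this.score + RISK_WEIGHTS[outcome]);
    this.lastUpdate = atSeconds;
    return this.score;
  }
}
```

With this model, a blocked action takes the score to 0.3, ten seconds of quiet decays it by 0.1, and a second blocked action lands it at 0.5.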

Adaptive policies

Policies can reference the risk score to change behavior dynamically:

rules:
  - tool: "file.write"
    action: allow
    when:
      context.riskScore:
        lt: 0.5

  - tool: "file.write"
    action: escalate
    when:
      context.riskScore:
        gte: 0.5
    reason: "Elevated session risk requires approval for writes"

When the session starts, file writes are allowed. Once the accumulated risk score reaches 0.5 (from blocked actions or other suspicious behavior), file writes require approval.
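A minimal sketch of how such a policy could be evaluated, assuming first-match-wins semantics; the `Rule` shape and `decide` function are illustrative, not the real policy engine:

```typescript
// Illustrative matcher for the two file.write rules above.
type Action = "allow" | "escalate" | "block";

interface Rule {
  tool: string;
  action: Action;
  when?: { riskScore?: { lt?: number; gte?: number } };
}

const rules: Rule[] = [
  { tool: "file.write", action: "allow", when: { riskScore: { lt: 0.5 } } },
  { tool: "file.write", action: "escalate", when: { riskScore: { gte: 0.5 } } },
];

// First matching rule wins; fall back to escalate if nothing matches.
function decide(tool: string, riskScore: number): Action {
  for (const rule of rules) {
    if (rule.tool !== tool) continue;
    const cond = rule.when?.riskScore;
    if (cond?.lt !== undefined && !(riskScore < cond.lt)) continue;
    if (cond?.gte !== undefined && !(riskScore >= cond.gte)) continue;
    return rule.action;
  }
  return "escalate";
}
```

The same tool appears twice with complementary conditions, so exactly one rule fires for any score.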

Risk factors

Different events contribute different amounts of risk:

| Event | Risk Weight |
|-------|-------------|
| Allowed action | 0.0 |
| Escalated action | 0.1 |
| Blocked action | 0.3 |
| Aegis threat detected | 0.5 |
| Sentinel anomaly alert | 0.4 |
| Multiple denials in window | 0.3 |
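The "multiple denials in window" factor implies tracking recent denial timestamps. A sketch of one way to do that, assuming a sliding window; the 60-second window and 3-denial threshold are illustrative values, not documented defaults:

```typescript
// Sliding-window check: flag when several denials land close together,
// so the policy can add the extra "multiple denials" risk weight.
class DenialWindow {
  private denials: number[] = []; // denial timestamps, in seconds

  constructor(private windowSeconds = 60, private threshold = 3) {}

  // Record a denial; returns true when the window now holds
  // at least `threshold` denials (i.e. the extra weight should apply).
  recordDenial(atSeconds: number): boolean {
    this.denials.push(atSeconds);
    this.denials = this.denials.filter(t => atSeconds - t <= this.windowSeconds);
    return this.denials.length >= this.threshold;
  }
}
```

Old denials age out of the window, so a burst of blocks is treated differently from the same number of blocks spread across a long session.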

Configuration

const guard = createGuard({
  policy,
  riskScoring: {
    enabled: true,
    decayRate: 0.01,      // Risk decays by 0.01 per second
    maxScore: 1.0,
    blockThreshold: 0.9,  // Auto-block all actions above this score
  }
});
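One consequence of these settings: with linear decay of 0.01 per second, a session sitting at the 0.9 block threshold needs (0.9 − 0.5) / 0.01 = 40 seconds of clean behavior to drop back below the 0.5 escalation threshold used earlier. A small helper for tuning this trade-off (a sketch; the function is illustrative, not part of the library):

```typescript
// Seconds of inactivity needed for a linearly decaying score
// to fall from `current` to `target` at `decayRate` per second.
function secondsToDecay(current: number, target: number, decayRate: number): number {
  if (current <= target) return 0;
  return (current - target) / decayRate;
}
```

A higher `decayRate` forgives risky sessions faster; a lower one keeps restrictions in place longer after suspicious activity.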

Why adaptive risk matters

Static policies treat every session the same. But a session where the agent has attempted five blocked actions is inherently riskier than a session where every action has been allowed. Risk scoring lets your safety system respond proportionally to the actual threat level in the current session.

This creates a natural escalation path: as an agent behaves more suspiciously, the system becomes more restrictive. This is useful for catching slow-moving attacks where each individual action is not enough to trigger a static rule.
