Drift detection is the process of identifying when an AI agent's behavior gradually changes over time, moving away from its established baseline. Unlike anomaly detection, which flags sudden spikes, drift detection catches slow, persistent changes that accumulate into significant deviations.
An agent's behavior can drift for several reasons:
An anomaly is a sudden, sharp deviation. For example, the agent's denial rate jumps from 5% to 50% in one minute. This is easy to detect because a single measurement stands far apart from the one before it.
Drift is a slow change. The agent's denial rate goes from 5% to 6% to 7% over a week. Each individual measurement looks normal. Only when you compare the current state to the original baseline does the drift become visible.
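The contrast can be made concrete with a small sketch. It extends the 5%, 6%, 7% example to a full week of made-up daily readings and applies two checks: one compares each reading to the previous one (anomaly style), the other compares each reading to the first one (drift style). The readings and both limits are illustrative assumptions, not values from Sentinel.

```typescript
// Illustrative daily denial rates in percent (hypothetical data).
const dailyDenialRatePct: number[] = [5, 6, 7, 8, 9, 10, 11];

// Hypothetical limits chosen only for this illustration.
const maxStepPct = 5;        // a jump of 5+ points between readings is an anomaly
const maxBaselineGapPct = 3; // 3+ points away from the baseline is drift

const baselinePct = dailyDenialRatePct[0] ?? 0;
const anomalyDays: number[] = [];
const driftDays: number[] = [];

let previousPct = baselinePct;
dailyDenialRatePct.forEach((ratePct, day) => {
  // Anomaly-style check: compare with the previous reading.
  if (Math.abs(ratePct - previousPct) >= maxStepPct) anomalyDays.push(day);
  // Drift-style check: compare with the original baseline.
  if (Math.abs(ratePct - baselinePct) >= maxBaselineGapPct) driftDays.push(day);
  previousPct = ratePct;
});

console.log("anomaly days:", anomalyDays);
console.log("drift days:", driftDays);
```

No day trips the step check, because every step is a single point. The baseline check starts flagging from the fourth reading (index 3) onward.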
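EWMA (Exponentially Weighted Moving Average): Maintains a smoothed average in which recent values carry more weight and older values fade out exponentially. The smoothing factor (alpha, between 0 and 1) controls sensitivity to drift. A low alpha smooths out noise so that small, persistent shifts stand out; a high alpha tracks the latest readings closely and reacts faster, but is noisier.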
CUSUM (Cumulative Sum): Accumulates deviations from the expected value. Small deviations that individually look insignificant add up in the cumulative sum. When the sum exceeds a threshold, drift is detected.
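Baseline comparison: Periodically compare current metric distributions to the original baseline using statistical tests, such as a two-sample Kolmogorov-Smirnov test. This catches drift that both EWMA and CUSUM might miss, for example a change in the spread of a metric while its average stays the same.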
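The first two techniques fit in a few lines each. The following is a minimal sketch under stated assumptions, not Sentinel's implementation: the function names `ewma` and `cusumFirstAlarm`, the `slack` allowance, and the sample data are all hypothetical. The EWMA uses the standard recurrence z_t = alpha * x_t + (1 - alpha) * z_{t-1}, and the CUSUM is the one-sided (upper) form S_t = max(0, S_{t-1} + x_t - target - slack).

```typescript
// EWMA: each output is alpha * current value + (1 - alpha) * previous output.
// `seed` is the starting average, typically the baseline value.
function ewma(values: number[], alpha: number, seed: number): number[] {
  const smoothed: number[] = [];
  let z = seed;
  for (const x of values) {
    z = alpha * x + (1 - alpha) * z;
    smoothed.push(z);
  }
  return smoothed;
}

// One-sided upper CUSUM: accumulate how far each value sits above
// target + slack, never letting the sum fall below zero.
// Returns the index of the first value at which the sum exceeds
// `threshold`, or -1 if it never does.
function cusumFirstAlarm(
  values: number[],
  target: number,
  slack: number,
  threshold: number,
): number {
  let sum = 0;
  let index = 0;
  for (const x of values) {
    sum = Math.max(0, sum + (x - target - slack));
    if (sum > threshold) return index;
    index++;
  }
  return -1;
}

// Hypothetical denial rates in percent.
const driftingPct = [5, 6, 7, 8, 9, 10, 11]; // slow upward drift
const stablePct = [5, 5, 6, 5, 4, 5, 5];     // noise around the baseline

const smoothedPct = ewma(driftingPct, 0.5, 5);
const driftAlarm = cusumFirstAlarm(driftingPct, 5, 0.5, 4);
const stableAlarm = cusumFirstAlarm(stablePct, 5, 0.5, 4);

console.log({ smoothedPct, driftAlarm, stableAlarm });
```

With these inputs the CUSUM raises its alarm at index 3 on the drifting series and returns -1 on the stable one. The slack term is a design choice: it keeps ordinary noise from accumulating, at the cost of reacting slightly later to real drift.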
Sentinel tracks these metrics per session and across sessions. When drift is detected, it raises an alert with the metric, the baseline value, the current value, and the rate of change.
sentinel: {
  enabled: true,
  drift: {
    enabled: true,
    baselineWindow: 7 * 24 * 60 * 60 * 1000, // 7-day baseline, in milliseconds
    threshold: 0.15, // 15% deviation triggers alert
  }
}
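To show how the two settings could interact, here is a hedged sketch of one possible reading of them. It assumes that `baselineWindow` is a duration in milliseconds measured from the first sample, and that `threshold` is a relative deviation of the current mean from the baseline mean. The source does not specify how Sentinel computes the deviation, so the `checkDrift` function, the `Sample` type, and the sample data are hypothetical.

```typescript
interface Sample {
  timestamp: number; // milliseconds
  value: number;     // metric reading, e.g. denial rate in percent
}

interface DriftConfig {
  baselineWindow: number; // milliseconds
  threshold: number;      // relative deviation, e.g. 0.15 for 15%
}

interface DriftResult {
  baseline: number;
  current: number;
  deviation: number;
  drifted: boolean;
}

function mean(xs: number[]): number {
  return xs.reduce((total, x) => total + x, 0) / xs.length;
}

// Assumed interpretation: samples inside the first `baselineWindow`
// form the baseline; everything after that is the current state.
function checkDrift(samples: Sample[], config: DriftConfig): DriftResult | null {
  if (samples.length === 0) return null;
  const start = Math.min(...samples.map((s) => s.timestamp));
  const cutoff = start + config.baselineWindow;
  const baselineValues = samples.filter((s) => s.timestamp < cutoff).map((s) => s.value);
  const currentValues = samples.filter((s) => s.timestamp >= cutoff).map((s) => s.value);
  if (baselineValues.length === 0 || currentValues.length === 0) return null;

  const baseline = mean(baselineValues);
  if (baseline === 0) return null; // relative deviation is undefined
  const current = mean(currentValues);
  const deviation = Math.abs(current - baseline) / baseline;
  return { baseline, current, deviation, drifted: deviation > config.threshold };
}

// Hypothetical data: seven baseline days at 5%, then two days at 6%.
const DAY_MS = 24 * 60 * 60 * 1000;
const driftConfig: DriftConfig = { baselineWindow: 7 * DAY_MS, threshold: 0.15 };
const exampleSamples: Sample[] = [0, 1, 2, 3, 4, 5, 6]
  .map((day) => ({ timestamp: day * DAY_MS, value: 5 }))
  .concat([7, 8].map((day) => ({ timestamp: day * DAY_MS, value: 6 })));

const driftResult = checkDrift(exampleSamples, driftConfig);
console.log(driftResult);
```

In this example the current mean is 20% above the baseline mean, which exceeds the 15% threshold, so `drifted` is true.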
When drift is detected, investigate the cause before taking action. Drift is not always bad. A model update might cause the agent to use different tools, which is expected. But unexpected drift after no known changes should be investigated as a potential security issue.