Standard infrastructure metrics (CPU, memory, network) tell you whether the system is running. They do not tell you whether agents are behaving correctly. Custom behavioral metrics capture the aspects of agent behavior that matter for safety and operational awareness.
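Track three kinds of behavior: what agents are doing (for example, action rate and the rate of novel actions), safety enforcement (for example, the rate of denied actions), and output quality.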
Emit custom metrics as Prometheus-compatible counters, gauges, and histograms.
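// Assumption: Counter comes from the prom-client package,
// whose constructor matches the options used here.
import { Counter } from 'prom-client';

const actionCounter = new Counter({
  name: 'agent_actions_total',
  help: 'Total agent actions by type and agent',
  labelNames: ['action_type', 'agent_id', 'decision'],
});

// On each action: envelope and result are defined elsewhere
// (the action request and the decision made for it).
actionCounter.inc({
  action_type: envelope.action,
  agent_id: envelope.principal,
  decision: result.effect,
});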
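The counter above covers only one of the three metric types. To show what a Prometheus-style histogram records, here is a minimal in-memory sketch with cumulative buckets. It is not a client library API: `SketchHistogram`, its methods, and the bucket bounds are hypothetical names chosen for illustration, and in practice a metrics client would do this bookkeeping for you.

```typescript
interface BucketCount {
  le: string; // upper bound of the bucket ("less than or equal")
  count: number; // cumulative number of observations <= le
}

// Hypothetical helper, for illustration only.
class SketchHistogram {
  private readonly bounds: number[];
  private readonly counts: number[];
  private sum = 0;
  private total = 0;

  constructor(upperBounds: number[]) {
    // Keep bounds sorted ascending so the snapshot reads naturally.
    this.bounds = upperBounds.slice().sort((a, b) => a - b);
    this.counts = this.bounds.map(() => 0);
  }

  observe(value: number): void {
    // Buckets are cumulative: a value increments every bucket whose
    // upper bound is at least as large as the value.
    for (let i = 0; i < this.bounds.length; i++) {
      if (value <= this.bounds[i]) this.counts[i] += 1;
    }
    this.sum += value;
    this.total += 1;
  }

  snapshot(): { buckets: BucketCount[]; sum: number; count: number } {
    const buckets = this.bounds.map((b, i) => ({
      le: String(b),
      count: this.counts[i],
    }));
    // The implicit +Inf bucket always equals the total observation count.
    buckets.push({ le: "+Inf", count: this.total });
    return { buckets, sum: this.sum, count: this.total };
  }
}

// Example: a hypothetical action-duration metric, in seconds.
const durations = new SketchHistogram([0.1, 0.5, 1]);
durations.observe(0.3);
```

Because buckets are cumulative, a single observation of 0.3 seconds increments the 0.5 and 1 buckets as well as the implicit +Inf bucket, but not the 0.1 bucket.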
Custom metrics with high-cardinality labels (unique user IDs, arbitrary resource paths) can overwhelm metrics storage. Keep label cardinality bounded. Use categories instead of specific values. Group resources by type rather than tracking individual paths.
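One way to keep cardinality bounded is to map each raw resource string to a small, fixed set of categories before using it as a label value. The sketch below assumes resources arrive as paths or URLs. The category names and matching rules are hypothetical and would need to be adapted to your own system.

```typescript
// Hypothetical, fixed set of label values. The label can never take
// more distinct values than this union allows.
type ResourceCategory = "file" | "database" | "http" | "other";

function categorizeResource(resource: string): ResourceCategory {
  // Absolute paths and file:// URLs count as file access.
  if (/^(\/|file:\/\/)/.test(resource)) return "file";
  // Common database connection schemes (illustrative, not exhaustive).
  if (/^(postgres|postgresql|mysql|sqlite):\/\//.test(resource)) return "database";
  if (/^https?:\/\//.test(resource)) return "http";
  // Anything unrecognized collapses into a single bucket.
  return "other";
}

// Use the category, not the raw path, as the label value.
const labels = {
  resource_type: categorizeResource("/home/alice/reports/q3.csv"),
};
```

The raw path can still go to logs or traces, where high cardinality is cheaper to store. The metric only needs enough detail to show trends.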
Define alert rules based on custom metrics. A sudden drop in action rate might indicate an agent crash. A spike in deny rate might indicate a policy misconfiguration or an attack. A jump in novel action rate might indicate prompt injection.
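The three conditions above can be sketched as a small check over per-window counts. In a Prometheus deployment, rules like these would more typically be written as alerting rules over `rate()` expressions. This in-process version only illustrates the logic, and the type names, alert names, and thresholds are all hypothetical.

```typescript
// Counts observed in one evaluation window (for example, five minutes).
interface WindowStats {
  total: number; // all actions
  denied: number; // actions denied
  novel: number; // actions of a type not seen before
}

// Hypothetical thresholds; tune them against your own baseline.
interface Thresholds {
  minActionRatio: number; // alert if total < baseline * this ratio
  maxDenyRate: number; // alert if denied / total exceeds this
  maxNovelRate: number; // alert if novel / total exceeds this
}

function evaluateAlerts(
  current: WindowStats,
  baselineTotal: number,
  t: Thresholds,
): string[] {
  const alerts: string[] = [];

  // Sudden drop in action rate: possible agent crash.
  if (baselineTotal > 0 && current.total < baselineTotal * t.minActionRatio) {
    alerts.push("action_rate_drop");
  }

  // Rates are undefined with zero actions, so treat them as 0.
  const denyRate = current.total > 0 ? current.denied / current.total : 0;
  const novelRate = current.total > 0 ? current.novel / current.total : 0;

  // Spike in deny rate: possible policy misconfiguration or attack.
  if (denyRate > t.maxDenyRate) alerts.push("deny_rate_spike");

  // Jump in novel action rate: possible prompt injection.
  if (novelRate > t.maxNovelRate) alerts.push("novel_action_spike");

  return alerts;
}
```

A window can trigger more than one alert at once. For instance, a crashed agent that retries a denied action in a loop could show both a drop in total actions and a spike in deny rate.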
Group related metrics into dashboard panels. Create separate dashboards for operational overview (is the system healthy?), safety overview (is the system safe?), and per-agent detail (what is this specific agent doing?).
Custom metrics turn agent behavior into something you can measure, alert on, and improve over time.