
Custom Metrics for AI Agent Behavior

Authensor

Standard infrastructure metrics (CPU, memory, network) tell you whether the system is running. They do not tell you whether agents are behaving correctly. Custom behavioral metrics capture the aspects of agent behavior that matter for safety and operational awareness.

Categories of Behavioral Metrics

Action Metrics

Track what agents are doing:

  • Action rate: Actions per second by type and agent
  • Action distribution: Proportion of each action type over time
  • Novel action rate: Frequency of action types not seen during baseline
  • Resource concentration: How concentrated resource access is (are agents accessing many resources or the same few repeatedly?)
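The action metrics above can be derived from a window of recorded action events. The following is a minimal sketch in plain JavaScript, not tied to any metrics library. The event shape (`actionType`, `resource`), the function names, and the choice of the Herfindahl-Hirschman index (sum of squared access shares) as the concentration measure are assumptions made for illustration.

```javascript
// Hypothetical event shape: { actionType: string, resource: string }

// Proportion of each action type within the window.
function actionDistribution(events) {
  const counts = new Map();
  for (const e of events) {
    counts.set(e.actionType, (counts.get(e.actionType) ?? 0) + 1);
  }
  const dist = {};
  for (const [type, n] of counts) dist[type] = n / events.length;
  return dist;
}

// Fraction of actions whose type was not seen during the baseline period.
function novelActionRate(events, baselineTypes) {
  if (events.length === 0) return 0;
  const novel = events.filter((e) => !baselineTypes.has(e.actionType)).length;
  return novel / events.length;
}

// Sum of squared access shares: 1 means every access hit the same resource,
// values near 0 mean accesses are spread across many resources.
function resourceConcentration(events) {
  if (events.length === 0) return 0;
  const counts = new Map();
  for (const e of events) {
    counts.set(e.resource, (counts.get(e.resource) ?? 0) + 1);
  }
  let index = 0;
  for (const n of counts.values()) index += (n / events.length) ** 2;
  return index;
}
```

In practice the resulting values would likely be exported as gauges, recomputed at the end of each window.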

Safety Metrics

Track safety enforcement:

  • Deny rate: Proportion of actions denied by policy
  • Scan flag rate: Proportion of inputs flagged by Aegis
  • Approval latency: Time from approval request to approval decision
  • Override frequency: How often emergency overrides are used
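As a rough illustration of two of these safety metrics, the sketch below computes a deny rate from a list of decisions and groups approval latencies into cumulative buckets, the way a Prometheus histogram counts observations. The decision values `'allow'` and `'deny'`, the function names, and the bucket bounds in the usage line are assumptions, not part of any documented API.

```javascript
// Decision values 'allow' and 'deny' are assumed here;
// substitute whatever your policy engine actually emits.
function denyRate(decisions) {
  if (decisions.length === 0) return 0;
  const denied = decisions.filter((d) => d === 'deny').length;
  return denied / decisions.length;
}

// Cumulative bucket counts in the style of a Prometheus histogram:
// each bucket counts observations less than or equal to its upper bound,
// and the final '+Inf' bucket counts every observation.
function cumulativeBuckets(latenciesSeconds, upperBounds) {
  const buckets = upperBounds.map((le) => ({
    le: String(le),
    count: latenciesSeconds.filter((v) => v <= le).length,
  }));
  buckets.push({ le: '+Inf', count: latenciesSeconds.length });
  return buckets;
}

// Example: four approval latencies against bounds of 1s, 10s, and 60s.
const approvalBuckets = cumulativeBuckets([0.5, 3, 45, 400], [1, 10, 60]);
```

Bucket bounds are worth choosing around the latency you consider acceptable, so that a slow approval queue shows up as mass moving into the upper buckets.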

Quality Metrics

Track output quality:

  • Output length distribution: Average and variance of output token counts
  • Error rate: Proportion of actions that result in errors
  • Retry rate: How often agents retry failed actions
  • Hallucination indicators: Proxy signals that correlate with output quality, such as model confidence scores and retrieval relevance scores
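The average and variance of output token counts can be tracked without storing every observation. One way to do this, sketched here under the assumption that token counts arrive one at a time, is Welford's online algorithm. The class name `RunningStats` and its methods are hypothetical.

```javascript
// Welford's online algorithm: maintains count, mean, and the sum of squared
// deviations (m2) incrementally, one observation at a time.
class RunningStats {
  constructor() {
    this.n = 0;
    this.mean = 0;
    this.m2 = 0;
  }

  push(x) {
    this.n += 1;
    const delta = x - this.mean;
    this.mean += delta / this.n;
    this.m2 += delta * (x - this.mean);
  }

  // Sample variance (divides by n - 1); defined as 0 for fewer than two values.
  variance() {
    return this.n > 1 ? this.m2 / (this.n - 1) : 0;
  }
}

// Example: feed in output token counts as each response completes.
const outputLengths = new RunningStats();
for (const tokens of [2, 4, 4, 4, 5, 5, 7, 9]) outputLengths.push(tokens);
```

A sudden shift in the mean or a jump in variance relative to the baseline is the signal of interest, rather than the absolute values.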

Implementation

Emit custom metrics as Prometheus-compatible counters, gauges, and histograms.

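// Assumes the prom-client package, which exports a Counter class.
import { Counter } from 'prom-client';

// Monotonic counter, partitioned by action type, agent, and policy decision.
const actionCounter = new Counter({
  name: 'agent_actions_total',
  help: 'Total agent actions by type, agent, and decision',
  labelNames: ['action_type', 'agent_id', 'decision'],
});

// On each action: `envelope` describes the requested action and `result`
// is the policy evaluation outcome, both supplied by the surrounding code.
actionCounter.inc({
  action_type: envelope.action,
  agent_id: envelope.principal,
  decision: result.effect,
});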

Cardinality Management

Every distinct combination of label values creates a separate time series, so high-cardinality labels (unique user IDs, arbitrary resource paths) can overwhelm metrics storage. Keep label cardinality bounded: use categories instead of specific values, and group resources by type rather than tracking individual paths.
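Grouping resources by type can be as simple as mapping each path to one of a fixed set of categories before it is used as a label value. The sketch below assumes a hypothetical set of path patterns and category names; the real rules depend on your own resource layout.

```javascript
// Hypothetical rules: each pattern maps a family of paths to one label value.
const RESOURCE_RULES = [
  { pattern: /^\/users\/[^/]+\/files\//, type: 'user_file' },
  { pattern: /^\/db\//, type: 'database' },
  { pattern: /^https?:\/\//, type: 'external_url' },
];

// Returns a bounded category for any path. Unmatched paths fall into 'other',
// so the label can never take more than RESOURCE_RULES.length + 1 values.
function resourceType(path) {
  for (const rule of RESOURCE_RULES) {
    if (rule.pattern.test(path)) return rule.type;
  }
  return 'other';
}

// Example: many distinct paths collapse into a single label value.
const labelValue = resourceType('/users/8841/files/report.pdf');
```

The catch-all `'other'` category is the important part of the design: it guarantees the label stays bounded even when agents touch resources nobody anticipated.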

Alerting on Custom Metrics

Define alert rules based on custom metrics. A sudden drop in action rate might indicate an agent crash. A spike in deny rate might indicate a policy misconfiguration or an attack. A jump in novel action rate might indicate prompt injection.
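The three conditions described above can be expressed as comparisons between a current window and a baseline. In a Prometheus deployment these would normally live in alerting rules; the JavaScript below is only a sketch of the logic, and the field names, alert names, and every threshold value are assumptions to be tuned against real traffic.

```javascript
// Illustrative thresholds only; none of these values come from a standard.
const DEFAULT_THRESHOLDS = {
  actionRateDropFactor: 0.2, // alert if rate falls below 20% of baseline
  denyRateSpikeFactor: 3,    // alert if deny rate exceeds 3x baseline...
  denyRateFloor: 0.1,        // ...and is also above 10% in absolute terms
  novelActionRateJump: 0.05, // alert if novel action rate rises by 5 points
};

// `current` and `baseline` are hypothetical snapshots of the form
// { actionRate, denyRate, novelActionRate }.
function evaluateAlerts(current, baseline, t = DEFAULT_THRESHOLDS) {
  const alerts = [];
  if (current.actionRate < baseline.actionRate * t.actionRateDropFactor) {
    alerts.push('ActionRateDrop'); // possible agent crash
  }
  if (current.denyRate > Math.max(baseline.denyRate * t.denyRateSpikeFactor, t.denyRateFloor)) {
    alerts.push('DenyRateSpike'); // possible policy misconfiguration or attack
  }
  if (current.novelActionRate - baseline.novelActionRate > t.novelActionRateJump) {
    alerts.push('NovelActionRateJump'); // possible prompt injection
  }
  return alerts;
}
```

The absolute floor on deny rate is there to avoid paging on a tripling of a baseline that is already close to zero.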

Dashboard Design

Group related metrics into dashboard panels. Create separate dashboards for operational overview (is the system healthy?), safety overview (is the system safe?), and per-agent detail (what is this specific agent doing?).

Custom metrics turn agent behavior into something you can measure, alert on, and improve over time.
