
Microservice architecture for AI agent safety

Authensor

For large deployments, running the safety stack as a set of microservices provides independent scaling, clear boundaries, and language-agnostic integration. This guide covers the architecture.

Service decomposition

The safety stack can be decomposed into independent services:

[Policy Service]     - Stores and serves policies
[Evaluation Service] - Evaluates tool calls against policies
[Scanning Service]   - Content scanning (Aegis)
[Receipt Service]    - Stores and verifies receipts
[Approval Service]   - Manages approval workflows
[Monitoring Service] - Behavioral monitoring (Sentinel)
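One way to make these boundaries concrete before extracting any service is to define each component as an interface. A minimal sketch in Python, assuming illustrative method names (`get_policy`, `evaluate`, `store`) rather than the Authensor SDK's actual API:

```python
# Sketch: each safety component as an explicit interface, so it can be
# backed by an in-process implementation or a remote service later.
from typing import Protocol

class PolicyService(Protocol):
    def get_policy(self, agent_id: str) -> dict: ...

class EvaluationService(Protocol):
    def evaluate(self, tool: str, args: dict, policy: dict) -> dict: ...

class ReceiptService(Protocol):
    def store(self, receipt: dict) -> str: ...

# A trivial in-process implementation of one interface, as a stand-in
# for a real receipt store backed by a database.
class InMemoryReceipts:
    def __init__(self):
        self._store: dict[str, dict] = {}

    def store(self, receipt: dict) -> str:
        rid = f"r{len(self._store) + 1}"
        self._store[rid] = receipt
        return rid

svc: ReceiptService = InMemoryReceipts()
print(svc.store({"tool": "search"}))  # r1
```

Coding against interfaces like these keeps the decision of in-process versus remote an implementation detail, which is what makes the hybrid approach below possible.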

When microservices make sense

Microservices add operational complexity. They make sense when:

  • Multiple teams own different safety components
  • Different components need independent scaling (scanning may need more resources than policy evaluation)
  • You need language-agnostic integration (Python agents and TypeScript agents sharing the same safety services)
  • You want to update components independently

When they do not make sense

For most deployments, the Authensor control plane runs as a single service. This is simpler and sufficient for:

  • Teams with fewer than 50 agent instances
  • Single-team ownership of the safety stack
  • Deployments where all agents use the same language

The hybrid approach

Use the SDK for in-process safety (fast, no network calls) and the control plane as a single service for centralized management:

[Agent] → [Embedded SDK] → [Control Plane API]
            (evaluation)     (policy storage, receipts, approvals)

The SDK handles the latency-sensitive work (policy evaluation, content scanning) in-process. The control plane handles the stateful work (policy storage, receipt persistence, approval workflows) as a service.
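A minimal sketch of that split, assuming a hypothetical policy shape (`allowed_tools`) and decision type; the real SDK's types and policy schema will differ:

```python
# Hybrid pattern sketch: evaluate in-process (no network call),
# hand the resulting receipt off for asynchronous persistence.
from dataclasses import dataclass, field

@dataclass
class Decision:
    action: str               # "allow" | "deny"
    reason: str
    receipt: dict = field(default_factory=dict)

def evaluate_in_process(tool: str, args: dict, policy: dict) -> Decision:
    """Latency-sensitive check done locally by the embedded SDK."""
    allowed = policy.get("allowed_tools", [])
    if tool not in allowed:
        return Decision("deny", f"tool '{tool}' not in policy")
    return Decision("allow", "tool permitted by policy",
                    receipt={"tool": tool, "args": args})

# Stateful work goes to the control plane out of band, e.g.
# enqueue decision.receipt for a later POST /api/receipts.

policy = {"allowed_tools": ["search", "read_file"]}
print(evaluate_in_process("read_file", {"path": "/tmp/x"}, policy).action)  # allow
print(evaluate_in_process("delete_db", {}, policy).action)                  # deny
```

The agent never blocks on the control plane for an allow/deny decision; only the durable record of that decision crosses the network.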

API boundaries

If you decompose into services, define clear API boundaries:

POST /api/evaluate
  Input: { tool, args, context }
  Output: { action, reason, receipt }

POST /api/scan
  Input: { content, detectors }
  Output: { threats: [...] }

POST /api/receipts
  Input: { receipt }
  Output: { id, hash }

POST /api/approvals
  Input: { requestId, tool, args, reviewers }
  Output: { approvalId, status }

Communication patterns

Synchronous (HTTP): For policy evaluation and content scanning, where the agent needs the result before proceeding.

Asynchronous (message queue): For receipt storage and monitoring, where the agent does not need to wait for the result.

Event-driven: For alerts and notifications, where the monitoring service publishes events that other services consume.
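The asynchronous pattern can be sketched with an in-memory queue and a background worker; in production the queue would be a durable broker and the worker would POST to the receipt service, but the control flow is the same:

```python
# Sketch: agent enqueues receipts and continues; a background worker
# drains the queue (stand-in for publishing to a message broker).
import queue
import threading

receipts: queue.Queue = queue.Queue()
stored: list[dict] = []

def receipt_worker() -> None:
    while True:
        r = receipts.get()
        if r is None:            # sentinel: shut the worker down
            break
        stored.append(r)         # stand-in for POST /api/receipts
        receipts.task_done()

worker = threading.Thread(target=receipt_worker, daemon=True)
worker.start()

# The agent does not wait for storage to complete.
receipts.put({"tool": "read_file", "action": "allow"})

receipts.put(None)               # drain and stop for this example
worker.join()
print(len(stored))  # 1
```

The agent's critical path ends at `put()`; receipt durability becomes the worker's problem, which is exactly why this pattern suits receipt storage and monitoring but not policy evaluation.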

Scaling considerations

| Component | Scaling strategy |
|-----------|-----------------|
| Policy evaluation | Horizontal (stateless) |
| Content scanning | Horizontal (stateless) |
| Receipt storage | Vertical (database) |
| Approval workflows | Horizontal (stateless) |
| Monitoring | Per-agent (in-process) |

Policy evaluation and content scanning are stateless and scale horizontally. Receipt storage is bound by database write throughput. Monitoring runs in-process per agent, so it scales with the number of agents.
