The sidecar proxy pattern deploys a safety enforcement process alongside each agent instance. All traffic between the agent and external services routes through the sidecar, which applies policy checks, content scanning, and audit logging transparently. This approach adds safety to existing agents without modifying their code.
In a Kubernetes pod, the sidecar container shares the network namespace with the main agent container. Configure the agent to send requests to localhost on the sidecar's port instead of directly to external services. The sidecar intercepts each request, evaluates it against Authensor's policy engine, and either forwards it or blocks it.
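A minimal sketch of such a pod follows. The image names, the sidecar port, and the proxy environment variable are assumptions; how you point your agent at the sidecar depends on the agent's HTTP client.

```yaml
# Hypothetical pod spec: agent and safety sidecar share localhost.
apiVersion: v1
kind: Pod
metadata:
  name: agent-with-sidecar
spec:
  containers:
    - name: agent
      image: my-agent:latest            # your existing agent image, unmodified
      env:
        # Route outbound requests through the sidecar on the shared
        # network namespace instead of directly to external services.
        - name: HTTP_PROXY
          value: "http://127.0.0.1:9090"
    - name: safety-sidecar
      image: authensor/sidecar:latest   # assumed image name
      ports:
        - containerPort: 9090           # reachable from the agent as localhost:9090
```

Because containers in a pod share a network namespace, no service discovery or extra networking is needed for the agent to reach the sidecar.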
For MCP-based agents, the sidecar acts as an MCP gateway, sitting between the agent and MCP servers. Every tool call passes through policy evaluation before reaching the server.
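The gateway's decision step can be sketched as follows. The policy format and `evaluate_tool_call` function are illustrative, not Authensor's actual API; the point is that every tool call is checked against an allow/deny policy before it is forwarded.

```python
# Sketch of per-tool-call policy evaluation in an MCP gateway.
# Policy schema and function names are assumptions for illustration.
from dataclasses import dataclass

@dataclass
class Decision:
    allowed: bool
    reason: str

def evaluate_tool_call(tool: str, args: dict, policy: dict) -> Decision:
    """Check a tool call against an allow/deny policy before forwarding it."""
    if tool in policy.get("deny_tools", []):
        return Decision(False, f"tool '{tool}' is denied by policy")
    if policy.get("allow_tools") and tool not in policy["allow_tools"]:
        return Decision(False, f"tool '{tool}' is not on the allow list")
    return Decision(True, "allowed")

policy = {"allow_tools": ["read_file", "search"], "deny_tools": ["shell_exec"]}
print(evaluate_tool_call("search", {"q": "docs"}, policy).allowed)      # True
print(evaluate_tool_call("shell_exec", {"cmd": "rm"}, policy).allowed)  # False
```

A blocked call never reaches the MCP server; the gateway returns the denial reason to the agent instead.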
- Zero code changes. The agent does not need an SDK integration; safety is enforced at the network layer.
- Consistent enforcement. Every request goes through the same policy checks regardless of which SDK, framework, or language the agent uses.
- Independent updates. You can update safety policies and scanning rules by redeploying the sidecar without touching the agent container.
- Per-agent configuration. Each sidecar can have its own policy file, allowing different safety rules for different agents in the same cluster.
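A per-agent policy file might look like the following. The field names and rule structure are hypothetical, not Authensor's actual schema; the sketch only illustrates how one sidecar's rules can differ from another's.

```yaml
# Hypothetical policy file mounted into one agent's sidecar.
agent: billing-agent
rules:
  - match:
      tool: "http_request"
      url_pattern: "https://api.internal.example.com/*"
    action: allow
  - match:
      tool: "shell_exec"
    action: deny
default_action: deny   # anything not matched above is blocked
```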
The sidecar runs Authensor's control plane in local mode, loading policies from a ConfigMap. For lightweight deployments, it can run just the policy engine and Aegis scanner without a database connection, logging audit events to stdout for collection by your log aggregator.
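In the lightweight, database-free mode, audit logging can be as simple as one structured event per request on stdout. The event fields below are assumptions, not Authensor's actual audit schema; JSONL output is used because most log aggregators parse it natively.

```python
# Sketch of stdout audit logging for the lightweight (no-database) mode.
# Field names are illustrative assumptions.
import json
import sys
import time

def log_audit_event(tool: str, decision: str, reason: str) -> dict:
    """Emit one JSON audit record per request for collection by a log aggregator."""
    event = {
        "ts": time.time(),
        "tool": tool,
        "decision": decision,   # "allow" or "block"
        "reason": reason,
    }
    # One JSON object per line (JSONL) keeps downstream parsing trivial.
    sys.stdout.write(json.dumps(event) + "\n")
    return event

log_audit_event("http_request", "block", "url not on allow list")
```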
Set resource limits on the sidecar container. Policy evaluation typically needs less than 50 MB of memory and minimal CPU. Content scanning needs more resources if you enable ML-based detection.
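Following the guidance above, the sidecar's resource block in the pod spec might look like this. The exact values are starting points to tune, not recommendations from the source.

```yaml
# Example resource requests/limits for the sidecar container.
resources:
  requests:
    memory: "64Mi"
    cpu: "50m"
  limits:
    memory: "128Mi"   # raise if ML-based content scanning is enabled
    cpu: "250m"
```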
The localhost network hop costs less than a millisecond, and policy evaluation adds 1 to 5 milliseconds. Total overhead is typically under 10 milliseconds per request, which is negligible compared to LLM inference latency.
Use connection pooling between the sidecar and downstream services. The sidecar should not become a bottleneck for agent performance.
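The pooling idea can be sketched generically: the sidecar reuses a bounded set of downstream connections instead of opening a new one per request. The `ConnectionPool` class below is illustrative; a production sidecar would rely on its HTTP client's built-in pooling rather than hand-rolling one.

```python
# Generic connection-pool sketch (illustrative, not Authensor's implementation).
import queue

class ConnectionPool:
    """Bounded pool that hands out and reclaims reusable connections."""

    def __init__(self, factory, size: int = 4):
        self._pool = queue.LifoQueue(maxsize=size)
        for _ in range(size):
            self._pool.put(factory())   # pre-open connections up front

    def acquire(self):
        return self._pool.get()         # blocks if every connection is in use

    def release(self, conn) -> None:
        self._pool.put(conn)            # return the connection for reuse

# Demo with trivial stand-in "connection" objects.
pool = ConnectionPool(lambda: object(), size=2)
conn = pool.acquire()
pool.release(conn)
assert pool.acquire() is conn   # the same connection is reused, not recreated
```

The LIFO discipline keeps recently used (and therefore still-warm) connections in rotation, which is the usual default in HTTP client pools.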