Kubernetes is the natural home for AI safety infrastructure that needs to scale with your agent fleet. Authensor's control plane, policy engine, and monitoring components map cleanly to Kubernetes primitives. This guide covers the deployment architecture.
Deploy Authensor's control plane as a Deployment with a horizontal pod autoscaler. The control plane is stateless (all state lives in PostgreSQL), so it scales horizontally without coordination. Set resource requests based on your expected request volume. A single pod handles roughly 2,000 policy evaluations per second.
PostgreSQL can run in-cluster as a StatefulSet or, preferably, as a managed database service outside the cluster. Audit receipt chains require durable storage with strong consistency guarantees, and managed databases typically handle backups, failover, and replication better than a self-operated in-cluster PostgreSQL.
Create a dedicated namespace for your safety infrastructure. This provides isolation and makes RBAC configuration straightforward.
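As a minimal sketch of the namespace and RBAC setup described above, the manifest below creates a namespace and grants a team read-only access to it. The namespace name `authensor` and the group name `safety-team` are assumptions chosen for illustration; `view` is the built-in Kubernetes read-only ClusterRole.

```yaml
# Dedicated namespace for the safety infrastructure.
# The name "authensor" is an illustrative assumption.
apiVersion: v1
kind: Namespace
metadata:
  name: authensor
  labels:
    app.kubernetes.io/part-of: authensor
---
# Read-only access to the namespace for a hypothetical "safety-team" group,
# using the built-in "view" ClusterRole scoped by a namespaced RoleBinding.
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: safety-team-view
  namespace: authensor
subjects:
  - kind: Group
    name: safety-team
    apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: ClusterRole
  name: view
  apiGroup: rbac.authorization.k8s.io
```

Because the RoleBinding is namespaced, the permissions stop at the namespace boundary even though the role itself is cluster-wide.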
The control plane Deployment needs environment variables for database connection, API keys, and feature flags like AUTHENSOR_AEGIS_ENABLED and AUTHENSOR_SENTINEL_ENABLED. Store these in a Secret resource.
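One way this could look is sketched below: a Secret holding the environment variables and a Deployment that loads them with `envFrom`. Only the two feature-flag names come from this guide. The `DATABASE_URL` variable name, the `"true"` flag values, the image reference, the container port, the labels, and the resource figures are all assumptions to be replaced with the values your Authensor release expects.

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: authensor-env            # hypothetical name
  namespace: authensor           # hypothetical namespace
type: Opaque
stringData:
  # Placeholder connection string; the variable name is an assumption.
  DATABASE_URL: "postgres://USER:PASSWORD@HOST:5432/authensor"
  # Feature flags named in this guide; the "true" value format is an assumption.
  AUTHENSOR_AEGIS_ENABLED: "true"
  AUTHENSOR_SENTINEL_ENABLED: "true"
  # Add API key variables here under the names your installation expects.
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: authensor-control-plane
  namespace: authensor
spec:
  replicas: 3
  selector:
    matchLabels:
      app: authensor-control-plane
  template:
    metadata:
      labels:
        app: authensor-control-plane
    spec:
      containers:
        - name: control-plane
          # Placeholder image reference, not a published image name.
          image: registry.example.com/authensor/control-plane:TAG
          ports:
            - name: http
              containerPort: 3000   # assumed port
          envFrom:
            # Every key in the Secret becomes an environment variable.
            - secretRef:
                name: authensor-env
          resources:
            requests:
              cpu: 500m             # illustrative; size from measured load
              memory: 256Mi
```

Using `stringData` keeps the manifest readable, but the values are still only base64-encoded at rest unless the cluster has encryption at rest enabled, so treat the file itself as sensitive.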
Expose the control plane through a ClusterIP Service for internal traffic or a LoadBalancer Service if agents run outside the cluster. Use network policies to restrict which pods can reach the safety API.
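The internal-traffic case might be sketched as follows: a ClusterIP Service in front of the control plane, plus a NetworkPolicy that admits only agent pods. The `agents` namespace, the `role: agent` label, the `app: authensor-control-plane` selector, and port 3000 are assumptions for illustration.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: authensor-control-plane
  namespace: authensor
spec:
  type: ClusterIP                # switch to LoadBalancer for agents outside the cluster
  selector:
    app: authensor-control-plane
  ports:
    - name: http
      port: 80
      targetPort: 3000           # assumed container port
---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-agents-to-control-plane
  namespace: authensor
spec:
  # Applies to the control plane pods; all other ingress to them is denied.
  podSelector:
    matchLabels:
      app: authensor-control-plane
  policyTypes:
    - Ingress
  ingress:
    - from:
        # Both selectors sit in one list item, so a peer must match both:
        # pods labelled role=agent inside the "agents" namespace.
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: agents
          podSelector:
            matchLabels:
              role: agent
      ports:
        - protocol: TCP
          port: 3000
```

NetworkPolicy resources are only enforced when the cluster's network plugin supports them, so it is worth confirming that a disallowed pod is actually blocked.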
Configure liveness and readiness probes against the control plane's health endpoint. Give the readiness probe a shorter interval than the liveness probe, so that unhealthy pods are removed from the Service endpoints quickly without being restarted prematurely.
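A probe configuration along these lines could be added to the control plane container spec. The `/health` path and port 3000 are assumptions, since this guide does not name the endpoint; the timing values are starting points rather than recommendations from Authensor.

```yaml
# Fragment of the control plane container spec (goes under containers[0]).
livenessProbe:
  httpGet:
    path: /health                # assumed health endpoint path
    port: 3000                   # assumed container port
  initialDelaySeconds: 10
  periodSeconds: 15              # slower check; failure restarts the container
  failureThreshold: 3
readinessProbe:
  httpGet:
    path: /health
    port: 3000
  periodSeconds: 5               # faster check; failure removes the pod from endpoints
  failureThreshold: 2
```

With these values a pod stops receiving traffic after roughly 10 seconds of failed readiness checks, while a restart requires about 45 seconds of failed liveness checks.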
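Use a PodDisruptionBudget to ensure at least two replicas remain available during node drains and upgrades. The budget only governs voluntary disruptions, so run at least three replicas; with exactly two, a drain cannot evict any pod without violating the budget. Safety infrastructure should be designed so that routine maintenance never causes downtime.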
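A minimal PodDisruptionBudget matching this description might look like the following. The namespace and the `app: authensor-control-plane` label are assumptions and must match the labels on your control plane pods.

```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: authensor-control-plane
  namespace: authensor           # hypothetical namespace
spec:
  # Evictions are refused if they would leave fewer than two ready pods.
  minAvailable: 2
  selector:
    matchLabels:
      app: authensor-control-plane   # assumed pod label
```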
The policy engine is CPU-bound. Aegis content scanning is the heaviest operation. If you use ML-based detection, consider running Aegis as a separate deployment with GPU-enabled nodes while keeping the policy engine on standard compute.
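If Aegis is split out, a separate GPU-backed Deployment could be sketched roughly as below. Everything specific here is an assumption: the image reference, the `accelerator: nvidia` node label, and the taint key are placeholders. The `nvidia.com/gpu` resource name requires the NVIDIA device plugin to be installed on the GPU nodes.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: authensor-aegis          # hypothetical name
  namespace: authensor
spec:
  replicas: 2
  selector:
    matchLabels:
      app: authensor-aegis
  template:
    metadata:
      labels:
        app: authensor-aegis
    spec:
      # Schedule only onto GPU nodes; the label is an assumed convention.
      nodeSelector:
        accelerator: nvidia
      # Tolerate a taint commonly placed on GPU nodes to keep other pods off.
      tolerations:
        - key: nvidia.com/gpu
          operator: Exists
          effect: NoSchedule
      containers:
        - name: aegis
          # Placeholder image reference, not a published image name.
          image: registry.example.com/authensor/aegis:TAG
          resources:
            limits:
              nvidia.com/gpu: 1  # one GPU per pod; needs the NVIDIA device plugin
```

Keeping the policy engine in its own Deployment means the two components can scale independently, and the GPU pool only grows with scanning load.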
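Set autoscaling thresholds based on request latency rather than CPU utilization. The HorizontalPodAutoscaler only scales on CPU and memory out of the box, so a latency signal has to be exposed through a custom or external metrics adapter. A p95 latency target of 50 milliseconds for policy evaluation is a reasonable starting point.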
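A latency-driven autoscaler could be sketched as follows, assuming a metrics adapter (for example, one backed by Prometheus) already serves a per-pod metric through the custom metrics API. The metric name `policy_eval_latency_p95_seconds` is hypothetical, as are the replica bounds.

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: authensor-control-plane
  namespace: authensor           # hypothetical namespace
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: authensor-control-plane
  minReplicas: 3
  maxReplicas: 20
  metrics:
    - type: Pods
      pods:
        metric:
          # Hypothetical per-pod metric served by a custom metrics adapter.
          name: policy_eval_latency_p95_seconds
        target:
          type: AverageValue
          averageValue: 50m      # 0.050 in Kubernetes quantity notation, i.e. 50 ms
  behavior:
    scaleDown:
      # Wait five minutes of sustained low latency before removing pods.
      stabilizationWindowSeconds: 300
```

The autoscaler averages the metric across pods, so this approximates a fleet-wide p95 rather than computing it exactly. If that matters, an External metric carrying a single aggregated percentile is an alternative.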
Monitor pod restart counts and OOM kills. Safety infrastructure failures should trigger immediate alerts through your existing Kubernetes monitoring stack.
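If the existing monitoring stack is Prometheus with kube-state-metrics (an assumption, since this guide does not name one), alerts for restarts and OOM kills might be expressed with rules like these. The `authensor` namespace and the `severity: page` label are illustrative.

```yaml
# Prometheus alerting rules file; relies on kube-state-metrics being scraped.
groups:
  - name: authensor-pod-health
    rules:
      - alert: AuthensorPodRestarting
        # Fires when any container in the namespace restarted in the last 15 minutes.
        expr: increase(kube_pod_container_status_restarts_total{namespace="authensor"}[15m]) > 0
        labels:
          severity: page
        annotations:
          summary: "Container {{ $labels.container }} in pod {{ $labels.pod }} restarted"
      - alert: AuthensorPodOOMKilled
        # Fires when a container's most recent termination was an OOM kill.
        expr: kube_pod_container_status_last_terminated_reason{namespace="authensor", reason="OOMKilled"} == 1
        labels:
          severity: page
        annotations:
          summary: "Container {{ $labels.container }} in pod {{ $labels.pod }} was OOM killed"
```

Both rules omit a `for:` clause so that they fire immediately, in line with the goal of immediate alerts; add one if transient restarts during rollouts prove too noisy.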