Open-source · MIT licensed

Free AI safety stack.
Frontier adversarial red teaming.

Gate every agent action, scan every input, and keep a tamper-evident audit trail. Open-source, self-hostable, and hardened by frontier adversarial research across the AI/ML ecosystem.

MIT licensed · Self-hostable · 18 packages on npm
Track record

We break AI safety systems.
Then we build better ones.

Offensive security methodology applied to AI safety evaluation. Find the gaps. Document them. Ship the fix.

NVIDIA · Microsoft · Meta · Google
HuggingFace · OpenAI · PyTorch · vLLM · Ray

01

Systematic audit of AI/ML infrastructure

ongoing

Adversarial analysis of production ML stacks from NVIDIA, Microsoft, Meta, Google, HuggingFace, OpenAI, and dozens of others.

02

Two novel vulnerability classes

disclosing

Previously undocumented classes in model serialization formats and sandboxed code execution. Details pending coordinated disclosure.

03

Disclosures across the ML ecosystem

coordinated

Critical and high-severity findings in PyTorch, DeepSpeed, Ray, Ollama, vLLM, and LangChain, coordinated with maintainers.

04

ControlArena contributor — UK AISI

merged

Merged PR #798 to UK AISI's ControlArena: agents evading their own safety monitors via prompt injection.

Red teaming

We attack the systems
meant to keep AI safe.

The pipeline we point at the AI/ML ecosystem, pointed at your systems. Scoped engagements, CVE-quality findings, full reproduction steps.

01 · ML stack

ML Infrastructure Audit

Adversarial analysis of your ML stack — deserialization, injection, auth bypass, model-format exploits, supply chain. The methodology we use to break production systems at NVIDIA, Microsoft, Meta, and Google, pointed at yours.

02 · Evaluators

AI Safety Evaluation Testing

We test the evaluators. Monitor bypass, compound-judge failures, signal dilution, sandbox escapes. Your safety infrastructure is an attack surface — we prove it before an adversarial agent does.

03 · Agents

Agent Red Teaming

Systematic adversarial campaigns against your agents and tool integrations. Privilege escalation, exfiltration, goal hijacking, memory poisoning. CVE-quality findings with reproduction steps.

Engagement model: scope, test, report · 2–4 weeks
Book a Scoping Call
Free safety stack

The tooling we built to do our job.
Now yours.

Free, MIT-licensed packages forged in our own adversarial research. Download the full stack, or install only the tools you need.

$ npx @authensor/create-authensor my-agent
INTEGRATION

One line. Full safety.

Wrap any agent action with guard(), and policy evaluation, content scanning, and audit logging happen automatically.

terminal
# Download the full safety stack
npx @authensor/create-authensor my-agent
cd my-agent && npm install

# Or install individual tools:
npm install @authensor/aegis        # Content scanner
npm install @authensor/sentinel     # Behavioral monitor
npm install @authensor/engine       # Policy engine
npm install @authensor/mcp-server   # MCP Gateway
npm install @authensor/redteam      # Red team harness
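
To make the guard() idea concrete, here is a minimal, self-contained sketch of the pattern: evaluate a policy, scan the input, append to a hash-chained audit log, and only then run the action. It is an illustration only and does not use or reproduce the @authensor API. Every name in it (Action, evaluatePolicy, scanContent, verifyAuditLog, the allowlist, the regex) is a hypothetical stand-in, and this guard() is synchronous purely to keep the example short.

```typescript
import { createHash } from "node:crypto";

// Hypothetical types for illustration; not the @authensor API.
type Action = { tool: string; input: string };
type Decision = "allow" | "deny";
type AuditEntry = {
  action: Action;
  decision: Decision;
  reason: string;
  prevHash: string; // hash of the previous entry, which chains the log
  hash: string;     // hash over prevHash plus this entry's fields
};

const GENESIS = "0".repeat(64);
const auditLog: AuditEntry[] = [];
let lastHash = GENESIS;

// Toy policy: an allowlist of tool names (assumed for this sketch).
const allowedTools = new Set(["search", "read_file"]);
function evaluatePolicy(action: Action): boolean {
  return allowedTools.has(action.tool);
}

// Toy content scan: flags a single well-known injection phrase.
const suspicious = /ignore (all )?previous instructions/i;
function scanContent(input: string): boolean {
  return !suspicious.test(input);
}

function hashEntry(prevHash: string, action: Action, decision: Decision, reason: string): string {
  const payload = JSON.stringify([prevHash, action.tool, action.input, decision, reason]);
  return createHash("sha256").update(payload).digest("hex");
}

// Append an entry whose hash depends on the previous entry's hash.
function record(action: Action, decision: Decision, reason: string): void {
  const hash = hashEntry(lastHash, action, decision, reason);
  auditLog.push({ action: { ...action }, decision, reason, prevHash: lastHash, hash });
  lastHash = hash;
}

// Recompute the chain; any edited, removed, or reordered entry breaks it.
function verifyAuditLog(log: AuditEntry[]): boolean {
  let prev = GENESIS;
  for (const entry of log) {
    const expected = hashEntry(prev, entry.action, entry.decision, entry.reason);
    if (entry.prevHash !== prev || entry.hash !== expected) return false;
    prev = entry.hash;
  }
  return true;
}

// Gate one action: policy check, content scan, audit record, then run.
function guard<T>(action: Action, run: () => T): T {
  if (!evaluatePolicy(action)) {
    record(action, "deny", "policy");
    throw new Error(`Blocked by policy: ${action.tool}`);
  }
  if (!scanContent(action.input)) {
    record(action, "deny", "content");
    throw new Error("Blocked by content scan");
  }
  record(action, "allow", "ok");
  return run();
}
```

In this sketch, a call such as guard({ tool: "search", input: "weather in Paris" }, () => runSearch()) runs the callback only after both checks pass, where runSearch is a placeholder for your own tool call. Denied actions throw, and every decision lands in auditLog. Chaining each entry to the previous hash is one common way to make a log tamper-evident: verifyAuditLog(auditLog) returns false if an entry is altered after the fact.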
WORKS WITH
LangChain · OpenAI · CrewAI · Claude · Vercel AI SDK · MCP
WHO WE ARE

Adversarial AI safety researchers.

We apply offensive security methodology to AI safety evaluation. Penetration testing for guardrails. Red teaming for agents. Adversarial probing for classifiers.

The safety stack is our toolkit, open-sourced. The red teaming is what we do with it.

195+ repos audited. 350+ verified vulnerabilities. 166 responsible disclosures. 2 novel vulnerability classes discovered. ControlArena contributor (UK AISI).

Two ways to work with us.

Download the free safety stack. Or hire the team that built it to red team your systems.