
Property-Based Testing for AI Policies

Authensor

Property-based testing verifies that a system satisfies certain properties across a wide range of randomly generated inputs. Instead of writing specific test cases with specific expected results, you define properties that must hold for all inputs and let the testing framework generate thousands of test cases automatically.

Properties for Safety Policies

Several properties should hold for any well-formed safety policy:

Determinism: The same envelope evaluated against the same policy should always produce the same result. A failure here exposes hidden state or nondeterminism in the engine. Passing runs raise confidence, but repeated sequential calls cannot prove the absence of race conditions.

Fail-closed default: Any envelope with an unrecognized action type should be denied. This property checks the default-deny behavior across arbitrary generated action strings, not only the handful you would think to list by hand.

Deny dominance: If a deny rule matches an envelope, the result should be deny regardless of any allow rules that also match. This property verifies the conflict resolution strategy.

Monotonic restriction: Adding a deny rule to a policy should never cause a previously denied action to become allowed. This property verifies that deny rules do not interact in unexpected ways.
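The four properties above can be sketched against a toy evaluator. To keep the example free of dependencies, a hand-rolled seeded generator stands in for a property-based testing library, and the `Rule`, `Policy`, and `evaluate` definitions are hypothetical stand-ins for illustration, not the Authensor engine's API.

```typescript
// Toy model for illustration only; not the @authensor/engine API.
type Effect = "allow" | "deny";
interface Rule { effect: Effect; action: string }
interface Policy { default_effect: Effect; rules: Rule[] }
interface Envelope { action: string; principal: string; resource: string }

// Deny wins over allow; anything unmatched falls through to the default.
function evaluate(policy: Policy, envelope: Envelope): Effect {
  const matching = policy.rules.filter((r) => r.action === envelope.action);
  if (matching.some((r) => r.effect === "deny")) return "deny";
  if (matching.some((r) => r.effect === "allow")) return "allow";
  return policy.default_effect;
}

// Small seeded linear congruential generator so runs are reproducible.
function lcg(seed: number): () => number {
  let state = seed >>> 0;
  return () => {
    state = (Math.imul(state, 1664525) + 1013904223) >>> 0;
    return state / 4294967296;
  };
}

function pick<T>(rand: () => number, items: T[]): T {
  return items[Math.floor(rand() * items.length)];
}

const ACTIONS = ["read", "write", "delete", "exec"];
const EFFECTS: Effect[] = ["allow", "deny"];

function randomPolicy(rand: () => number): Policy {
  const count = Math.floor(rand() * 6); // 0 to 5 rules
  const rules: Rule[] = [];
  for (let i = 0; i < count; i++) {
    rules.push({ effect: pick(rand, EFFECTS), action: pick(rand, ACTIONS) });
  }
  return { default_effect: "deny", rules };
}

function randomEnvelope(rand: () => number): Envelope {
  // "unknown" never appears in any rule, so it exercises the default path.
  const action = pick(rand, ACTIONS.concat(["unknown"]));
  return { action, principal: "agent", resource: "resource" };
}

// Returns the names of any properties violated across `runs` random cases.
function checkProperties(runs: number, seed: number): string[] {
  const rand = lcg(seed);
  const violations: string[] = [];
  const note = (name: string) => {
    if (violations.indexOf(name) === -1) violations.push(name);
  };
  for (let i = 0; i < runs; i++) {
    const policy = randomPolicy(rand);
    const env = randomEnvelope(rand);
    const result = evaluate(policy, env);
    const matching = policy.rules.filter((r) => r.action === env.action);

    if (evaluate(policy, env) !== result) note("determinism");
    if (matching.length === 0 && result !== "deny") note("fail-closed");
    if (matching.some((r) => r.effect === "deny") && result !== "deny") {
      note("deny-dominance");
    }
    // Add one more deny rule; a previously denied action must stay denied.
    const extra: Rule = { effect: "deny", action: pick(rand, ACTIONS) };
    const stricter: Policy = {
      default_effect: policy.default_effect,
      rules: policy.rules.concat([extra]),
    };
    if (result === "deny" && evaluate(stricter, env) !== "deny") {
      note("monotonic-restriction");
    }
  }
  return violations;
}

console.log(checkProperties(1000, 42)); // prints [] for this toy evaluator
```

Each check is phrased as an implication ("if a deny rule matches, then the result is deny"), so it holds for any generated policy rather than for one hand-picked case.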

Implementation

The determinism property can be expressed with a property-based testing library such as fast-check for TypeScript:

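import fc from 'fast-check';
import { PolicyEngine } from '@authensor/engine';

// Assumption: a small action alphabet and a minimal rule shape, for illustration.
// Adjust both to match your real policy schema.
const actionChars = ['a', 'b', 'c', '.', ':'];
const actionArb = fc
  .array(fc.constantFrom(...actionChars), { maxLength: 12 })
  .map((chars) => chars.join(''));

const ruleArbitrary = fc.record({
  effect: fc.constantFrom('allow', 'deny'),
  action: actionArb,
});

// Policies always default to deny, with up to 20 generated rules.
const policyArb = fc.record({
  default_effect: fc.constant('deny'),
  rules: fc.array(ruleArbitrary, { minLength: 0, maxLength: 20 }),
});

// Envelopes draw actions from the same alphabet so rules can actually match.
const envelopeArb = fc.record({
  action: actionArb,
  principal: fc.string(),
  resource: fc.string(),
});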

test('determinism: same input always gives same result', () => {
  fc.assert(
    fc.property(policyArb, envelopeArb, (policy, envelope) => {
      const engine = new PolicyEngine(policy);
      const result1 = engine.evaluate(envelope);
      const result2 = engine.evaluate(envelope);
      return result1.effect === result2.effect;
    })
  );
});

Shrinking

When a property-based test finds a failing case, the framework automatically shrinks the input to the smallest case that still fails. This makes debugging much easier because you see the minimal input that triggers the problem rather than a large, complex random input.
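To make the idea concrete, here is a minimal sketch of shrinking, assuming a deliberately buggy toy evaluator in which the first matching rule wins. The greedy shrinker below is a simplification written for this example; real frameworks use more sophisticated strategies built into their generators, and none of these names come from the Authensor engine.

```typescript
type Effect = "allow" | "deny";
interface Rule { effect: Effect; action: string }

// Deliberately buggy toy evaluator: the first matching rule wins, so an
// earlier allow can mask a later deny. Not the @authensor/engine API.
function buggyEvaluate(rules: Rule[], action: string): Effect {
  for (const rule of rules) {
    if (rule.action === action) return rule.effect;
  }
  return "deny";
}

// Deny dominance: if any deny rule matches, the result must be deny.
function denyDominanceHolds(rules: Rule[], action: string): boolean {
  const denyMatches = rules.some(
    (r) => r.effect === "deny" && r.action === action
  );
  return !denyMatches || buggyEvaluate(rules, action) === "deny";
}

// Greedy shrinker: keep dropping single rules while the property still fails.
function shrink(rules: Rule[], action: string): Rule[] {
  let current = rules;
  let progress = true;
  while (progress) {
    progress = false;
    for (let i = 0; i < current.length; i++) {
      const candidate = current.slice(0, i).concat(current.slice(i + 1));
      if (!denyDominanceHolds(candidate, action)) {
        current = candidate; // still failing, so keep the smaller input
        progress = true;
        break;
      }
    }
  }
  return current;
}

// A larger failing input, written out by hand so the example is reproducible.
const failing: Rule[] = [
  { effect: "deny", action: "read" },
  { effect: "allow", action: "exec" },
  { effect: "allow", action: "write" },
  { effect: "deny", action: "delete" },
  { effect: "deny", action: "write" },
  { effect: "allow", action: "read" },
];

const minimal = shrink(failing, "write");
console.log(minimal);
// prints the two rules that matter: allow "write" followed by deny "write"
```

The six-rule input shrinks to two rules, which points directly at the bug: rule order decides the outcome when it should not.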

Combining with Unit Tests

Property-based tests complement unit tests. Unit tests verify specific known scenarios. Property-based tests explore the space of possible inputs to find scenarios you did not think of. Use both.

When Properties Fail

A failing property indicates a fundamental problem: the system does not satisfy a correctness invariant. Fix these issues with high priority because they represent systematic vulnerabilities, not edge cases.

Property-based testing finds the bugs you did not know you should look for.
