SDK · Guardrails · Content Safety · Tutorial

LlamaIndex Safety for RAG Pipelines

Authensor

Retrieval-Augmented Generation (RAG) pipelines built with LlamaIndex face unique safety challenges. Retrieved documents can contain prompt injection payloads. Queries can be crafted to extract sensitive information. Authensor adds safety layers at multiple points in the RAG pipeline to address these risks.

RAG-Specific Threats

Indirect prompt injection through documents is the primary risk. An attacker plants instructions in a document that gets indexed. When that document is retrieved, the instructions enter the model's context and can hijack the agent's behavior.

Data exfiltration through queries occurs when an attacker crafts queries designed to extract specific information from the index, probing for sensitive data that should not be accessible.

Poisoned retrieval manipulates the retrieval step to surface attacker-controlled content over legitimate results.

Query Scanning

Scan user queries with Aegis before they reach the retrieval engine. Check for injection patterns, unusually long queries (potential context stuffing), and queries that probe for specific document types or fields.

from authensor import AuthensorClient

client = AuthensorClient(url="https://your-instance.com")

def check_query(user_query: str):
    """Scan a user query before it reaches the retrieval engine."""
    scan_result = client.scan(content=user_query)
    if scan_result.blocked:
        return "Query blocked by safety policy"
    return None  # query passed; safe to proceed to retrieval

Retrieved Document Scanning

After retrieval and before the documents enter the model's context, scan each retrieved chunk for prompt injection payloads. This catches indirect injection attempts regardless of when the malicious document was indexed.

Authensor's Aegis scanner checks retrieved chunks for instruction-like content, role-play prompts, and other injection patterns. Flag or remove contaminated chunks before they reach the model.
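The filtering step can be sketched as a function that runs each retrieved chunk through a scanner and drops anything flagged. This is a minimal, self-contained sketch: `Chunk` is a stand-in for your pipeline's node type, and `scan` is any callable with the same shape as the `client.scan` call shown earlier (returning an object with a `.blocked` attribute).

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Chunk:
    """Stand-in for a retrieved node/chunk from the index."""
    text: str

def filter_unsafe_chunks(chunks: List[Chunk], scan: Callable) -> List[Chunk]:
    """Return only the chunks that pass the safety scan.

    `scan` is called per chunk and must return an object with a
    boolean `.blocked` attribute (e.g. client.scan).
    """
    safe = []
    for chunk in chunks:
        result = scan(content=chunk.text)
        if not result.blocked:
            safe.append(chunk)  # clean chunk: allow into model context
        # flagged chunks are dropped; you could also log or quarantine them
    return safe
```

In a LlamaIndex pipeline this logic would typically live in a node postprocessor, so it runs automatically between retrieval and synthesis.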

Output Validation

After the model generates a response, validate that it does not contain information from restricted document categories. Use classification to check whether the response references documents the user should not have access to.
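One way to implement this check, sketched below under stated assumptions: the pipeline records which source documents the response drew on, and a `classify` callable (a hypothetical hook for whatever classifier you use) labels the response text. Both the access check and the category check must pass.

```python
from typing import Callable, Set

def validate_response(
    response_text: str,
    source_ids: Set[str],
    allowed_ids: Set[str],
    classify: Callable[[str], str],
    restricted_categories: Set[str],
) -> bool:
    """Return True only if the response is safe to show this user.

    1. Access check: every source document the response drew on
       must be in the user's allowed set.
    2. Content check: the classifier's label for the response text
       must not be a restricted category.
    """
    if not source_ids <= allowed_ids:
        return False
    return classify(response_text) not in restricted_categories
```

If either check fails, withhold or regenerate the response rather than returning it as-is.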

Index-Time Scanning

For the strongest defense, scan documents at indexing time. Flag documents that contain potential injection payloads. Store safety metadata alongside document embeddings so the retrieval step can filter out flagged documents.
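A minimal sketch of that flow, assuming documents are represented as dicts with `text` and `metadata` keys (swap in your index's document type) and reusing the same scanner shape as above: flag at index time, then filter flagged documents at retrieval time.

```python
from typing import Callable, Dict, List

Doc = Dict  # stand-in: {"text": str, "metadata": dict}

def scan_at_index_time(docs: List[Doc], scan: Callable) -> List[Doc]:
    """Attach a safety flag to each document's metadata before indexing.

    The flag is stored alongside the document so it travels with the
    embedding and is available at retrieval time.
    """
    for doc in docs:
        doc["metadata"]["safety_flagged"] = scan(content=doc["text"]).blocked
    return docs

def filter_flagged(candidates: List[Doc]) -> List[Doc]:
    """At retrieval time, drop any document flagged during indexing."""
    return [d for d in candidates if not d["metadata"].get("safety_flagged", False)]
```

Because the flag is computed once at ingestion, this check adds no per-query scanning cost for already-indexed documents.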

Policy Integration

Define Authensor policies that govern RAG pipeline behavior. Restrict which document collections each user or agent can query. Set maximum retrieval counts to limit context size. Require that retrieved documents pass safety scanning before inclusion.
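Actual Authensor policies are presumably defined on the platform side; as an illustration of the enforcement logic, here is a local sketch with a hypothetical in-code policy table mapping users to allowed collections and a retrieval cap.

```python
# Hypothetical policy table: which collections each user may query,
# and a cap on how many chunks a single query may retrieve.
POLICY = {
    "alice": {"collections": {"public", "hr"}, "max_retrieval": 5},
    "default": {"collections": {"public"}, "max_retrieval": 3},
}

def enforce_policy(user: str, collection: str, top_k: int) -> int:
    """Check collection access and clamp the retrieval count.

    Raises PermissionError if the user may not query the collection;
    otherwise returns the effective top_k after applying the cap.
    """
    rules = POLICY.get(user, POLICY["default"])
    if collection not in rules["collections"]:
        raise PermissionError(f"{user} may not query {collection}")
    return min(top_k, rules["max_retrieval"])
```

Running this check before every retrieval call keeps per-user restrictions and context-size limits in one place instead of scattered across the pipeline.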

The combination of query scanning, document scanning, and output validation creates defense in depth for your RAG pipeline.
