AI Guardrail Rule Tester


Actions

- block: Reject the entire message
- flag: Allow but flag for review
- redact: Replace matched text with [REDACTED]
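The three actions can be sketched as a small dispatcher. This is a minimal illustration assuming a regex-based matcher; `apply_action` and the result shape are assumptions for this sketch, not the tool's actual API.

```python
import re

def apply_action(action: str, text: str, pattern: str) -> dict:
    """Apply a guardrail action to text that matches a regex pattern."""
    if not re.search(pattern, text, re.IGNORECASE):
        return {"action": "pass", "text": text}      # no match: let it through
    if action == "block":
        return {"action": "block", "text": None}     # reject the entire message
    if action == "flag":
        return {"action": "flag", "text": text}      # allow but mark for review
    if action == "redact":
        redacted = re.sub(pattern, "[REDACTED]", text, flags=re.IGNORECASE)
        return {"action": "redact", "text": redacted}
    raise ValueError(f"unknown action: {action}")

result = apply_action("redact", "my api key is abc123", r"api.?key")
```

The dispatcher returns the surviving text alongside the action taken, so downstream middleware can decide whether to forward, queue for review, or drop the message.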

What This Tool Does

AI Guardrail Rule Tester is built for deterministic developer and agent workflows.

Build and test AI guardrail rules with instant feedback — preset PII, injection, and safety patterns.

See How to Use below for execution steps, and the FAQ for constraints, policies, and edge cases.


This tool is provided as-is for convenience. Output should be verified before use in any production or critical context.

Agent Invocation

Best Path For Builders

Browser workflow

Runs instantly in the browser with private local processing and copy/export-ready output.


/guardrail-rule-tester/

For automation planning, fetch the canonical contract at /api/tool/guardrail-rule-tester.json.

How to Use AI Guardrail Rule Tester

  1. Build a rule to block harmful outputs

     Create a rule: pattern='jailbreak attempt' OR intent='ignore_instructions'. Add trigger words to match. Test against candidate outputs. If matched, the rule blocks it. Use for content policy enforcement.

  2. Test guardrail rules against real LLM outputs

     Generate sample outputs from your LLM and paste them into the tester. Run each guardrail rule. See which rules trigger, why, and what to adjust. Catch false positives before production.

  3. Build cascading guardrails with rule priority

     Create multiple rules (safety, legal, brand). Assign priority: safety rules run first, brand rules last. The tester shows rule execution order and which rule blocked an output first.

  4. Use regex patterns for flexible matching

     Instead of exact strings, use regex patterns like 'password|api.?key|secret' to catch common sensitive data. The tester shows which patterns matched in your test output.

  5. Export rules as JSON config for deployment

     Once rules work in the tester, export them as JSON. Deploy to your LLM server/middleware to enforce guardrails in production. Rules are deterministic; no additional LLM calls are needed.

Frequently Asked Questions

What are AI guardrails?
Guardrails are rules that filter, flag, or redact content in AI inputs and outputs. They prevent PII leaks, prompt injection attacks, harmful content generation, and policy violations in production AI applications.
What preset rule packs are available?
PII detection (SSN, email, phone, credit card patterns), prompt injection patterns ('ignore previous instructions', role confusion), jailbreak attempts, and code injection detection. Load any pack quickly.
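The PII patterns in a pack like this might look roughly as follows. These regexes are illustrative and deliberately loose, not the tool's exact presets.

```python
import re

# Assumed, simplified PII patterns; production patterns need more care
# (e.g. international phone formats, Luhn checks for card numbers).
PII_PATTERNS = {
    "ssn": r"\b\d{3}-\d{2}-\d{4}\b",
    "email": r"\b[\w.+-]+@[\w-]+\.[\w.]+\b",
    "phone": r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b",
    "credit_card": r"\b(?:\d{4}[-\s]?){3}\d{4}\b",
}

def find_pii(text: str) -> list[str]:
    """Return the names of PII patterns that match the text."""
    return [name for name, pat in PII_PATTERNS.items() if re.search(pat, text)]
```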
Can I export rules for NeMo Guardrails?
Yes, export your tested rules as JSON that can be adapted for NVIDIA NeMo Guardrails, LLM Guard, or any custom guardrail framework.
Does this tool detect prompt injection?
It tests YOUR rules against sample text. The preset injection patterns catch common attacks like 'ignore previous instructions', 'you are now', and role confusion attempts. You can customize and extend these patterns.
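A minimal sketch of injection-pattern matching along these lines; the pattern strings below are assumptions modeled on the attacks named above, not the preset pack's actual contents.

```python
import re

# Illustrative injection patterns; real packs cover many more variants.
INJECTION_PATTERNS = [
    r"ignore (all |any )?(previous|prior|above) instructions",
    r"you are now\b",  # role-confusion attempts
    r"disregard (your|the) (rules|guidelines|system prompt)",
]

def detect_injection(text: str) -> bool:
    """True if any injection pattern matches the text, case-insensitively."""
    return any(re.search(p, text, re.IGNORECASE) for p in INJECTION_PATTERNS)
```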
Is my test data private?
Privacy-first by design. All rule testing runs in your browser. No data is sent to any server. Your test content and rules stay on your machine.