Proving Ground · live adversarial arena

    We run adversaries against our Safe House 24/7 — in public.

    15 named red team agents probe CFD and CBD every hour with every technique in the field — prompt injection, BEC, indirect injection, data exfiltration, regulated-advice slip. What they break becomes a signed detection recipe. What they can't becomes evidence for your auditor.

    The adversarial flywheel

    Every probe we run makes the Safe House harder. Every bypass becomes a recipe. Every recipe ships to every customer.

    Red team probes

    15 named adversaries probe CFD and CBD 24/7 across 11 threat categories with 8 mutation operators (unicode, emoji, base64, crescendo, synonym, paraphrase, translate, structural).
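
    As an illustration of what a mutation operator does, here is a minimal sketch of two of the eight named above (unicode and base64). The function names, the choice of a zero-width space for the unicode operator, and the registry layout are assumptions made for this example, not the arena's actual implementation.

```python
import base64
from typing import Callable, Dict

ZERO_WIDTH_SPACE = "\u200b"  # invisible when rendered, but breaks naive substring matching


def mutate_unicode(text: str) -> str:
    """Hypothetical 'unicode' operator: interleave zero-width spaces between characters."""
    return ZERO_WIDTH_SPACE.join(text)


def mutate_base64(text: str) -> str:
    """Hypothetical 'base64' operator: encode the payload so keyword filters miss it."""
    return base64.b64encode(text.encode("utf-8")).decode("ascii")


# Illustrative registry; the real engine lists eight operators.
MUTATORS: Dict[str, Callable[[str], str]] = {
    "unicode": mutate_unicode,
    "base64": mutate_base64,
}


def generate_variants(seed: str) -> Dict[str, str]:
    """Apply every registered operator to one seed probe."""
    return {name: mutate(seed) for name, mutate in MUTATORS.items()}
```

    Calling generate_variants("ignore previous instructions") returns one variant per operator, each carrying the same payload in a different surface form.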

    Sideband analyzer

    Every bypass routes to a Claude Opus analyzer that classifies the miss, identifies the detector gap, and drafts a YAML detection recipe.

    Recipe promotion

    Recipes with confidence ≥ 0.90 auto-promote into a 48-hour validation window that must see zero false positives. Lower-confidence recipes enter the admin review queue.
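
    The routing rule above can be sketched roughly as follows. The 0.90 threshold and the 48-hour window come from the text; the class, function, and queue names are hypothetical and only illustrate the decision.

```python
from dataclasses import dataclass

AUTO_PROMOTE_THRESHOLD = 0.90  # from the text above
VALIDATION_WINDOW_HOURS = 48   # from the text above


@dataclass
class DraftRecipe:
    """Hypothetical shape of an analyzer-drafted recipe."""
    recipe_id: str
    confidence: float


def route_recipe(recipe: DraftRecipe) -> str:
    """Return the (assumed) queue name a drafted recipe is sent to."""
    if recipe.confidence >= AUTO_PROMOTE_THRESHOLD:
        return "validation_window"
    return "admin_review"


def passes_validation(false_positives: int, hours_observed: float) -> bool:
    """A recipe clears validation only after a full window with zero false positives."""
    return hours_observed >= VALIDATION_WINDOW_HOURS and false_positives == 0
```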

    Library update

    Promoted recipes join the live FingerprintMatcher index. MinHash signatures propagate across every customer via the opt-in Threat Network.
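
    For readers unfamiliar with MinHash, the sketch below shows the general technique: a fixed-length signature whose position-wise agreement estimates the Jaccard similarity between two texts, so near-duplicate attacks can be matched without sharing the raw text. The shingle size, the number of hash functions, and the use of salted BLAKE2b are assumptions for this example; the page does not describe how FingerprintMatcher computes its signatures.

```python
import hashlib
from typing import List, Set

NUM_HASHES = 64   # assumed signature length
SHINGLE_SIZE = 4  # assumed character n-gram size


def shingles(text: str, k: int = SHINGLE_SIZE) -> Set[str]:
    """Split case-folded, whitespace-normalized text into overlapping k-grams."""
    text = " ".join(text.lower().split())
    return {text[i:i + k] for i in range(max(1, len(text) - k + 1))}


def minhash_signature(text: str, num_hashes: int = NUM_HASHES) -> List[int]:
    """For each salted hash function, keep the minimum hash over all shingles."""
    grams = shingles(text)
    signature = []
    for seed in range(num_hashes):
        salt = seed.to_bytes(4, "big")  # a different salt gives a different hash function
        signature.append(min(
            int.from_bytes(
                hashlib.blake2b(g.encode("utf-8"), digest_size=8, salt=salt).digest(),
                "big",
            )
            for g in grams
        ))
    return signature


def estimated_similarity(a: List[int], b: List[int]) -> float:
    """The fraction of matching positions estimates Jaccard similarity."""
    return sum(x == y for x, y in zip(a, b)) / len(a)
```

    Only the integer signature needs to leave an organization, which is what makes this kind of fingerprint a plausible fit for a shared, opt-in index.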

    Harder probes

    The mutation engine seeds the next generation of probes from confirmed bypasses. The red team gets stronger, and so does the defense.

    What we don't claim

    What the arena doesn't prove — yet.

    Every public-facing claim on this page is backed by live data. The items below are known gaps we're shipping against.

    • Unicode + emoji evasion hardening — P0 in-flight. Research shows 70–88% bypass rates against production guardrails using zero-width characters and homoglyphs. We don't pretend that's closed.

    • Indirect injection fast-path coverage: tool results pass through L1 unscanned today. L2 semantic checks catch injections carried in those results, but the L1 fast path remains a gap.

    • Arena V2 sideband analyzer auto-promotion is live; Arena V2's customer-facing campaign view (cross-org pattern correlation) is pending.

    • CBD outbound DLP is wired for canary match and credential leak; the launder-detector and regulated-advice checker run asynchronously except on the Enforce-Sync and Sovereign tiers.
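
    To make the first gap in the list concrete, the sketch below shows one common mitigation for zero-width and homoglyph evasion: normalizing text before it reaches a detector. This is a generic technique, not a description of the hardening in flight here. The function name is hypothetical and the homoglyph table is a deliberately tiny sample.

```python
import unicodedata

# Zero-width and invisible code points to strip. Removing U+200D also breaks
# emoji ZWJ sequences, which is acceptable for a scanning copy of the text.
ZERO_WIDTH = dict.fromkeys(map(ord, "\u200b\u200c\u200d\u2060\ufeff"))

# Tiny illustrative sample of Cyrillic look-alikes; a real table is far larger.
HOMOGLYPHS = str.maketrans({
    "\u0430": "a",
    "\u0435": "e",
    "\u043e": "o",
    "\u0440": "p",
    "\u0441": "c",
})


def normalize_for_scanning(text: str) -> str:
    """Return a canonical copy of the text for detectors; the original is untouched."""
    stripped = text.translate(ZERO_WIDTH)               # drop invisible characters
    folded = unicodedata.normalize("NFKC", stripped)    # fold fullwidth and compatibility forms
    return folded.translate(HOMOGLYPHS)                 # map known look-alikes to ASCII
```

    Normalization narrows the evasion surface but does not close it: any look-alike missing from the table still passes through unchanged.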
