The Safe House

    Every message crossing your agent perimeter is screened, classified, and cryptographically signed. Inbound prompt injection, social engineering, and coercion are blocked before the agent ever sees them. Outbound PII, secrets, and Alignment Card violations are caught before they leave.

    What the Safe House protects against.

    Two detector families, one perimeter. Inbound and outbound screening with signed verdicts on every decision.

    Inbound

    CFD — Content-Flow Detection

    Every message reaching your agent is screened for adversarial intent. Prompt injection, social engineering, and coerced tool calls are detected and blocked before the agent processes them.

    • Prompt injection
      • Hidden instructions in retrieved documents or tool responses.
      • Role-swap attacks ("ignore previous instructions…").
      • Fresh-template injections tracked via the Learning Network fingerprint corpus.
    • Social engineering
      • CEO-fraud style requests impersonating authorized users.
      • Urgency and authority-pressure patterns designed to skip approvals.
      • Indirect coaxing that tries to reshape the agent's declared scope.
    • Context poisoning
      • Manipulated memory or vector-store payloads.
      • Poisoned tool responses that carry hidden follow-on instructions.
      • Adversarial summaries fed back into long-horizon plans.
    • Tool-call coercion
      • Attempts to force calls outside the Alignment Card's permitted scope.
      • Chained tool calls that smuggle an unauthorized action.
      • Argument-shape attacks targeting under-validated tool schemas.
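
    The tool-call coercion checks above can be pictured as a gate that compares each requested call against the card's permitted scope. The sketch below is illustrative only: the `AlignmentCard` shape, the `check_tool_call` and `check_chain` helpers, and the tool names are hypothetical, not the Safe House API.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class AlignmentCard:
    """Hypothetical card: maps each permitted tool to its expected argument types."""
    permitted_tools: dict = field(default_factory=dict)


def check_tool_call(card, tool, args):
    """Return (allowed, reason) for a single requested tool call."""
    schema = card.permitted_tools.get(tool)
    if schema is None:
        return False, f"tool '{tool}' is outside the card's permitted scope"

    # Argument-shape check: reject extra or missing arguments outright.
    unknown = sorted(set(args) - set(schema))
    if unknown:
        return False, f"unexpected arguments: {unknown}"
    missing = sorted(set(schema) - set(args))
    if missing:
        return False, f"missing arguments: {missing}"

    # Type check each argument against the declared schema.
    for name, expected_type in schema.items():
        if not isinstance(args[name], expected_type):
            return False, f"argument '{name}' must be {expected_type.__name__}"
    return True, "ok"


def check_chain(card, calls):
    """Reject a whole chain of (tool, args) calls if any step is out of scope."""
    for index, (tool, args) in enumerate(calls):
        allowed, reason = check_tool_call(card, tool, args)
        if not allowed:
            return False, f"step {index}: {reason}"
    return True, "ok"
```

    Checking the chain as a unit, rather than call by call, is one way to stop an unauthorized action from being smuggled in behind several permitted ones.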

    Outbound

    CBD — Content-Boundary Detection

    Every agent response is screened for PII, secrets, and Alignment Card boundary violations before it leaves your perimeter. A response containing an unredacted leak is never issued a signed certificate.

    • PII and PHI leakage
      • Customer records, SSNs, and card numbers — including split-token patterns.
      • Protected health information in regulated flows.
      • Cross-tenant data that would cross your Alignment Card's boundary.
    • Secrets and credentials
      • API keys, access tokens, and signed URLs.
      • Internal connection strings and infrastructure identifiers.
      • Bearer tokens echoed back in error traces.
    • Alignment Card violations
      • Responses that exceed the agent's declared scope.
      • Actions that skip the card's escalation contract.
      • Outputs that contradict compliance obligations on the card.
    • Regulated advice
      • Legal, medical, or financial guidance outside the agent's license.
      • Recommendations that bypass human-in-the-loop clauses.
      • Decisions implicating EU AI Act Article 50 transparency obligations.
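
    As a rough illustration of the outbound checks listed above, the sketch below scans a response for card numbers (including numbers split by spaces or hyphens, which is only one simple reading of "split-token"), US SSNs, and echoed bearer tokens, then redacts them. The regular expressions and the `scan_outbound` and `redact` helpers are assumptions made for this example; they are not the CBD detectors, which the page does not describe.

```python
import re

# Candidate card numbers: 13 to 19 digits, optionally split by spaces or hyphens.
CARD_RE = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")
# US Social Security number in its common dashed form.
SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
# A bearer token echoed back, for example inside an error trace.
BEARER_RE = re.compile(r"\bBearer\s+[A-Za-z0-9._~+/-]{16,}=*")


def luhn_valid(digits):
    """Luhn checksum, used to filter out digit runs that are not card numbers."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def scan_outbound(text):
    """Return a list of (label, (start, end)) findings, ordered by position."""
    findings = []
    for m in CARD_RE.finditer(text):
        digits = re.sub(r"[ -]", "", m.group())
        if luhn_valid(digits):
            findings.append(("card_number", m.span()))
    for m in SSN_RE.finditer(text):
        findings.append(("ssn", m.span()))
    for m in BEARER_RE.finditer(text):
        findings.append(("bearer_token", m.span()))
    return sorted(findings, key=lambda f: f[1])


def redact(text):
    """Replace each finding with a labelled placeholder, working right to left."""
    out = text
    for label, (start, end) in reversed(scan_outbound(text)):
        out = out[:start] + f"[REDACTED:{label}]" + out[end:]
    return out
```

    Pattern matching of this kind only covers well-formed identifiers; checks such as scope or regulated-advice violations need policy context from the Alignment Card and are out of reach for regular expressions alone.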

    One perimeter. Signed verdicts in both directions.

    CFD on the way in. Runtime AIP in the middle. CBD on the way out. Every verdict is Ed25519-signed and hash-chained.

    • Inbound: CFD
      • Screens every inbound message before the agent processes it.
    • Runtime: AIP runtime
      • Bound to the Alignment Card; produces the signed proof chain.
    • Outbound: CBD
      • Screens every outbound response before it leaves your perimeter.
    Every verdict is Ed25519-signed, hash-chained, and exportable as evidence — not just observed.
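
    To make "signed and hash-chained" concrete, here is a minimal sketch of an append-only verdict log in which each record commits to the hash of the one before it. The record layout and helper names are hypothetical. Python's standard library has no Ed25519 implementation, so the demo signer is an HMAC-SHA256 stand-in, which is symmetric and therefore not equivalent to Ed25519; a real deployment would pass an Ed25519 signer and verifier through the same `sign` and `verify` parameters.

```python
import hashlib
import hmac
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first record


def record_digest(body):
    """Hash a canonical JSON encoding so key order cannot change the digest."""
    encoded = json.dumps(body, sort_keys=True, separators=(",", ":")).encode()
    return hashlib.sha256(encoded).hexdigest()


def append_verdict(chain, verdict, sign):
    """Append a verdict that commits to the previous record's hash."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    body = {"index": len(chain), "prev_hash": prev_hash, "verdict": verdict}
    digest = record_digest(body)
    chain.append({**body, "hash": digest, "signature": sign(digest.encode())})
    return chain[-1]


def verify_chain(chain, verify):
    """Recompute every link; any edit, reorder, or removal breaks the chain."""
    prev_hash = GENESIS
    for index, record in enumerate(chain):
        body = {
            "index": record["index"],
            "prev_hash": record["prev_hash"],
            "verdict": record["verdict"],
        }
        if record["index"] != index or record["prev_hash"] != prev_hash:
            return False
        if record_digest(body) != record["hash"]:
            return False
        if not verify(record["hash"].encode(), record["signature"]):
            return False
        prev_hash = record["hash"]
    return True


# Stand-in signer for the demo only (HMAC, NOT Ed25519).
DEMO_KEY = b"demo-key-not-for-production"


def demo_sign(message):
    return hmac.new(DEMO_KEY, message, hashlib.sha256).hexdigest()


def demo_verify(message, signature):
    return hmac.compare_digest(demo_sign(message), signature)
```

    Because each record embeds the previous hash, altering an early verdict invalidates every record after it, which is what lets an exported chain serve as evidence rather than as a plain log.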

    Built for regulated deployments.

    The Safe House runs inside your compliance envelope. Verdicts, attestations, and audit bundles are designed to map to the standards your regulator already asks about.

    • SOC 2 readiness
    • EU AI Act Article 50
    • HIPAA-compatible flows
    • Ed25519 signed verdicts

    A perimeter that is accountable, not just observed.

    The Safe House is not a logging product. It's a signed, enforceable boundary around every agent. Pair it with an Alignment Card and you can prove what crossed and what didn't.
