Mnemom Trust Rating™

    The Open Formula

    Credit scores hide their math. We publish ours.

    Mnemom Trust Ratings™ are a 0–1000 composite derived from five independently measured components. Every weight, every input, every calculation is visible and verifiable. The worked example below shows exactly how the pieces fit together.

    The Formula

    S = 0.40·(Integrity Ratio) + 0.20·(Compliance) + 0.20·(Drift Stability) + 0.10·(Trace Completeness) + 0.10·(Coherence Compatibility)

    Worked example:

    Component                 Weight   Score   Contribution
    Integrity Ratio           40%      820     +328
    Compliance                20%      750     +150
    Drift Stability           20%      880     +176
    Trace Completeness        10%      650     +65
    Coherence Compatibility   10%      700     +70

    S = 0.40(820) + 0.20(750) + 0.20(880) + 0.10(650) + 0.10(700) = 789 (Grade A, Reliable)
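    The composite formula above can be sketched in a few lines. This is a minimal sketch assuming component scores on the 0–1000 scale shown in the worked example; the function name and dictionary layout are illustrative, not part of any Mnemom API.

```python
# Published weights for the five components (from the formula above).
WEIGHTS = {
    "integrity_ratio": 0.40,
    "compliance": 0.20,
    "drift_stability": 0.20,
    "trace_completeness": 0.10,
    "coherence_compatibility": 0.10,
}

def composite_score(components: dict[str, float]) -> int:
    """Weighted sum of the five 0-1000 component scores."""
    return round(sum(WEIGHTS[name] * components[name] for name in WEIGHTS))

score = composite_score({
    "integrity_ratio": 820,
    "compliance": 750,
    "drift_stability": 880,
    "trace_completeness": 650,
    "coherence_compatibility": 700,
})
# 0.40*820 + 0.20*750 + 0.20*880 + 0.10*650 + 0.10*700 = 789
```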

    Five Components

    Each component measures a different dimension of trustworthiness. For each one, the sections below explain exactly what it measures, how it's calculated, and why it's weighted the way it is.

    Integrity Ratio (40%)

    What it measures

    The percentage of real-time thinking analysis checkpoints that pass alignment verification. Every time an AI agent reasons through a decision, the Integrity Protocol evaluates whether that reasoning aligns with its stated values.

    How it's calculated

    Each checkpoint runs the agent's thinking block through independent analysis. Pass/fail verdicts are cryptographically signed and hash-chained. The ratio is simply: passes / total checkpoints.
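    As a minimal sketch of the ratio, assuming each checkpoint carries a boolean pass/fail verdict and that the raw ratio is scaled to the 0–1000 component range used in the composite formula (the scaling and data shape are assumptions for illustration, not the protocol's wire format):

```python
def integrity_ratio(verdicts: list[bool]) -> float:
    """passes / total checkpoints, scaled to the 0-1000 component range.

    Returns 0.0 for an empty checkpoint list (an assumption; unrated
    agents are handled separately by the 50-checkpoint minimum).
    """
    if not verdicts:
        return 0.0
    return 1000 * sum(verdicts) / len(verdicts)

# 41 passing checkpoints out of 50 -> 820.0
integrity_ratio([True] * 41 + [False] * 9)
```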

    Why this weight

    At 40% weight, this is the single strongest signal. An agent that consistently passes integrity checks under real conditions — not synthetic benchmarks — demonstrates genuine alignment. This is the closest thing to "does this agent actually do what it says it does?"

    Compliance (20%)

    What it measures

    How well this agent stays within its declared boundaries. A high compliance score means few or no recent violations. A low score means frequent or recent boundary violations are dragging trust down.

    How it's calculated

    Violations are grouped by session — only the highest decayed impact per session counts (capped at 1.0). Each violation's impact decays as 2^(-age_hours/168), giving a 1-week half-life. The score is 1000 / (1 + total_session_impact)^1.5. Zero violations = 1000 (perfect compliance). More violation sessions, especially recent ones, push the score down — but a single bad session can't crater the score the way five independent violations would.
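    The session-capped decay described above can be sketched directly from the formula. The violation record layout and field names are assumptions for illustration; the decay constant, per-session cap, and power curve are taken from the text.

```python
HALF_LIFE_HOURS = 168  # 1-week half-life, per the formula above

def compliance_score(violations: list[dict]) -> float:
    """violations: [{"session": str, "impact": float, "age_hours": float}]

    Only the highest decayed impact per session counts (capped at 1.0);
    the score is 1000 / (1 + total_session_impact)^1.5.
    """
    worst_per_session: dict[str, float] = {}
    for v in violations:
        # Exponential decay: impact halves every 168 hours, capped at 1.0.
        decayed = min(v["impact"] * 2 ** (-v["age_hours"] / HALF_LIFE_HOURS), 1.0)
        sid = v["session"]
        worst_per_session[sid] = max(worst_per_session.get(sid, 0.0), decayed)
    total = sum(worst_per_session.values())
    return 1000 / (1 + total) ** 1.5

compliance_score([])  # zero violations = 1000.0 (perfect compliance)
```

    Note how five violations in one session collapse to a single session impact, while five violations across five sessions each add to the total.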

    Why this weight

    At 20% weight, compliance captures trajectory without letting one bad session be catastrophic. Session capping means a busted alignment card producing 5 false positives in 16 minutes counts as one event, not five. The power curve degrades meaningfully but preserves signal differentiation — an agent with real, repeated issues across many sessions scores very differently from one unlucky session.

    Drift Stability (20%)

    What it measures

    The ratio of operational sessions where the agent maintained consistent behavior without sustained behavioral drift. Drift means the agent's actual behavior diverged from its expected behavioral baseline.

    How it's calculated

    The Drift Detection system monitors behavioral patterns across sessions. A session with sustained drift (not momentary fluctuation — the system distinguishes) counts against this score. The ratio is: stable sessions / total sessions.
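    A minimal sketch of the ratio, assuming each session is tagged by the drift detector with a boolean "sustained drift" flag, and assuming the same 0–1000 scaling as the other components:

```python
def drift_stability(session_drifted: list[bool]) -> float:
    """stable sessions / total sessions, scaled to 0-1000.

    session_drifted: True where sustained drift was detected.
    Returns 0.0 for no sessions (an assumption for illustration).
    """
    if not session_drifted:
        return 0.0
    stable = sum(1 for drifted in session_drifted if not drifted)
    return 1000 * stable / len(session_drifted)

# 22 stable sessions out of 25 -> 880.0
drift_stability([False] * 22 + [True] * 3)
```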

    Why this weight

    At 20% weight, stability matters because alignment isn't a one-time check — it's a continuous property. An agent might pass individual integrity checks but still gradually shift its behavior in concerning ways. Drift stability catches what point-in-time checks miss.

    Trace Completeness (10%)

    What it measures

    A measure of audit trail quality — whether the agent is logging its decisions through the Accountability Protocol. Complete traces mean every significant decision has a verifiable record.

    How it's calculated

    The Accountability Protocol (AAP) expects trace entries for decisions, tool calls, and state transitions. Completeness is the ratio of actual trace entries to expected entries based on the agent's activity pattern.

    Why this weight

    At 10% weight, this is a hygiene factor. An agent with a perfect integrity score but incomplete traces raises questions — if you're not hiding anything, why aren't you logging? Incomplete traces don't prove wrongdoing, but complete traces prove transparency.

    Coherence Compatibility (10%)

    What it measures

    How well this agent's values and behavior align with other agents it works alongside in multi-agent systems. Measured through the Fleet Coherence engine's pairwise compatibility analysis.

    How it's calculated

    When agents operate in fleets, the Coherence engine evaluates pairwise value alignment, conflict patterns, and resolution behaviors. The score reflects this agent's track record of productive multi-agent collaboration.

    Why this weight

    At 10% weight, this matters because agents increasingly work together. An individually trustworthy agent that consistently causes conflicts in multi-agent settings is a different risk profile than one that collaborates well. This is the "plays well with others" signal.

    Grade Scale

    Bond-rating inspired. Seven grades from AAA (Exemplary) to CCC (Critical), plus NR for agents that haven't yet met the 50-checkpoint minimum.

    AAA   Exemplary     900–1000
    AA    Established   800–899
    A     Reliable      700–799
    BBB   Developing    600–699
    BB    Emerging      500–599
    B     Concerning    400–499
    CCC   Critical      200–399
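    The grade bands above can be expressed as a simple lookup. This is a sketch: the function name is illustrative, and mapping scores below 200 to CCC is an assumption, since the table's lowest band starts at 200.

```python
# Grade floors taken from the scale above, highest first.
GRADES = [
    (900, "AAA", "Exemplary"),
    (800, "AA", "Established"),
    (700, "A", "Reliable"),
    (600, "BBB", "Developing"),
    (500, "BB", "Emerging"),
    (400, "B", "Concerning"),
    (0, "CCC", "Critical"),  # assumption: scores below 200 also read CCC
]

def grade(score: float, checkpoints: int) -> tuple[str, str]:
    """Map a composite score to its grade; NR below the 50-checkpoint minimum."""
    if checkpoints < 50:
        return ("NR", "Not Rated")
    for floor, letter, label in GRADES:
        if score >= floor:
            return (letter, label)
    return ("CCC", "Critical")

grade(789, 1200)  # -> ("A", "Reliable")
```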

    Anti-Gaming Safeguards

    Transparency doesn't mean exploitability. The scoring system includes several safeguards that make it resistant to manipulation.

    Protocol-Verified Only

    Only checkpoints generated through the Integrity Protocol's cryptographic pipeline are counted. You can't inject synthetic checkpoints — each one is Ed25519 signed and hash-chained.

    Signature verification + chain hash validation
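    The chain-hash half of that validation can be sketched as follows. The checkpoint layout and the SHA-256 linking rule are assumptions for illustration; Ed25519 signature verification (e.g. via a library such as PyNaCl) is elided here.

```python
import hashlib

def chain_valid(checkpoints: list[dict]) -> bool:
    """Verify each checkpoint links to its predecessor's hash.

    checkpoints: [{"payload": bytes, "prev_hash": str, "hash": str}],
    in chain order. A tampered payload or broken link fails validation.
    """
    prev = "0" * 64  # assumed genesis value
    for cp in checkpoints:
        if cp["prev_hash"] != prev:
            return False  # link to predecessor broken
        expected = hashlib.sha256(cp["prev_hash"].encode() + cp["payload"]).hexdigest()
        if cp["hash"] != expected:
            return False  # payload or hash tampered
        prev = cp["hash"]
    return True
```

    Because each hash covers the previous hash, rewriting any historical checkpoint invalidates every checkpoint after it.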

    Session-Capped Decay

    Compliance scoring groups violations by session and takes only the worst per session, then applies a 1-week half-life. This prevents a single bad session from compounding unfairly while still penalizing repeated violations across sessions.

    score = 1000 / (1 + Σ max_impact_per_session)^1.5, decay half-life = 7 days

    Minimum Checkpoint Threshold

    Agents need at least 50 analyzed checkpoints before receiving a public score. This prevents flash-in-the-pan agents from gaming a high score on minimal data.

    50 checkpoints minimum for NR → rated

    Drift vs. Point Checks

    The system measures both point-in-time integrity and sustained behavioral patterns. An agent that passes individual checks but drifts over time will still see its score reflect that instability.

    Integrity ratio + drift stability = 60% combined

    Why Open Scoring

    Trust requires transparency

    You can't build trust infrastructure on a black box. If we ask you to trust our scores, you should be able to verify exactly how they're computed.

    Accountability goes both ways

    We hold agents accountable to alignment standards. Publishing the methodology holds us accountable to fairness. If our weights are wrong, you can tell us.

    Better signals, not secrets

    Security through obscurity doesn't work for scoring systems — it just breeds suspicion. Our anti-gaming defenses come from cryptographic verification, not hidden formulas.


    Featured on There's An AI For That