Mnemom Trust Rating™

    The Open Formula

    Credit scores hide their math. We publish ours.

    Mnemom Trust Ratings™ are a 0–1000 composite derived from five independently measured components. Every weight, every input, every calculation is visible and verifiable. The worked example below shows exactly how the pieces fit together.

    The Formula

    S = 0.40·(Integrity Ratio) + 0.20·(Compliance) + 0.20·(Drift Stability) + 0.10·(Trace Completeness) + 0.10·(Coherence Compatibility)

    Worked example:

    Component                 Weight   Score   Contribution
    Integrity Ratio           40%      820     +328
    Compliance                20%      750     +150
    Drift Stability           20%      880     +176
    Trace Completeness        10%      650     +65
    Coherence Compatibility   10%      700     +70

    S = 0.40(820) + 0.20(750) + 0.20(880) + 0.10(650) + 0.10(700) = 789 (Grade A, Reliable)
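    The composite formula above can be sketched in a few lines. This is a minimal sketch assuming component scores on the 0–1000 scale shown in the worked example; the function name and dictionary layout are illustrative, not part of any Mnemom API.

```python
# Published weights for the five components (from the formula above).
WEIGHTS = {
    "integrity_ratio": 0.40,
    "compliance": 0.20,
    "drift_stability": 0.20,
    "trace_completeness": 0.10,
    "coherence_compatibility": 0.10,
}

def composite_score(components: dict[str, float]) -> int:
    """Weighted sum of the five 0-1000 component scores."""
    return round(sum(WEIGHTS[name] * components[name] for name in WEIGHTS))

score = composite_score({
    "integrity_ratio": 820,
    "compliance": 750,
    "drift_stability": 880,
    "trace_completeness": 650,
    "coherence_compatibility": 700,
})
# 0.40*820 + 0.20*750 + 0.20*880 + 0.10*650 + 0.10*700 = 789
```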

    Five Components

    Each component measures a different dimension of trustworthiness. For each one, the sections below explain exactly what it measures, how it's calculated, and why it's weighted the way it is.

    Integrity Ratio (40%)

    What it measures

    The percentage of real-time thinking analysis checkpoints that pass alignment verification. Every time an AI agent reasons through a decision, the Integrity Protocol evaluates whether that reasoning aligns with its stated values.

    How it's calculated

    Each checkpoint runs the agent's thinking block through independent analysis. Pass/fail verdicts are cryptographically signed and hash-chained. The ratio is simply: passes / total checkpoints.
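    As a minimal sketch of the ratio, assuming each checkpoint carries a boolean pass/fail verdict and that the raw ratio is scaled to the 0–1000 component range used in the composite formula (the scaling and data shape are assumptions for illustration, not the protocol's wire format):

```python
def integrity_ratio(verdicts: list[bool]) -> float:
    """passes / total checkpoints, scaled to the 0-1000 component range.

    Returns 0.0 for an empty checkpoint list (an assumption; unrated
    agents are handled separately by the 50-checkpoint minimum).
    """
    if not verdicts:
        return 0.0
    return 1000 * sum(verdicts) / len(verdicts)

# 41 passing checkpoints out of 50 -> 820.0
integrity_ratio([True] * 41 + [False] * 9)
```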

    Why this weight

    At 40% weight, this is the single strongest signal. An agent that consistently passes integrity checks under real conditions — not synthetic benchmarks — demonstrates genuine alignment. This is the closest thing to "does this agent actually do what it says it does?"

    Compliance (20%)

    What it measures

    How well this agent stays within its declared boundaries. A high compliance score means few or no recent violations. A low score means frequent or recent boundary violations are dragging trust down.

    How it's calculated

    Violations are grouped by session — only the highest decayed impact per session counts (capped at 1.0). Each violation's impact decays as 2^(-age_hours/168), giving a 1-week half-life. The score is 1000 / (1 + total_session_impact)^1.5. Zero violations = 1000 (perfect compliance). More violation sessions, especially recent ones, push the score down — but a single bad session can't crater the score the way five independent violations would.
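    The session-capped decay described above can be sketched directly from the formula. The violation record layout and field names are assumptions for illustration; the decay constant, per-session cap, and power curve are taken from the text.

```python
HALF_LIFE_HOURS = 168  # 1-week half-life, per the formula above

def compliance_score(violations: list[dict]) -> float:
    """violations: [{"session": str, "impact": float, "age_hours": float}]

    Only the highest decayed impact per session counts (capped at 1.0);
    the score is 1000 / (1 + total_session_impact)^1.5.
    """
    worst_per_session: dict[str, float] = {}
    for v in violations:
        # Exponential decay: impact halves every 168 hours, capped at 1.0.
        decayed = min(v["impact"] * 2 ** (-v["age_hours"] / HALF_LIFE_HOURS), 1.0)
        sid = v["session"]
        worst_per_session[sid] = max(worst_per_session.get(sid, 0.0), decayed)
    total = sum(worst_per_session.values())
    return 1000 / (1 + total) ** 1.5

compliance_score([])  # zero violations = 1000.0 (perfect compliance)
```

    Note how five violations in one session collapse to a single session impact, while five violations across five sessions each add to the total.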

    Why this weight

    At 20% weight, compliance captures trajectory without letting one bad session be catastrophic. Session capping means a busted alignment card producing 5 false positives in 16 minutes counts as one event, not five. The power curve degrades meaningfully but preserves signal differentiation — an agent with real, repeated issues across many sessions scores very differently from one unlucky session.

    Drift Stability (20%)

    What it measures

    The ratio of operational sessions where the agent maintained consistent behavior without sustained behavioral drift. Drift means the agent's actual behavior diverged from its expected behavioral baseline.

    How it's calculated

    The Drift Detection system monitors behavioral patterns across sessions. A session with sustained drift (not momentary fluctuation — the system distinguishes) counts against this score. The ratio is: stable sessions / total sessions.
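    A minimal sketch of the ratio, assuming each session is tagged by the drift detector with a boolean "sustained drift" flag, and assuming the same 0–1000 scaling as the other components:

```python
def drift_stability(session_drifted: list[bool]) -> float:
    """stable sessions / total sessions, scaled to 0-1000.

    session_drifted: True where sustained drift was detected.
    Returns 0.0 for no sessions (an assumption for illustration).
    """
    if not session_drifted:
        return 0.0
    stable = sum(1 for drifted in session_drifted if not drifted)
    return 1000 * stable / len(session_drifted)

# 22 stable sessions out of 25 -> 880.0
drift_stability([False] * 22 + [True] * 3)
```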

    Why this weight

    At 20% weight, stability matters because alignment isn't a one-time check — it's a continuous property. An agent might pass individual integrity checks but still gradually shift its behavior in concerning ways. Drift stability catches what point-in-time checks miss.

    Trace Completeness (10%)

    What it measures

    A measure of audit trail quality — whether the agent is logging its decisions through the Accountability Protocol. Complete traces mean every significant decision has a verifiable record.

    How it's calculated

    The Accountability Protocol (AAP) expects trace entries for decisions, tool calls, and state transitions. Completeness is the ratio of actual trace entries to expected entries based on the agent's activity pattern.

    Why this weight

    At 10% weight, this is a hygiene factor. An agent with a perfect integrity score but incomplete traces raises questions — if you're not hiding anything, why aren't you logging? Incomplete traces don't prove wrongdoing, but complete traces prove transparency.

    Coherence Compatibility (10%)

    What it measures

    How well this agent's values and behavior align with other agents it works alongside in multi-agent systems. Measured through the Fleet Coherence engine's pairwise compatibility analysis.

    How it's calculated

    When agents operate in fleets, the Coherence engine evaluates pairwise value alignment, conflict patterns, and resolution behaviors. The score reflects this agent's track record of productive multi-agent collaboration.

    Why this weight

    At 10% weight, this matters because agents increasingly work together. An individually trustworthy agent that consistently causes conflicts in multi-agent settings is a different risk profile than one that collaborates well. This is the "plays well with others" signal.

    Grade Scale

    Bond-rating inspired. Seven grades from AAA (Exemplary) to CCC (Critical), plus NR for agents that haven't yet met the 50-checkpoint minimum.

    AAA   Exemplary     900–1000
    AA    Established   800–899
    A     Reliable      700–799
    BBB   Developing    600–699
    BB    Emerging      500–599
    B     Concerning    400–499
    CCC   Critical      200–399
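    The grade bands above can be expressed as a simple lookup. This is a sketch: the function name is illustrative, and mapping scores below 200 to CCC is an assumption, since the table's lowest band starts at 200.

```python
# Grade floors taken from the scale above, highest first.
GRADES = [
    (900, "AAA", "Exemplary"),
    (800, "AA", "Established"),
    (700, "A", "Reliable"),
    (600, "BBB", "Developing"),
    (500, "BB", "Emerging"),
    (400, "B", "Concerning"),
    (0, "CCC", "Critical"),  # assumption: scores below 200 also read CCC
]

def grade(score: float, checkpoints: int) -> tuple[str, str]:
    """Map a composite score to its grade; NR below the 50-checkpoint minimum."""
    if checkpoints < 50:
        return ("NR", "Not Rated")
    for floor, letter, label in GRADES:
        if score >= floor:
            return (letter, label)
    return ("CCC", "Critical")

grade(789, 1200)  # -> ("A", "Reliable")
```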

    Anti-Gaming Safeguards

    Transparency doesn't mean exploitability. The scoring system includes several safeguards that make it resistant to manipulation.

    Protocol-Verified Only

    Only checkpoints generated through the Integrity Protocol's cryptographic pipeline are counted. You can't inject synthetic checkpoints — each one is Ed25519 signed and hash-chained.

    Signature verification + chain hash validation
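    The chain-hash half of that validation can be sketched as follows. The checkpoint layout and the SHA-256 linking rule are assumptions for illustration; Ed25519 signature verification (e.g. via a library such as PyNaCl) is elided here.

```python
import hashlib

def chain_valid(checkpoints: list[dict]) -> bool:
    """Verify each checkpoint links to its predecessor's hash.

    checkpoints: [{"payload": bytes, "prev_hash": str, "hash": str}],
    in chain order. A tampered payload or broken link fails validation.
    """
    prev = "0" * 64  # assumed genesis value
    for cp in checkpoints:
        if cp["prev_hash"] != prev:
            return False  # link to predecessor broken
        expected = hashlib.sha256(cp["prev_hash"].encode() + cp["payload"]).hexdigest()
        if cp["hash"] != expected:
            return False  # payload or hash tampered
        prev = cp["hash"]
    return True
```

    Because each hash covers the previous hash, rewriting any historical checkpoint invalidates every checkpoint after it.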

    Session-Capped Decay

    Compliance scoring groups violations by session and takes only the worst per session, then applies a 1-week half-life. This prevents a single bad session from compounding unfairly while still penalizing repeated violations across sessions.

    score = 1000 / (1 + Σ max_impact_per_session)^1.5, decay half-life = 7 days

    Minimum Checkpoint Threshold

    Agents need at least 50 analyzed checkpoints before receiving a public score. This prevents flash-in-the-pan agents from gaming a high score on minimal data.

    50 checkpoints minimum for NR → rated

    Drift vs. Point Checks

    The system measures both point-in-time integrity and sustained behavioral patterns. An agent that passes individual checks but drifts over time will still see its score reflect that instability.

    Integrity ratio + drift stability = 60% combined

    Why Open Scoring

    Trust requires transparency

    You can't build trust infrastructure on a black box. If we ask you to trust our scores, you should be able to verify exactly how they're computed.

    Accountability goes both ways

    We hold agents accountable to alignment standards. Publishing the methodology holds us accountable to fairness. If our weights are wrong, you can tell us.

    Better signals, not secrets

    Security through obscurity doesn't work for scoring systems — it just breeds suspicion. Our anti-gaming defenses come from cryptographic verification, not hidden formulas.


    Featured on There's An AI For That