Provenance Firewall

Consider this loop:

  1. Agent ingests a claim: user.city = "Berlin".
  2. Agent reads its memory and receives "Berlin" back.
  3. Agent re-ingests what it read as if it were new evidence: "Berlin" again, with the same or higher confidence.
  4. The corroboration count increments. Currency rises. Confidence inflates.
  5. Repeat for every recall turn.

After 808 turns (the real-world mem0 issue #4573 case), the agent has 808 copies of the same claim, each with inflated corroboration. A single stale or wrong belief becomes the most “confirmed” fact in the store — because it was recalled the most, not because it was verified externally.
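
The loop is easy to reproduce in a toy store that has no notion of provenance. The sketch below is not mempill or mem0 code; `NaiveBelief` and `naive_ingest` are hypothetical names used only to show how one external assertion turns into 808 "confirmations".

```python
from dataclasses import dataclass


@dataclass
class NaiveBelief:
    """A belief in a toy store with no provenance tracking (hypothetical)."""
    value: str
    copies: int = 0          # how many times the claim was written
    corroborations: int = 0  # how many times it was counted as confirmed


def naive_ingest(store: dict, key: tuple, value: str) -> None:
    """Treat every write as fresh evidence, wherever it came from."""
    belief = store.setdefault(key, NaiveBelief(value))
    belief.copies += 1
    belief.corroborations += 1


store: dict = {}
key = ("user", "city")

# Turn 1: the only real external evidence.
naive_ingest(store, key, "Berlin")

# Turns 2..808: the agent recalls the value and re-ingests its own recall.
for _ in range(807):
    recalled = store[key].value
    naive_ingest(store, key, recalled)

print(store[key].corroborations)  # 808, all traceable to one assertion
```

Nothing in the toy store distinguishes the first write from the 807 that follow, which is exactly the gap the provenance label is meant to close.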

mempill prevents this with two mechanisms: a typed provenance label (I4) that is immutable after commit, and an amplification guard (C6) that detects re-entry and collapses it to a single idempotent corroboration.

Every claim in mempill carries a ProvenanceLabel assigned at injection time. The label is immutable — no operation can change it after the claim is committed (invariant I4). The Rust type system enforces this: there is no set_provenance() method.
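
As a rough Python analogy for the "no setter" guarantee, a frozen dataclass rejects attribute assignment after construction. This is only an illustration: `FrozenClaim` is a hypothetical type, not part of mempill, and Python enforces the restriction at runtime rather than at compile time as Rust would.

```python
from dataclasses import dataclass, FrozenInstanceError


@dataclass(frozen=True)
class FrozenClaim:
    """Hypothetical claim record whose fields are fixed at construction."""
    subject: str
    predicate: str
    value: str
    provenance: str  # e.g. "RecallReEntry"


frozen_claim = FrozenClaim("user", "city", "Berlin", "RecallReEntry")

try:
    # Attempt to relabel recalled content as external evidence.
    frozen_claim.provenance = "External"
    relabel_blocked = False
except FrozenInstanceError:
    relabel_blocked = True

print(relabel_blocked)  # True
```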

The three channels, with External split into two kinds, are (source: mempill-types/src/provenance.rs, mempill-python/python/mempill/types.py):

| Channel | Wire shape | Meaning | Gate routing |
| --- | --- | --- | --- |
| External(UserAsserted) | {"type": "External", "kind": "UserAsserted"} | First-hand human assertion. The user is acting as an oracle. | Cheap path eligible |
| External(ExternalFirstHand) | {"type": "External", "kind": "ExternalFirstHand"} | First-hand external evidence: tool result, system-of-record output, sensor reading, oracle verdict. | Cheap path eligible |
| RecallReEntry | {"type": "RecallReEntry"} | Content the engine previously served, re-entering the write path. | Caught by C6; corroborates by identity; never becomes ground truth |
| ModelDerived | {"type": "ModelDerived"} | Model-emitted or inferred content. Mandatory default for model output. | Down-weighted; cannot overturn External |

In Python:

from mempill.types import ProvenanceLabel
# First-hand user assertion (human as oracle):
prov = ProvenanceLabel.external_user_asserted()
# → {"type": "External", "kind": "UserAsserted"}
# Tool result, system-of-record, sensor:
prov = ProvenanceLabel.external_first_hand()
# → {"type": "External", "kind": "ExternalFirstHand"}
# Engine output re-entering the write path (recall loop):
prov = ProvenanceLabel.recall_re_entry()
# → {"type": "RecallReEntry"}
# Model output (always use this for LLM-generated content):
prov = ProvenanceLabel.model_derived()
# → {"type": "ModelDerived"}
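
To make the "Gate routing" column concrete, here is a minimal sketch of a dispatcher over the wire shapes shown above. The function `route_for` and the route strings it returns are assumptions made for illustration; mempill's actual gate is not shown in this document.

```python
def route_for(label: dict) -> str:
    """Map a provenance wire shape to a routing decision (hypothetical names)."""
    channel = label.get("type")
    if channel == "External":
        # Both External kinds are eligible for the cheap path.
        if label.get("kind") not in ("UserAsserted", "ExternalFirstHand"):
            raise ValueError(f"unknown External kind: {label.get('kind')!r}")
        return "cheap_path_eligible"
    if channel == "RecallReEntry":
        # Handed to the amplification guard; never becomes ground truth.
        return "amplification_guard"
    if channel == "ModelDerived":
        # Down-weighted; cannot overturn External beliefs.
        return "down_weighted"
    raise ValueError(f"unknown provenance channel: {channel!r}")


print(route_for({"type": "External", "kind": "UserAsserted"}))  # cheap_path_eligible
print(route_for({"type": "RecallReEntry"}))                     # amplification_guard
print(route_for({"type": "ModelDerived"}))                      # down_weighted
```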

The AmplificationGuard is the firewall component (C6) that detects recall re-entry. When a claim with RecallReEntry provenance arrives:

  1. C6 computes the identity key: (subject, predicate, value, external_anchor_ref).
  2. C6 looks up the existing claim by identity.
  3. If found: corroborate by identity — return the existing ClaimRef, increment currency if the new entry’s provenance is independent. No new claim is created.
  4. If not found: treat as novel. (This handles the case where the recalled claim was not previously stored under the same identity.)

The identity collapse means 808 identical recall re-entries produce exactly one claim with a single corroboration record — not 808 claims with inflated corroboration. The mem0 #4573 amplification loop cannot happen.
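
The four steps can be sketched with a small in-memory model. `ToyAmplificationGuard` is a hypothetical stand-in, not the real C6: it keys claims by the identity tuple and stores corroboration records in a set, so that repeated recall re-entries add nothing after the first. How a non-recall duplicate is corroborated is left out of the sketch.

```python
class ToyAmplificationGuard:
    """Toy model of identity collapse (hypothetical; not mempill's C6)."""

    def __init__(self) -> None:
        self.claims_by_identity = {}     # identity tuple -> claim_ref
        self.corroboration_records = {}  # claim_ref -> set of provenance tags

    def ingest(self, subject, predicate, value, provenance,
               external_anchor_ref=None) -> int:
        # Step 1: compute the identity key.
        identity = (subject, predicate, value, external_anchor_ref)
        # Step 2: look up an existing claim by identity.
        claim_ref = self.claims_by_identity.get(identity)
        if claim_ref is None:
            # Step 4: not found, so treat the claim as novel.
            claim_ref = len(self.claims_by_identity) + 1
            self.claims_by_identity[identity] = claim_ref
            self.corroboration_records[claim_ref] = set()
            return claim_ref
        # Step 3: found, so corroborate by identity. Adding to a set is
        # idempotent, which is why N re-entries leave one record.
        if provenance == "RecallReEntry":
            self.corroboration_records[claim_ref].add("RecallReEntry")
        return claim_ref


guard = ToyAmplificationGuard()
first_ref = guard.ingest("user", "city", "Berlin", "External")
recall_refs = {
    guard.ingest("user", "city", "Berlin", "RecallReEntry") for _ in range(808)
}

print(recall_refs == {first_ref})                   # True
print(len(guard.claims_by_identity))                # 1
print(len(guard.corroboration_records[first_ref]))  # 1
```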

C6 also detects burst patterns: a rapid sequence of claims from the same source that exceeds the configured quarantine_burst_threshold (default: 10 identical claims in one batch). Claims matching the burst signature are routed to Quarantined: they remain auditable in the ledger but are not committed to the active belief.

Burst quarantine is the defense against loop malfunctions: if an agent enters a tight loop ingesting the same claim repeatedly, the firewall parks the claims rather than corrupting the store.
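
A minimal sketch of burst detection within one batch follows. It assumes that "exceeds" means strictly more than the threshold and that every claim matching the burst signature is quarantined; `split_batch` and the tuple layout are hypothetical, and only the default of 10 comes from the text above.

```python
from collections import Counter

QUARANTINE_BURST_THRESHOLD = 10  # default quoted in the text


def split_batch(batch, threshold=QUARANTINE_BURST_THRESHOLD):
    """Split a batch into (accepted, quarantined) by repetition count.

    Each claim is a hashable tuple: (source, subject, predicate, value).
    """
    counts = Counter(batch)
    accepted = [claim for claim in batch if counts[claim] <= threshold]
    quarantined = [claim for claim in batch if counts[claim] > threshold]
    return accepted, quarantined


looping_claim = ("my-agent", "user", "city", "Berlin")
normal_claim = ("my-agent", "user", "language", "German")

# 11 identical claims exceed the default threshold of 10.
batch = [looping_claim] * 11 + [normal_claim]
accepted, quarantined = split_batch(batch)

print(len(accepted))     # 1
print(len(quarantined))  # 11
```

Quarantined claims are returned rather than dropped, mirroring the point that they stay auditable even though they do not reach the active belief.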

Derivation-depth cap (provenance-laundering containment)

Every claim carries a derivation_depth — its distance in inference hops from the nearest first-hand external anchor. This field is set at injection time and is part of the immutable claim record.

Provenance laundering is the harder amplification case: a model infers or paraphrases from a recalled falsehood, producing a syntactically new claim that looks like fresh evidence. The amplification guard cannot detect this by content alone, because the paraphrase is not byte-identical to the original.

The containment strategy:

  1. Injection-time provenance tagging: content is tagged RecallReEntry when it entails an earlier injection. The tag is immutable.
  2. Derivation-depth cap: claims with derivation_depth > config.derivation_depth_cap_for_currency_boost (default: 3) are ineligible for currency boosts and cannot overturn external beliefs. Ungrounded derived chains self-limit.
  3. Audit trail: the full lineage graph (claim_edges) and ledger of gate decisions are retained. Operator inspection can trace the propagation of a recalled falsehood.
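
The depth cap in step 2 can be sketched as follows. The rule used to compute depth (one more than the shallowest parent, with a parentless claim assumed to be a first-hand external anchor at depth 0) is an assumption for illustration, as are the function names; only the default cap of 3 comes from the text.

```python
DERIVATION_DEPTH_CAP_FOR_CURRENCY_BOOST = 3  # default quoted in the text


def derivation_depth(parent_depths: list) -> int:
    """Hops to the nearest first-hand external anchor (assumed rule).

    This sketch only models chains rooted in an external claim, so a
    claim with no parents is treated as the anchor itself.
    """
    if not parent_depths:
        return 0
    return 1 + min(parent_depths)


def eligible_for_currency_boost(
    depth: int, cap: int = DERIVATION_DEPTH_CAP_FOR_CURRENCY_BOOST
) -> bool:
    """Claims deeper than the cap get no currency boost."""
    return depth <= cap


# One external claim followed by a chain of five successive inferences.
depths = [derivation_depth([])]
for _ in range(5):
    depths.append(derivation_depth([depths[-1]]))

eligibility = [eligible_for_currency_boost(d) for d in depths]

print(depths)       # [0, 1, 2, 3, 4, 5]
print(eligibility)  # [True, True, True, True, False, False]
```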

The provenance firewall does not solve the stable-but-wrong problem. A belief can be:

  • Internally consistent (no contradiction detected)
  • External-anchored (tagged External(UserAsserted))
  • Uncontested (no other claim contradicts it)
  • Yet factually wrong

The system cannot detect this automatically. Correctness pressure must come from outside: a human oracle explicitly asserting a corrective claim, a tool-based verification, or a sensor contradiction. The provenance firewall limits amplification; it cannot manufacture truth from a stochastic source.

import mempill
from mempill.types import ProvenanceLabel

engine = mempill.open_in_memory()

# Ingest a first-hand human assertion.
resp = engine.ingest_claim({
    "agent_id": "my-agent",
    "subject": "user",
    "predicate": "city",
    "value": "Berlin",
    "provenance": ProvenanceLabel.external_user_asserted(),
    "cardinality": "Functional",
    "confidence": {"value_confidence": 1.0, "valid_time_confidence": 0.0},
    "criticality": "Medium",
    "derived_from": [],
})
print(resp["disposition"])  # CommittedCheap

# Simulate a recall re-entry (agent reads its memory and re-ingests it).
resp2 = engine.ingest_claim({
    "agent_id": "my-agent",
    "subject": "user",
    "predicate": "city",
    "value": "Berlin",
    "provenance": ProvenanceLabel.recall_re_entry(),  # ← correct label for recall
    "cardinality": "Functional",
    "confidence": {"value_confidence": 1.0, "valid_time_confidence": 0.0},
    "criticality": "Medium",
    "derived_from": [],
})

# The amplification guard collapses this to a corroboration of the existing claim.
# No new claim is created; the existing claim_ref is returned.
print(resp2["claim_ref"] == resp["claim_ref"])  # True

  • I4 — Provenance immutable: set at injection time; no operation can rewrite it. The Rust type system has no set_provenance() method.
  • I5 — Stochastic proposes, never commits: ModelDerived claims are down-weighted and cannot overturn external beliefs.
  • I6 — Idempotent append: RecallReEntry is idempotent; N re-entries = 1 corroboration record.