Temporal Validity Problem

The core problem

AI agents accumulate beliefs over time. The problem is not that agents store false data. It is that data which was true when stored can become false later, and the agent has no mechanism to detect or signal that transition.

Most memory systems choose one of two broken strategies:

Silent overwrite — the new value replaces the old one. History is destroyed. The agent cannot audit what it believed before, and cannot reason about when the change occurred.
Unconditional preserve — the old value is retained, but there is no flag or signal indicating that it might be stale. The agent reads it and acts on it as if it were current.

Neither approach is safe for long-running agents operating in a world where facts change.

A concrete example

t=0   Agent learns: user.city = "Berlin"   (source: user form, high confidence)
t=7d  User moves to Amsterdam — event not observed by the agent
t=8d  Agent asks: "where does the user live?"
      Memory returns: "Berlin"  ← wrong, but the agent has no way to know

The failure has nothing to do with the quality of the data at write time; the data was correct. The problem is the temporal validity gap: the claim’s valid period ended at t=7d, between the write (t=0) and the read (t=8d), and the memory system has no representation for that ending.
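The timeline above can be replayed against both broken strategies. The following is a minimal sketch: OverwriteStore and PreserveStore are hypothetical toy classes invented for illustration, not part of mempill, and times are plain day counts.

```python
# Hypothetical toy stores, not mempill code. Times are in days.

class OverwriteStore:
    """Silent overwrite: one value per key, history destroyed."""

    def __init__(self):
        self._data = {}

    def write(self, key, value, t):
        self._data[key] = (value, t)  # the previous value is lost here

    def read(self, key):
        return self._data[key][0]

    def history(self, key):
        return [self._data[key]]  # can never hold more than one entry


class PreserveStore:
    """Unconditional preserve: every value kept, none marked stale."""

    def __init__(self):
        self._data = {}

    def write(self, key, value, t):
        self._data.setdefault(key, []).append((value, t))

    def read_all(self, key):
        return list(self._data[key])  # no field says which entry is current


overwrite, preserve = OverwriteStore(), PreserveStore()
for store in (overwrite, preserve):
    store.write("user.city", "Berlin", t=0)

# t=7: the user moves; neither store observes it.
# t=8: both stores still answer "Berlin", with nothing marking it as suspect.
answer_at_t8 = overwrite.read("user.city")
preserved_at_t8 = preserve.read_all("user.city")

# t=9: the agent finally learns the new city.
for store in (overwrite, preserve):
    store.write("user.city", "Amsterdam", t=9)

overwrite_history = overwrite.history("user.city")
preserve_history = preserve.read_all("user.city")
```

After the write at t=9, the overwrite store can no longer show that it ever believed "Berlin". The preserve store keeps both entries, but neither records when "Berlin" stopped being true.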

Why stochastic extractors make this worse

LLM-based extractors are not recorders — they paraphrase, infer, and summarize. A paraphrased re-ingestion of a recalled falsehood looks like fresh evidence. Without a provenance firewall and a conflict-surfacing mechanism, a single stale belief can be re-amplified across many agent turns until it dominates memory. See Provenance Firewall for the defense against this.

The bi-temporal solution

mempill addresses temporal validity by maintaining two independent time axes on every claim:

Transaction-time (tx_time) is assigned by the engine at write time and is reliable and monotone. It answers: “when did the engine learn this?”
Valid-time (valid_time) is caller-supplied and confidence-tagged. It answers: “when was this true in the real world?” It is fallible: the caller may estimate it from context, and that estimate carries a valid_time_confidence score.

Because valid-time is a first-class dimension, mempill can surface “this claim’s valid window has ended” rather than silently returning a stale value. The two axes are kept deliberately separate because they answer different questions and are produced by different processes with different reliability.
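As an illustration of the two axes, a claim record might look roughly like the sketch below. Only tx_time and valid_time_confidence are names taken from this page. Splitting valid-time into valid_from and valid_to, the read_as_of helper, and the status strings are assumptions made for the example and are not mempill's actual API. The sketch also assumes the end of the valid window has already been recorded.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass(frozen=True)
class Claim:
    key: str
    value: str
    tx_time: datetime              # assigned by the engine at write time
    valid_from: datetime           # caller-supplied estimate (hypothetical field)
    valid_to: Optional[datetime]   # None means the window is still open (hypothetical field)
    valid_time_confidence: float   # caller-supplied, 0.0 to 1.0


def read_as_of(claim: Claim, at: datetime):
    """Return (value, status); status says whether `at` falls inside the valid window."""
    if at < claim.valid_from:
        return claim.value, "not_yet_valid"
    if claim.valid_to is not None and at >= claim.valid_to:
        return claim.value, "valid_window_ended"
    return claim.value, "valid"


t0 = datetime(2024, 1, 1, tzinfo=timezone.utc)
berlin = Claim(
    key="user.city",
    value="Berlin",
    tx_time=t0,
    valid_from=t0,
    valid_to=t0 + timedelta(days=7),  # window closed once the move is known
    valid_time_confidence=0.9,
)

during = read_as_of(berlin, t0 + timedelta(days=3))
after = read_as_of(berlin, t0 + timedelta(days=8))
```

Here the read at day 8 still returns "Berlin", but with the status "valid_window_ended", so the caller can treat it as history instead of as current fact.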

What “Contested” means

When a new claim arrives that contradicts an existing one, mempill does not pick a winner. It surfaces Contested — an explicit signal that two claims disagree and no resolution has been reached. This is invariant I7.

Contested is not an error. It is the correct output when the system has received conflicting information and genuinely cannot determine which value reflects the current state of the world. Forcing a winner would mean the system is manufacturing certainty it does not have.

An oracle — human or automated — can adjudicate and produce a supersession that closes one valid-time window and opens another. See Oracle Resolution Loop for how that works.
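The behavior described in this section can be sketched as follows. Everything here is hypothetical: the Claim and Contested classes and the read and supersede functions are invented for the example, times are plain day counts, and the real engine may represent these concepts quite differently.

```python
from dataclasses import dataclass, replace
from typing import Optional, Tuple


@dataclass(frozen=True)
class Claim:
    key: str
    value: str
    valid_from: int                  # days, for brevity
    valid_to: Optional[int] = None   # None means the window is open


@dataclass(frozen=True)
class Contested:
    key: str
    claims: Tuple[Claim, ...]


def read(claims, key):
    """Return the open claim, or Contested when open claims disagree."""
    open_claims = [c for c in claims if c.key == key and c.valid_to is None]
    if len({c.value for c in open_claims}) > 1:
        return Contested(key, tuple(open_claims))  # no winner is picked
    return open_claims[-1] if open_claims else None


def supersede(claims, key, winner_value, at):
    """Apply an oracle verdict: close every losing open window at `at`."""
    return [
        replace(c, valid_to=at)
        if c.key == key and c.valid_to is None and c.value != winner_value
        else c
        for c in claims
    ]


claims = [
    Claim("user.city", "Berlin", valid_from=0),
    Claim("user.city", "Amsterdam", valid_from=7),
]

before = read(claims, "user.city")                          # Contested
resolved = supersede(claims, "user.city", "Amsterdam", at=7)
after = read(resolved, "user.city")                         # Amsterdam claim
```

The "Berlin" claim is not deleted by the verdict. Its window is closed at day 7, so the history stays auditable while reads return a single current value.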

The stochastic-truth boundary

mempill’s architecture rests on a hard distinction:

Stochastic proposers (LLM extractors, sensors, oracles) produce proposals. They may be wrong.
The deterministic core (gate C7, reconciler C3, truth engine C2) makes all commit decisions. It contains no model.

This means the system cannot manufacture truth from a stochastic source alone. A belief can be internally consistent, unchallenged by retrieval logic, yet factually false. When that happens, correctness pressure must come from outside: a human oracle, a tool-based verification, or a sensor contradiction. The engine’s job is to organize claims honestly and surface ambiguity, not to judge truth. See Key Invariants for the full invariant set that enforces this boundary.
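One way to picture the boundary is a commit decision written as a pure function of its inputs, with no model call inside it. The sketch below is an assumption-laden illustration: Proposal, decide, the source labels, and the three rules are invented here and do not mirror the actual gate C7, reconciler C3, or truth engine C2.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Proposal:
    key: str
    value: str
    source: str  # hypothetical labels: "user_form", "llm_extractor", "memory_recall"


def decide(proposal: Proposal, current_value: Optional[str]) -> str:
    """Deterministic decision: the same inputs always give the same answer."""
    if proposal.source == "memory_recall":
        return "reject"      # recalled content is not fresh evidence (assumed rule)
    if current_value is None or proposal.value == current_value:
        return "commit"      # nothing to contradict
    return "contested"       # disagreement is surfaced, not resolved


first = decide(Proposal("user.city", "Berlin", "user_form"), None)
conflict = decide(Proposal("user.city", "Amsterdam", "llm_extractor"), "Berlin")
echo = decide(Proposal("user.city", "Berlin", "memory_recall"), "Berlin")
```

The proposer may be as noisy as it likes. Because decide contains no randomness and no model, any wrong commit can be traced to an input instead of to the decision logic.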

Next steps

Bi-temporal Claim Store — the data model behind the two axes
Contested and Dispositions — the 12-state model
The Adjudication Gate — what happens when claims conflict
Oracle Resolution Loop — how Contested claims get resolved