Human-in-the-Loop Oracle

Overview

The human-in-the-loop (HITL) oracle is a reference oracle implementation that routes contested claims to a human reviewer rather than an automated resolver. It implements the OraclePort duck-typed protocol in Python and is the pattern used by the mempill demo’s /review command.

HITL is appropriate when:

The claim concerns high-criticality facts where automated resolution is unsafe (e.g., medication allergies, financial status).
No automated oracle is available, or none is confident enough.
The belief is Contested and the correct value must come from a human with domain knowledge.

How it works

The HITL oracle is stateless. It does not maintain its own queue. When the engine calls request_adjudication(), the oracle generates a correlation UUID and returns it immediately. The engine writes the full AdjudicationRequest (including both conflicting values) to the durable pending_adjudications table. The review UI reads from that table — not from the oracle object.

[Conflict detected by engine]
        │
        ▼
oracle.request_adjudication() called
        │
        ▼
HumanOracle generates handle_id (UUID), returns immediately
        │
        ▼
Engine writes {handle_id, incumbent_value, challenger_value, ...}
to pending_adjudications DB table (durable across restart)
        │
        ▼
[Out-of-band review]
  Human calls engine.list_pending_adjudications()
  Human inspects incumbent vs. challenger values
  Human picks challenger / incumbent / abstain (skip defers: no engine call)
        │
        ▼
engine.submit_adjudication(handle_id, verdict)
        │
        ▼
Disposition flips atomically (I9); handle consumed
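
The stateless handoff in the flow above can be sketched as follows. This is an illustration only: the class name `StatelessHumanOracle`, the single `request` argument, and the string return type are assumptions, and the real OraclePort signature may differ.

```python
import uuid


class StatelessHumanOracle:
    """Illustrative stand-in for a HITL oracle; not the mempill implementation."""

    def request_adjudication(self, request):
        # The request is deliberately not stored here. Persisting it is the
        # engine's job (the pending_adjudications table), not the oracle's.
        return str(uuid.uuid4())


oracle = StatelessHumanOracle()
request = {"incumbent_value": "Alice", "challenger_value": "Bob"}
first = oracle.request_adjudication(request)
second = oracle.request_adjudication(request)
```

Each call yields a fresh correlation handle and the oracle object keeps no attributes, which is why nothing is lost if the process holding it exits.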

The four review choices

When a human reviews a pending conflict, four choices are available:

Choice	Verdict sent	Engine behavior
challenger (`c`)	`Affirm`	Challenger value wins; incumbent bounded to `Superseded`. Resolution carries `External(ExternalFirstHand)` provenance.
incumbent (`i`)	`Deny`	Incumbent confirmed; challenger bounded to `Superseded`.
skip (`s`)	(none — defer)	Handle left pending. No engine call. Reappears on next `/review`.
abstain (`a`)	`Unknown`	Undecidable; handle consumed; both claims remain `Contested`. Removed from the queue.
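
One way to express the table in code is a small dispatch helper. The payload shape mirrors the `submit_adjudication` call used elsewhere on this page; `apply_choice` and `RecordingEngine` are hypothetical names introduced for this sketch, and the stub engine only records calls instead of resolving anything.

```python
# Key pressed by the reviewer -> verdict string sent to the engine.
# "s" (skip) is intentionally absent: skipping makes no engine call.
VERDICT_FOR_CHOICE = {"c": "Affirm", "i": "Deny", "a": "Unknown"}


def apply_choice(engine, item, choice):
    """Submit the verdict for one pending item, or return None on skip."""
    if choice == "s":
        return None  # handle stays pending and reappears on the next /review
    verdict = VERDICT_FOR_CHOICE.get(choice)
    if verdict is None:
        raise ValueError(f"unknown review choice: {choice!r}")
    return engine.submit_adjudication({
        "handle_id": item["handle_id"],
        "verdict": verdict,
        "evidence_provenance": {"type": "External", "kind": "ExternalFirstHand"},
    })


class RecordingEngine:
    """Hypothetical stub that records submissions instead of resolving them."""

    def __init__(self):
        self.submitted = []

    def submit_adjudication(self, payload):
        self.submitted.append(payload)
        return {"handle_id": payload["handle_id"], "verdict": payload["verdict"]}


engine = RecordingEngine()
item = {"handle_id": "example-handle"}
skipped = apply_choice(engine, item, "s")
affirmed = apply_choice(engine, item, "c")
```

Keeping skip out of the verdict mapping makes the "no engine call" rule structural rather than a convention the caller has to remember.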

Verdicts carry External authority

Human verdicts are submitted with evidence_provenance: External(ExternalFirstHand). This is intentional: the human is acting as an external authority resolving an ambiguous fact. The resolved claim carries maximum external weight — it can overturn future conflicting claims via the normal gate logic without re-escalating to the oracle.

UserAsserted provenance is for claims ingested directly via ingest_claim. ExternalFirstHand is for oracle-resolved claims, including HITL verdicts.

Durable queue across restart

The review queue survives process restarts because the pending_adjudications table is stored in the engine's SQLite database file. HumanOracle itself is stateless and holds no in-memory queue. After a restart:

engine.list_pending_adjudications() reads the DB directly and returns all status='pending' rows.
The review queue is fully intact — the human sees the same pending items as before the restart.
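
The restart behavior can be imitated with the standard `sqlite3` module. Only the table name, the `status='pending'` filter, and the `handle_id` / `incumbent_value` / `challenger_value` fields come from this page; the rest of the schema, the `'resolved'` status value, and the helper names are assumptions made so the sketch runs on its own.

```python
import os
import sqlite3
import tempfile

# A throwaway file stands in for the agent database.
DB_PATH = os.path.join(tempfile.mkdtemp(), "agent.db")


def enqueue(path, rows):
    """Write rows and close the connection, as a process would before exiting."""
    conn = sqlite3.connect(path)
    try:
        conn.execute(
            "CREATE TABLE IF NOT EXISTS pending_adjudications ("
            "handle_id TEXT PRIMARY KEY, incumbent_value TEXT, "
            "challenger_value TEXT, status TEXT NOT NULL)"
        )
        conn.executemany(
            "INSERT INTO pending_adjudications VALUES (?, ?, ?, ?)", rows
        )
        conn.commit()
    finally:
        conn.close()


def list_pending(path):
    """Open a fresh connection (the 'restart') and read pending rows only."""
    conn = sqlite3.connect(path)
    conn.row_factory = sqlite3.Row
    try:
        rows = conn.execute(
            "SELECT handle_id, incumbent_value, challenger_value "
            "FROM pending_adjudications WHERE status = 'pending' "
            "ORDER BY handle_id"
        ).fetchall()
        return [dict(row) for row in rows]
    finally:
        conn.close()


enqueue(DB_PATH, [
    ("h1", "Alice", "Bob", "pending"),
    ("h2", "Berlin", "Munich", "resolved"),
])
pending_after_restart = list_pending(DB_PATH)
```

Because the reader opens its own connection and never touches an oracle object, the queue contents depend only on what was committed to the file.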

Using HITL in Python

import mempill
from mempill_demo.adapters.human_oracle import HumanOracle

# Wire the HITL oracle at construction time.
oracle = HumanOracle()
engine = mempill.open_oracle("/path/to/agent.db", oracle)

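# Ingest a claim that conflicts with an existing one. This assumes an incumbent
# value for acme:ceo / held_by (for example "Alice") is already stored.
# Because an oracle is wired in, the claim goes to QueuedForAdjudication.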
resp = engine.ingest_claim({
    "agent_id": "my-agent",
    "subject": "acme:ceo",
    "predicate": "held_by",
    "value": "Bob",
    "provenance": {"type": "External", "kind": "UserAsserted"},
    "cardinality": "Functional",
    "confidence": {"value_confidence": 0.9, "valid_time_confidence": 0.0},
    "criticality": "High",
    "derived_from": [],
})
print(resp["disposition"])  # QueuedForAdjudication

# Review pending conflicts.
pending = engine.list_pending_adjudications(agent_id="my-agent")
for item in pending:
    print(f"{item['predicate']}: {item['incumbent_value']} vs {item['challenger_value']}")
    # Human decision: challenger wins.
    outcome = engine.submit_adjudication({
        "handle_id": item["handle_id"],
        "verdict": "Affirm",
        "evidence_provenance": {"type": "External", "kind": "ExternalFirstHand"},
    })
    print(outcome["disposition"])  # CommittedCheap (challenger won)

The `/review` command in the demo

The mempill demo exposes a /review REPL command that drives the review loop interactively:

> /review

2 conflict(s) awaiting review:

[1] predicate: held_by
    Incumbent: Alice
    Challenger: Bob
    Queued: 2026-06-24T10:15:00Z
    [c]hallenger / [i]ncumbent / [s]kip-ask-later / [a]bstain? c
    Resolved: CommittedCheap  (Bob wins; Alice → Superseded)

[2] predicate: city
    Incumbent: Berlin
    Challenger: Munich
    Queued: 2026-06-24T11:30:00Z
    [c]hallenger / [i]ncumbent / [s]kip-ask-later / [a]bstain? s
    Deferred — will ask again next /review

Review complete.

TTL and sweep

If you configure a default_adjudication_ttl on the engine, handles that outlive it are swept by sweep_expired_adjudications(), which reverts the affected claims to Contested and returns the number of handles swept. The demo calls this on startup:

n = engine.sweep_expired_adjudications()
if n > 0:
    print(f"[startup] Reverted {n} expired adjudication(s) to Contested.")

With no TTL configured (the demo default), pending items stay in the queue until a human resolves them or abstains.
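
The sweep semantics can be modelled in a few lines of plain Python. This is not the engine's code: the item fields (`status`, `disposition`, `queued_at`), the `'expired'` status value, and the strict "older than TTL" comparison are all assumptions chosen for illustration.

```python
from datetime import datetime, timedelta, timezone


def sweep_expired(items, ttl, now):
    """Revert items older than ttl to Contested; return how many were reverted."""
    if ttl is None:
        return 0  # no TTL configured: items wait until a human acts
    reverted = 0
    for item in items:
        if item["status"] == "pending" and now - item["queued_at"] > ttl:
            item["status"] = "expired"
            item["disposition"] = "Contested"
            reverted += 1
    return reverted


def sample_queue():
    """Two pending items, using the queue times from the /review transcript."""
    return [
        {"handle_id": "h1", "status": "pending",
         "disposition": "QueuedForAdjudication",
         "queued_at": datetime(2026, 6, 24, 10, 15, tzinfo=timezone.utc)},
        {"handle_id": "h2", "status": "pending",
         "disposition": "QueuedForAdjudication",
         "queued_at": datetime(2026, 6, 24, 11, 30, tzinfo=timezone.utc)},
    ]


now = datetime(2026, 6, 25, 0, 0, tzinfo=timezone.utc)
queue = sample_queue()
reverted = sweep_expired(queue, timedelta(hours=13), now)
untouched = sweep_expired(sample_queue(), None, now)
```

With a 13-hour TTL only the first item (queued 13 h 45 min earlier) is reverted; with no TTL nothing is swept.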

LangGraph integration

The mempill LangGraph agent supports /review at the REPL level (intercepted before graph.invoke()), so the review loop does not block or interrupt the conversation graph. The graph’s retrieve_memory node surfaces a banner when pending conflicts exist:

[2 unresolved conflict(s) pending. Type /review to resolve.]

The graph never calls submit_adjudication automatically — human judgment is required. Only the REPL-level /review command drives resolution.
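
A rough sketch of REPL-level interception, assuming the REPL receives each input line as a string. `handle_input`, `pending_banner`, and the two callbacks are hypothetical names; in the demo the second callback would wrap `graph.invoke()`.

```python
def handle_input(line, run_review, invoke_graph):
    """Route /review to the review loop; send everything else to the graph."""
    if line.strip() == "/review":
        return run_review()
    return invoke_graph(line)


def pending_banner(count):
    """Banner text shown when conflicts are pending; empty string otherwise."""
    if count <= 0:
        return ""
    return f"[{count} unresolved conflict(s) pending. Type /review to resolve.]"


calls = []
handle_input("/review", lambda: calls.append("review"),
             lambda text: calls.append(("graph", text)))
handle_input("who is the CEO?", lambda: calls.append("review"),
             lambda text: calls.append(("graph", text)))
banner = pending_banner(2)
```

Because the command is matched before the graph is invoked, the review loop never appears as a node or interrupt inside the conversation graph.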

Key invariants

I7 — Contested first-class: abstain → Unknown → both claims stay Contested; never silently picks the incumbent.
I9 — Atomic commit unit: submit_adjudication acquires the per-agent write lock; the resolution is atomic.
I4 — Provenance immutable: HITL verdicts write External(ExternalFirstHand) provenance at resolution time; immutable thereafter.
I1 — Non-destruction: incumbent and challenger are never deleted. After Deny, the challenger is Superseded but remains in history.

Next steps

Oracle Resolution Loop — the engine-side mechanics behind HITL
Contested and Dispositions — the full disposition model
The Adjudication Gate — what routes a claim to QueuedForAdjudication