White paper · Trust & safety · Agentic AI · Architecture

Grounding Promises in the Sandbox: An Environment-Grounded Commitment Protocol for Trained Autonomous Agents

The Verifiable Commitment Protocol can prove that an agent made a promise, but it cannot always decide whether the promise was kept. This paper extends VCP into the Environment-Grounded Commitment Protocol, which uses the sandboxed training environment itself as the trusted oracle that decides whether a commitment was discharged and turns that verdict into an attributable training signal.

Teleperson Team · May 2026 · 16 min read

A companion protocol, the Verifiable Commitment Protocol (VCP), gives autonomous agents a way to form cryptographically attributable, non-repudiable commitments, but it carries an explicit grounding limitation: among mutually distrustful agents, the discharge of a commitment over a non-self-evident world fact cannot be decided without trusting an external oracle, which VCP can only minimize and name. Independently, the agentic-modeling literature has converged on the environment service as the central abstraction for training and evaluating agents—a lightweight, sandboxed, reproducible service with programmatic ground-truth checks, exemplified by Microsoft Research’s Orchard framework and its Orchard Env component. We observe that such an environment is precisely the object VCP’s grounding limitation says cannot generally be assumed: a sound, total, replayable discharge oracle for the commitments exercised within it. We exploit this to extend VCP into the Environment-Grounded Commitment Protocol (EGCP), an oracle-parameterized protocol in which (i) the environment is a first-class participant, the Grounding Oracle, whose signed attestations decide commitment antecedents and consequents; (ii) every commitment-lifecycle transition is bound to a position in the agent’s trajectory, yielding a signed commitment-annotated trajectory; (iii) the decidable per-commitment verdict inherited from VCP becomes a dense, attributable credit-assignment signal—a principled, automatic instance of process supervision; and (iv) the protocol degrades exactly to baseline VCP, re-inheriting its grounding limitation, when the environment oracle is absent, so a single signed commitment ledger is meaningful both in training (strong oracle) and at deployment (weak oracle). 
We prove that EGCP is a conservative extension of VCP (no VCP guarantee is weakened, for any oracle), that under a total environment oracle its residual trust is zero (a strict strengthening VCP cannot achieve in the open setting), that the verdict-derived training signal is attributable and non-repudiation-safe, and that ledgers are sound under oracle refinement across the train→deploy boundary. We delimit the irreducible residual—environment fidelity—and instantiate EGCP concretely on the three Orchard recipes (coding, GUI, personal assistant), reviewing that framework and the credit assignment problem it addresses.

1. Introduction

Two research programs have advanced in parallel without meeting. The first asks how autonomous agents can make agreements that mean something: the companion Verifiable Commitment Protocol (VCP) gives agents typed, conditional, delegated commitment objects with a cryptographically attributable, non-repudiable lifecycle, on top of agent interoperability protocols [1, 9, 10]. The second asks how agents are built: the agentic-modeling literature now treats a sandboxed environment service as the load-bearing abstraction for producing training data, running rollouts, and scoring outcomes. Microsoft Research’s Orchard is a recent, concrete instance: an open-source framework whose core, Orchard Env, is described as a lightweight environment service providing reusable primitives for sandbox lifecycle management across task domains, agent harnesses, and pipeline stages, on top of which three training recipes are built for coding, GUI navigation, and personal-assistant agents [2].

These programs share a hidden seam. VCP’s central theoretical concession is its grounding limitation: among mutually distrustful agents, whether a commitment whose consequent is a non-self-evident world fact has been discharged cannot be decided from the protocol transcript alone; a correct verdict requires an external attestation, and VCP can only confine and name that trust, not remove it [1]. Now consider what an environment service is. A sandbox with programmatic checks—did the test suite go from fail to pass, did the DOM reach the target state, did the assistant task complete—is a sound, total, and replayable decision procedure for exactly the kind of atomic facts VCP cannot otherwise ground. The object VCP’s impossibility result excludes in the open world is, in the training and evaluation setting, sitting right there as infrastructure.

This paper extends VCP by taking that observation seriously. We define the Environment-Grounded Commitment Protocol (EGCP), an oracle-parameterized protocol that makes the environment a first-class protocol participant—the Grounding Oracle—and binds each commitment-lifecycle transition to a position in the agent’s trajectory, the observation/action sequence that agentic frameworks already record and train on. Three consequences follow. First, within an environment that is a total oracle for the commitment content language, EGCP’s residual trust drops to zero: a strict strengthening of VCP that VCP provably cannot achieve in the open setting. Second, VCP’s decidable-verdict property becomes a dense, attributable credit-assignment signal: each environment-attested discharge or breach is a localized, non-repudiable learning signal anchored to the trajectory segment that produced it—an automatic, principled instance of process supervision, which is known to outperform sparse outcome supervision on long multi-step tasks [4, 5], and a formal footing for the “learn from productive segments of unresolved trajectories” heuristic Orchard reports under credit-assignment SFT [2]. Third, because EGCP is parameterized by the oracle and degrades exactly to VCP when the oracle is null, the same signed commitment ledger is valid in training (strong oracle) and at deployment (weak or named oracle): a continuity of “what the agent committed to” across the train→deploy boundary that neither program currently provides.

Contributions.

  • A review of the agentic-environment abstraction as exemplified by Orchard, and an identification of the seam between it and VCP’s grounding limitation (Sections 1, 2, 6).
  • EGCP: an oracle-parameterized extension of VCP in which the environment is the Grounding Oracle and commitments are bound to trajectory positions, with an explicit oracle ordering and a definition of the commitment-annotated trajectory (Sections 3, 4).
  • Theorems: EGCP is a conservative extension of VCP (no VCP guarantee weakened, any oracle); under a total environment oracle the grounding residual is zero (strict strengthening); the verdict-derived credit signal is attributable and non-repudiation-safe; and ledgers are sound under oracle refinement across train→deploy (Section 5).
  • A concrete instantiation on the three Orchard recipes and a delimitation of the irreducible residual—environment fidelity, i.e. the specification-gaming / sim-to-real gap—relating it to multi-agent risk (Sections 6, 7).

This is a formal and architectural contribution; we make no empirical claims. Our engagement with Orchard is at the level of its publicly described architecture, recipes, and reported benchmarks [2]; we do not claim to evaluate its internals. Full proofs are in Appendix 9.

2. Background and Related Work

2.1 The base protocol: VCP and its grounding limitation

VCP equips agents with commitment objects C=⟨id,d,c,α,γ,τ_0,τ_{exp},κ⟩ (debtor d is committed to creditor c to bring about consequent γ if antecedent α holds), together with a decidable content language L_C, a finite lifecycle automaton with terminal verdicts, and signed evidence for creation, acceptance, and discharge that yields non-repudiation, authority-soundness, accountability, and a decidable verdict relative to an evidence oracle [1]. Its load-bearing limitation, proved there, is that for a consequent whose truth is not a function of the protocol transcript, no protocol among mutually distrustful agents can decide Discharged vs. Breached without trusting an external attestation; VCP confines that trust to a named oracle set but cannot eliminate it [1]. EGCP is exactly an instantiation and strengthening of VCP for the setting in which a strong such oracle exists by construction.

2.2 Agentic modeling and the environment service

A line of agentic-modeling systems treats a sandboxed environment with programmatic verification as the substrate for training and evaluation. Orchard makes this explicit: Orchard Env is described as a lightweight environment service exposing reusable sandbox-lifecycle primitives across domains, harnesses, and pipeline stages, with three recipes—Orchard-SWE (coding), Orchard-GUI (computer use), Orchard-Claw (personal assistant) [2]. Such verification is not new in spirit: execution- and test-grounded evaluation is the design of SWE-bench, where a model’s patch is accepted only if previously failing tests pass [3]. What the present paper adds is the recognition that an execution-grounded environment is, formally, the trusted total oracle VCP’s impossibility result presumes unavailable, and a protocol that exploits this while remaining a conservative extension of VCP.

2.3 Credit assignment and process supervision

The temporal credit-assignment problem (attributing a delayed outcome to the decisions that caused it) is foundational [7, 8]. Recent LLM work contrasts outcome supervision (feedback on the final result) with process supervision (feedback on each step), finding process supervision superior on hard multi-step tasks but expensive because it has required human step labels [4, 5]; preference-based RL more broadly shapes agent behavior from comparison feedback [6]. EGCP contributes an automatic process signal whose per-step labels are environment-attested commitment verdicts: dense like process supervision, but cryptographically attributable and free of human annotation. Orchard’s reported credit-assignment SFT (learning from productive segments of otherwise unresolved trajectories) is precisely the heuristic for which EGCP gives a formal, non-repudiable basis [2].

2.4 Identity, delegation, and multi-agent risk

The personal-assistant setting makes authority carriage essential: an assistant acts for a principal under a delegated scope, as in authenticated-delegation models [11]. EGCP inherits VCP’s delegation machinery unchanged and adds nothing to the authority model; its contribution there is that an environment-grounded ledger makes the assistant’s in-scope behavior auditable after the fact. The Cooperative AI risk taxonomy names commitment problems and miscoordination among advanced agents as core failure modes [12]; an unsound environment oracle (specification gaming) is, as we show, the precise place EGCP’s residual risk concentrates.

3. Problem Setting

3.1 Environments, trajectories, oracles

Definition 1 (Environment). An environment is E=⟨W,A,step,dec⟩: world states W, actions A, a (possibly stochastic) transition step:W×A→W, and a decision procedure dec:Atoms(L_C)×W→{0,1,⊥} that, for a ground atom p and world state w, returns p’s truth value at w, or ⊥ if undecidable in E.

Definition 2 (Trajectory). A trajectory is ρ=(o_0,a_0,o_1,a_1,…,a_{T−1},o_T) of observations o_t and actions a_t, generated by an agent harness interacting with E. Indices 0..T are trajectory positions.

Definition 3 (Grounding Oracle). The Grounding Oracle induced by E is O_E where, for a query on atom p raised at position t, O_E(p,t)=dec(p,w_t) with w_t the world state at t. E emits a signed attestation π=Sign_{sk_E}(id ∥ p ∥ O_E(p,t) ∥ t ∥ H(digest(w_t))).

Definition 4 (Oracle informativeness order). For oracles O,O′ over L_C, O≼O′ iff for every atom p and context, O(p)≠⊥⇒O′(p)=O(p) (i.e. O′ decides everything O does, agreeing where both decide). The null oracle O_⊥ returns ⊥ on every non-self-evident atom; a total oracle returns no ⊥. O_⊥ is the bottom of ≼.

VCP, in the open multi-agent world, operates with a named oracle that in the worst case is O_⊥ for world facts; this is exactly the content of its grounding limitation [1].
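The informativeness order of Definition 4 can be made concrete with a small sketch. It is illustrative only: it assumes oracles are plain functions over atom names, it drops the position and context arguments, and it uses `None` to stand in for ⊥. The names `table_oracle`, `refines`, `is_total`, and the two example atoms are hypothetical.

```python
from typing import Callable, Dict, Iterable, Optional

# Three-valued decision: True / False for a decided atom, None standing in for ⊥.
Decision = Optional[bool]
Oracle = Callable[[str], Decision]


def null_oracle(atom: str) -> Decision:
    """O_⊥: leaves every atom undecided."""
    return None


def table_oracle(table: Dict[str, bool]) -> Oracle:
    """Oracle backed by a finite table; atoms missing from the table are undecided."""
    return lambda atom: table.get(atom)


def refines(weak: Oracle, strong: Oracle, atoms: Iterable[str]) -> bool:
    """weak ≼ strong on `atoms`: wherever `weak` decides, `strong` returns the same value."""
    return all(weak(p) is None or strong(p) == weak(p) for p in atoms)


def is_total(oracle: Oracle, atoms: Iterable[str]) -> bool:
    """A total oracle never returns ⊥ on the given atoms."""
    return all(oracle(p) is not None for p in atoms)


# Hypothetical atoms and oracles, for illustration only.
ATOMS = ["tests_pass", "dom_at_target"]
env_oracle = table_oracle({"tests_pass": True, "dom_at_target": False})
deploy_oracle = table_oracle({"tests_pass": True})
```

In this toy setting the null oracle sits below both oracles, and `deploy_oracle` sits below `env_oracle` because it decides fewer atoms and agrees on the one it does decide. The converse fails, since `env_oracle` decides `dom_at_target` while `deploy_oracle` does not.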

3.2 Assumptions

Assumption 1 (Cryptography). Signatures are EUF-CMA secure; H is collision-resistant; honest parties’ keys (including the environment’s sk_E) are uncompromised unless stated. (Inherited from VCP [1, 10].)

Assumption 2 (Environment soundness, when assumed). Where stated, E is a sound oracle: whenever dec(p,w)∈{0,1} it equals the true value of p at w. Soundness is a property of E, not of EGCP; relaxing it is analyzed in Section 7.

Assumption 3 (Replayability). E exposes a digest digest(w_t) such that a verifier can re-derive or re-check dec(p,w_t) from the signed trajectory and environment seed. (Orchard Env’s sandbox-lifecycle primitives are the intended realization [2].)

4. The Environment-Grounded Commitment Protocol

4.1 EGCP as VCP parameterized by an oracle

Write VCP(O) for the base protocol run with evidence oracle O [1]. EGCP is the protocol that (i) instantiates O:=O_E, (ii) makes E a signing participant that issues attestations π for detach/discharge queries, and (iii) anchors every commitment-lifecycle transition to a trajectory position. We write EGCP(O_E) and, for the degenerate case, EGCP(O_⊥).

Definition 5 (Trajectory anchoring). A transition evidence E_k for commitment C (as in VCP) is anchored when it additionally signs the position t_k and a digest of the trajectory prefix: E_k=Sign_{sk_{x_k}}(id ∥ s_k ∥ H(E_{k−1}) ∥ t_k ∥ H(ρ_{0:t_k})), where ρ_{0:t_k} is the trajectory prefix up to t_k and x_k the acting party.

Definition 6 (Commitment-annotated trajectory). A commitment-annotated trajectory is ρ̂=(ρ,Γ) where ρ is a trajectory and Γ is a set of EGCP commitment records whose every transition is anchored to a position in ρ and whose detach/discharge transitions carry environment attestations π from O_E.
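A minimal sketch of trajectory anchoring (Definition 5) follows. It rests on several stated assumptions: HMAC-SHA256 stands in for the EUF-CMA signature scheme (a real deployment would need a public-key signature so that third parties can verify), SHA-256 plays the role of H, JSON with sorted keys plays the role of the deterministic serialization, and the trajectory is simplified to one list entry per position. All function names, the key, and the sample rollout are hypothetical.

```python
import hashlib
import hmac
import json
from typing import List


def H(data: bytes) -> bytes:
    """Collision-resistant hash; SHA-256 is an illustrative choice."""
    return hashlib.sha256(data).digest()


def serialize(obj) -> bytes:
    """Deterministic serialization: sorted keys, fixed separators."""
    return json.dumps(obj, sort_keys=True, separators=(",", ":")).encode("utf-8")


def sign(key: bytes, message: bytes) -> bytes:
    """ASSUMPTION: HMAC-SHA256 is a stand-in for the signature scheme, not the scheme itself."""
    return hmac.new(key, message, hashlib.sha256).digest()


def verify(key: bytes, message: bytes, tag: bytes) -> bool:
    """Recompute the tag and compare in constant time."""
    return hmac.compare_digest(sign(key, message), tag)


def anchored_message(cid: str, state: str, prev_evidence: bytes,
                     position: int, trajectory: List[str]) -> bytes:
    """Serialize id, state, H(previous evidence), position, and H(trajectory prefix)."""
    prefix = trajectory[: position + 1]  # entries at positions 0..position inclusive
    fields = [cid, state, H(prev_evidence).hex(), position, H(serialize(prefix)).hex()]
    return serialize(fields)


trajectory = ["step-0", "step-1", "step-2", "step-3", "step-4"]  # hypothetical rollout
key = b"debtor-demo-key"                                          # hypothetical key material
msg = anchored_message("C-1", "Discharged", b"", 2, trajectory)
tag = sign(key, msg)

# Re-anchoring the same transition at position 4 yields a different message,
# so the tag issued for position 2 no longer verifies.
moved = anchored_message("C-1", "Discharged", b"", 4, trajectory)
```

The point of the sketch is the last step: because the position and the prefix digest are inside the signed message, moving a transition to another segment requires a fresh signature rather than a relabeling.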

4.2 Lifecycle with the environment in the loop

EGCP keeps VCP’s lifecycle automaton and operations (propose/accept/detach/discharge/flag-breach/…) and adds one rule: any transition whose VCP guard is an L_C predicate over world facts (detach when α holds; discharge when γ holds; breach when ¬γ at expiry) is gated by an environment attestation π for that predicate at the transition’s anchor position. All cryptographic evidence, authority checks, and the verdict function are exactly VCP’s.

Algorithm 1 (EGCP lifecycle with the environment in the loop).

  1. The agent proposes C at position t_0; the VCP scope/well-formedness gate passes; record anchored E_P.
  2. The counterpart (or the training harness acting as creditor) accepts; record anchored E_A; state ← Active.
  3. On reaching position t where the agent claims γ: query O_E(γ,t); the environment returns value v and signed π.
  4. If v=1: record anchored E_D with verdict Discharged and π.
  5. If v=0: record anchored E_D with verdict Breached and π.
  6. If v=⊥: fall back to VCP(O_⊥) behavior: the verdict is deferred to a named external oracle or left undetermined.
  7. Emit the commitment-annotated trajectory ρ̂=(ρ,Γ).
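The gating rule of this section can be sketched as a three-way branch on the oracle's answer. This is an illustration under stated assumptions, not a reference implementation: the oracle is a function returning a value and an attestation, `None` stands in for ⊥, the attestation is an unsigned placeholder string, and `Verdict`, `settle`, and `demo_oracle` are hypothetical names.

```python
from enum import Enum
from typing import Callable, Optional, Tuple


class Verdict(Enum):
    DISCHARGED = "Discharged"
    BREACHED = "Breached"
    DEFERRED = "Deferred"  # v = ⊥: fall back to VCP(O_⊥) handling


# A query returns (value, attestation); value None stands in for ⊥.
OracleQuery = Callable[[str, int], Tuple[Optional[bool], Optional[str]]]


def settle(consequent: str, position: int, query: OracleQuery) -> dict:
    """Gate the terminal transition on the environment's answer at the anchor position."""
    value, attestation = query(consequent, position)
    if value is True:
        verdict = Verdict.DISCHARGED
    elif value is False:
        verdict = Verdict.BREACHED
    else:
        # No environment decision: nothing is attested, the verdict is deferred.
        verdict, attestation = Verdict.DEFERRED, None
    return {"verdict": verdict, "anchor": position, "attestation": attestation}


def demo_oracle(atom: str, position: int) -> Tuple[Optional[bool], Optional[str]]:
    """Hypothetical environment: decides `tests_pass` at positions 3 and 5 only."""
    world = {3: {"tests_pass": True}, 5: {"tests_pass": False}}
    value = world.get(position, {}).get(atom)
    if value is None:
        return None, None
    # Placeholder for a signed attestation π; a real oracle would sign this content.
    return value, f"pi({atom}={int(value)}@{position})"
```

With this toy oracle, a claim of `tests_pass` at position 3 settles as Discharged, the same claim at position 5 settles as Breached, and any atom or position the environment cannot decide is deferred without an attestation.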

4.3 Verdict-derived credit

Definition 7 (Credit functional). Given a commitment-annotated trajectory ρ̂=(ρ,Γ) and a segment (i,j)⊆(0,T),

J(ρ̂,(i,j)) = ∑_{C∈Γ} w(C)⋅𝟙[the anchored discharge/breach transition of C lies in (i,j)]⋅σ(verdict(C)),

with σ(Discharged)=+1, σ(Breached)=−1, σ(other)=0, and weights w(C)≥0. J is dense (defined per segment) and attributable (nonzero only on the segment carrying the signed, environment-attested terminal transition).

This functional is the training-time payoff: a verdict that VCP makes decidable EGCP makes localizable and signed, so an RL or SFT pipeline can credit the exact rollout segment in which a sub-commitment was honored, even on a trajectory whose overall task failed.
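As a worked illustration of Definition 7, the sketch below evaluates J on a toy ledger. It assumes each commitment record is reduced to a verdict, an anchor position, and a weight, and it reads the segment as a closed interval of positions; both are simplifying assumptions, and `CommitmentRecord` and `credit` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Iterable, Optional

# σ from Definition 7; any verdict not listed here maps to 0.
SIGMA = {"Discharged": 1.0, "Breached": -1.0}


@dataclass(frozen=True)
class CommitmentRecord:
    commitment_id: str
    verdict: str            # "Discharged", "Breached", or any non-terminal state
    anchor: Optional[int]   # position of the terminal transition, None if there is none
    weight: float = 1.0     # w(C) >= 0


def credit(records: Iterable[CommitmentRecord], i: int, j: int) -> float:
    """J over the segment i..j, read here as a closed interval (an assumption)."""
    total = 0.0
    for r in records:
        in_segment = r.anchor is not None and i <= r.anchor <= j
        if in_segment:
            total += r.weight * SIGMA.get(r.verdict, 0.0)
    return total


records = [
    CommitmentRecord("C-1", "Discharged", anchor=3),
    CommitmentRecord("C-2", "Breached", anchor=7, weight=0.5),
    CommitmentRecord("C-3", "Active", anchor=None),
]
```

Here the segment 0..4 receives +1.0 from the discharge anchored at position 3, the segment 5..9 receives −0.5 from the weighted breach at position 7, and a segment containing no terminal transition receives zero, even though the rollout as a whole has a mixed outcome.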

5. Analysis

Proof sketches appear here; complete proofs in Appendix 9.

Theorem 1 (Conservative extension). For every oracle O, EGCP(O) preserves all VCP guarantees: non-repudiation, authority-soundness, accountability, and decidable verdict hold for EGCP(O) whenever they hold for VCP(O). Moreover EGCP(O_⊥) is observationally equivalent to VCP(O_⊥) up to the (inert) trajectory anchors.

Sketch. EGCP only adds signed fields (anchors t_k, H(ρ_{0:t_k})) and adds a signing participant (E) whose attestations are consumed exactly where VCP already consumed oracle attestations. Anchors extend the signed message, so the VCP non-repudiation/accountability reductions go through verbatim with a longer message; authority checks are untouched; the verdict function is VCP’s. With O_⊥, every world-fact gate falls to VCP’s O_⊥ branch and the anchors carry no semantic weight, giving observational equivalence. Full argument in Appendix 9.2. ◻

Theorem 2 (Zero residual under a total sound environment). If E is a sound, total oracle for L_C (Assumptions 2–3), then for every commitment C exercised in E, EGCP(O_E) decides verdict(C)∈{Discharged,Breached} with no residual trust beyond E itself. Equivalently, the grounding-limitation residual of VCP is zero in this setting, a guarantee VCP provably cannot attain in the open multi-agent world.

Sketch. VCP’s grounding limitation shows a transcript-only adjudicator must err across two worlds with equal transcripts but different world-fact truth. A total sound E supplies, for every relevant atom, a signed value equal to the true value (soundness) and never ⊥ (totality); the verdict function (VCP, decidable) then evaluates to a determinate terminal state. The only input beyond the transcript is E’s attestation, i.e. trust is exactly “trust E,” and no further external oracle is invoked—residual zero relative to E. That open-world VCP cannot achieve this is its Proposition on grounding [1]. Appendix 9.3. ◻

Theorem 3 (Attributable, non-repudiation-safe credit). J(ρ̂,(i,j))≠0 only if segment (i,j) contains an anchored discharge/breach transition E_D of some C∈Γ whose verdict is supported by an environment attestation π verifying under pk_E. Hence no trajectory segment can acquire credit via an unattributable or forged success claim, and credit is bound to the rollout location that produced it.

Sketch. By construction J sums only over commitments whose terminal transition is anchored into (i,j); that transition’s verdict is, by the EGCP lifecycle rule, gated by a π signed by E. Forging π or E_D contradicts EUF-CMA (Assumption 1) via the VCP non-repudiation reduction; mis-anchoring (claiming a different segment) changes H(ρ_{0:t_k}) and invalidates the signature. Hence nonzero credit entails a genuine, environment-attested, correctly located terminal transition. Appendix 9.4. ◻

Theorem 4 (Ledger soundness under oracle refinement, train→deploy). Let the commitment-annotated trajectory ρ̂ be produced under O_E (training) and later inspected under a deployment oracle O_{dep} with O_{dep}≼O_E. Then (a) every verdict environment-decided at training time remains valid and is not retroactively overturned by O_{dep}; (b) any commitment left undetermined under O_{dep} degrades exactly to the VCP(O_{dep}) guarantee (named-oracle or undetermined), never to an inconsistent state. Thus one signed ledger is sound across the boundary.

Sketch. ≼ requires agreement wherever both oracles decide, so a training-time {0,1} verdict cannot be contradicted by O_{dep}; it can only be left at ⊥ by a weaker deployment oracle, which is the VCP fallback branch. Signed evidence is immutable, so prior verdicts persist as non-repudiable records; monotonicity of the verdict under ≼ gives consistency. Appendix 9.5. ◻
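The train→deploy reading of Theorem 4 can be illustrated by re-reading a training ledger under a weaker oracle. The sketch assumes a ledger reduced to (atom, position, training-time value) triples and uses `None` for ⊥; `reread` and the status labels are hypothetical. A "conflict" status can arise only when the deployment oracle is not below the training oracle in ≼, which is exactly the case the theorem's hypothesis excludes.

```python
from typing import Callable, Dict, Optional, Tuple

Oracle = Callable[[str, int], Optional[bool]]
# commitment id -> (gating atom, anchor position, value attested at training time)
Ledger = Dict[str, Tuple[str, int, bool]]


def reread(ledger: Ledger, deploy: Oracle) -> Dict[str, str]:
    """Classify each training-time verdict when inspected under a deployment oracle."""
    statuses = {}
    for cid, (atom, position, trained_value) in ledger.items():
        seen = deploy(atom, position)
        if seen is None:
            statuses[cid] = "retained"   # deployment oracle is silent; signed record persists
        elif seen == trained_value:
            statuses[cid] = "confirmed"  # both oracles decide and agree
        else:
            statuses[cid] = "conflict"   # only possible if the refinement hypothesis fails
    return statuses


ledger = {
    "C-1": ("tests_pass", 3, True),
    "C-2": ("dom_at_target", 7, False),
}


def weak_deploy(atom: str, position: int) -> Optional[bool]:
    """Hypothetical deployment oracle: decides `tests_pass` only."""
    return True if atom == "tests_pass" else None


def inconsistent_deploy(atom: str, position: int) -> Optional[bool]:
    """Hypothetical oracle that violates the refinement hypothesis on `tests_pass`."""
    return False
```

Under `weak_deploy` no training verdict is flipped: one is confirmed and one is simply retained as a signed historical record. Under `inconsistent_deploy` the conflict on C-1 signals that the two oracles are not ordered by ≼, so the theorem does not apply.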

Remark 1. Theorems 1–4 jointly say: EGCP never costs anything relative to VCP (conservative), strictly strengthens it exactly to the extent the environment is informative (zero residual under a total sound oracle), yields a training signal that cannot be gamed by unattributable claims, and produces a single ledger valid from training through deployment. The one thing it cannot do is exceed the environment’s fidelity (Section 7).

5.1 Guarantee/limitation summary

Property | Mechanism | Result
No VCP guarantee lost | added signed fields only | Thm. 1
Grounding residual = 0 | total sound E as oracle | Thm. 2
Credit not gameable | env-attested anchored verdict | Thm. 3
Train→deploy ledger | oracle refinement monotone | Thm. 4
Residual risk | env unsound (spec-gaming) | §7

6. Instantiation on Orchard, and a Review of the Framework

What Orchard is.

Orchard is presented as an open-source framework for scalable agentic modeling whose core, Orchard Env, is a lightweight environment service offering reusable sandbox-lifecycle primitives across task domains, agent harnesses, and pipeline stages; three recipes are built on it: Orchard-SWE (coding; trajectories distilled at scale, with credit-assignment SFT to learn from productive segments of unresolved trajectories and a balanced adaptive rollout for RL, reporting state-of-the-art SWE-bench Verified results among comparable open models), Orchard-GUI (a compact vision-language computer-use agent trained from few distilled and open-ended tasks), and Orchard-Claw (a personal-assistant agent trained from few synthetic tasks) [2]. Its thesis is that a lightweight, open, harness-agnostic environment layer makes agentic data, recipes, and evaluations reusable across domains [2].

A fair assessment.

The strengths are the harness-agnostic environment-service abstraction, strong cross-domain results from small data, and the explicit attention to credit assignment on unresolved trajectories. The gap relevant here is structural rather than empirical: Orchard’s training and scoring signals are environment-success heuristics that are neither attributable as signed objects, nor portable as a record of what the agent committed to, nor equipped with an authority model for the assistant case. EGCP is designed to slot into exactly this gap without disturbing the framework: it adds a commit/detach/discharge/attest interface above Orchard Env’s create/reset/step/terminate lifecycle, turning each recipe’s existing programmatic check into a signing Grounding Oracle.

Recipe instantiations.

  • Orchard-SWE. L_C atoms are test outcomes and patch digests; the fail-to-pass criterion that defines SWE-bench grading [3] is a sound environment oracle. Sub-commitments (“this module’s tests will pass”) get environment-attested verdicts; J gives the credit-assignment-SFT objective a non-repudiable, localized form.
  • Orchard-GUI. Atoms are reached UI/DOM states; the oracle attests whether the target state was achieved at the anchored step, yielding dense process signal for long navigation rollouts where outcome reward is sparse [4].
  • Orchard-Claw. Atoms include task-completion predicates and the principal-scope checks EGCP inherits from VCP/authenticated delegation [1, 11]; the commitment-annotated trajectory becomes a deployment-time audit trail of what the assistant promised its principal and whether it delivered, carried intact from training by Theorem 4.
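To make the Orchard-SWE atom concrete, the sketch below shows one way a fail-to-pass predicate could be decided with a three-valued result. It is a simplified illustration, not SWE-bench's or Orchard's grading code: test results are assumed to be plain name-to-pass/fail maps, `None` stands in for ⊥, and `fail_to_pass` and the test names are hypothetical.

```python
from typing import Dict, Iterable, Optional


def fail_to_pass(before: Dict[str, bool], after: Dict[str, bool],
                 targets: Iterable[str]) -> Optional[bool]:
    """Decide the atom 'every target test went from failing to passing'.

    Returns None (⊥) when any target lacks a recorded result in either run,
    since the environment then has no basis for a decision.
    """
    targets = list(targets)
    if any(t not in before or t not in after for t in targets):
        return None
    return all((not before[t]) and after[t] for t in targets)


# Hypothetical results: True means the test passed in that run.
before = {"test_parse": False, "test_io": False, "test_misc": True}
after = {"test_parse": True, "test_io": False, "test_misc": True}
```

A sub-commitment over `test_parse` alone would be decided true here, one over both `test_parse` and `test_io` would be decided false, and one naming a test with no recorded result would be left undecided and fall to the VCP fallback branch.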

7. Limitations and Future Work

Environment fidelity is the irreducible residual. EGCP’s verdicts are exactly as sound as E. If the sandbox does not reflect deployment reality, or if the agent games the specification so that the environment-attested predicate is satisfied without the intended outcome obtaining, EGCP faithfully certifies a misleading verdict. This is the EGCP analogue of VCP’s grounding limitation, relocated from “no oracle” to “oracle fidelity,” and it connects directly to specification gaming and to the multi-agent risk taxonomy [12]. EGCP does not solve specification gaming; it makes the trusted object explicit and auditable, which is the precondition for addressing it.

Content-language expressiveness. As in VCP, decidability of L_C trades expressiveness; predicates an environment cannot decide fall to the O_⊥ branch.

Reward shaping. The credit functional is intentionally minimal; optimal shaping (potential-based, risk-sensitive) and its interaction with RL stability are open.

Multi-agent environments. We treat a single agent in E; commitments between two trained agents sharing an environment, and collusion through the environment, are future work.

No empirical evaluation. A reference EGCP layer over an open environment service, instantiated on the three recipe families, with measured effects on credit-assignment SFT and on deployment auditability, is the natural next step; this paper provides the protocol and its guarantees.

8. Conclusion

VCP told agents how to promise; it also proved it could not, in the open world, always tell whether a promise was kept. The agentic-modeling community, meanwhile, has been building exactly the missing piece without naming it: a sandboxed environment service is the trusted, total, replayable oracle VCP’s impossibility result presumes absent. EGCP joins the two. It is a conservative extension of VCP that, parameterized by the environment oracle, strengthens VCP to zero residual where the environment is informative, turns the now-grounded verdict into a dense and attributable credit-assignment signal, and emits one signed commitment ledger that is valid from the first training rollout to the deployed assistant’s audit trail. The remaining question is no longer whether a kept promise can be recognized, but whether the environment we trust to recognize it is the world we meant.

9. Complete Proofs

9.1 Preliminaries

We reuse VCP’s model [1]: signature scheme Σ EUF-CMA secure, collision-resistant H, deterministic injective serialization, commitment record R=⟨C,E_0,…,E_n⟩ with E_k=Sign_{sk_{x_k}}(id ∥ s_k ∥ H(E_{k−1})), honest validity predicate Valid, and the decidable verdict function over the finite, acyclic-into-terminal lifecycle automaton given an oracle. EGCP replaces each E_k by its anchored form E_k=Sign_{sk_{x_k}}(id ∥ s_k ∥ H(E_{k−1}) ∥ t_k ∥ H(ρ_{0:t_k})) and admits a distinguished signer E producing attestations π=Sign_{sk_E}(id ∥ p ∥ v ∥ t ∥ H(digest(w_t))).

9.2 Proof of Theorem 1 (Conservative extension)

Proof. We show each VCP guarantee transfers, then observational equivalence at O_⊥.

Non-repudiation and accountability. In VCP these reduce a repudiation or equivocation to an EUF-CMA forgery or hash collision on the signed message m_k=id ∥ s_k ∥ H(E_{k−1}). EGCP signs m_k′=m_k ∥ t_k ∥ H(ρ_{0:t_k}), a strict superstring. A forger against the EGCP signature is immediately a forger against the same scheme on m_k′; the reduction is identical with m_k′ in place of m_k, and m_k′ differs from any honestly signed message unless H collides (negligible). E’s attestations are signed analogously and add only another honest signer, enlarging the set of attributable parties without removing attributability of any existing one. Hence non-repudiation and accountability hold for EGCP(O) whenever they hold for VCP(O).

Authority-soundness. EGCP does not modify delegation credentials, scope predicates, or the acceptance guard; the VCP invariant “every Valid binding record has an in-scope unrevoked delegation chain” is preserved because the only added fields (t_k, prefix digest, E attestations) are not consulted by and do not weaken the scope check. The VCP proof goes through unchanged.

Decidable verdict. The verdict function is VCP’s, evaluated relative to whatever oracle is supplied; substituting O:=O_E (or any O) leaves the function total and computable by VCP’s argument: finiteness and acyclicity of the automaton plus decidability of L_C relative to the oracle. Anchors do not enter the verdict.

Observational equivalence at O_⊥. With O_⊥, every world-fact gate (detach/discharge/breach) takes the v=⊥ branch of the lifecycle algorithm in Section 4.2, i.e. the VCP(O_⊥) behavior. The only residual differences are the anchor fields inside signatures; these are not read by any EGCP rule when O=O_⊥ and do not alter message acceptance, lifecycle transitions, or verdicts. Define the erasure map deleting t_k and H(ρ_{0:t_k}) from each signed message; it is a bijection between EGCP(O_⊥) runs and VCP(O_⊥) runs preserving all observable predicates (acceptance, state, verdict, authority). Hence the two are observationally equivalent. ◻

9.3 Proof of Theorem 2 (Zero residual)

Proof. Let E be sound (Assumption 2) and total (no ⊥ on L_C atoms, in the sense of Definition 4) for the atoms occurring in C’s antecedent α and consequent γ. Consider the discharge decision. By the EGCP lifecycle rule, the discharge/breach transition is gated by O_E’s signed value v on γ at the anchor position t. Totality gives v∈{0,1} (never ⊥); soundness gives v= the true value of γ at w_t. The VCP verdict function, decidable relative to an oracle that assigns truth to all relevant atoms, then evaluates C to a unique terminal state Discharged (if v=1) or Breached (if v=0); analogously for detachment via α.

It remains to show the residual trust is exactly E and nothing more. The inputs to the verdict are: the signed transcript (commitment records, anchors) and the environment attestations π. The transcript is self-certifying under Assumption 1 (VCP non-repudiation, transferred by Theorem 1). The only non-transcript input is π, signed by E. No further external oracle is queried (totality removed the ⊥ branch that would invoke one). Therefore the trust base is the singleton {E}: residual zero relative to E.

Finally, that open-world VCP cannot attain this: VCP’s grounding proposition exhibits two executions with identical transcripts but opposite truth of a transcript-independent γ, forcing any transcript-only adjudicator to err in one [1]. EGCP escapes the hypothesis precisely because π is not transcript-derived—it is a fresh signed input functionally dependent on w_t—so the dichotomy does not apply. The improvement is thus real and is exactly the value of having a sound total oracle by construction, which the open multi-agent setting lacks by assumption. ◻

9.4 Proof of Theorem 3 (Attributable credit)

Proof. Suppose J(ρ̂,(i,j))≠0. By the definition of J, some C∈Γ contributes a nonzero term, which requires both (a) σ(verdict(C))≠0, i.e. verdict(C)∈{Discharged,Breached}, and (b) the anchored discharge/breach transition E_D of C lies in (i,j).

For (a): by the EGCP lifecycle rule a terminal world-fact verdict is recorded only together with an environment attestation π for the gating predicate; verdict reaches Discharged/Breached only via that gated transition. So a nonzero σ implies an accompanying π.

For attribution/unforgeability: π=Sign_{sk_E}(⋯) and E_D=Sign_{sk_d}(⋯ ∥ t_D ∥ H(ρ_{0:t_D})). Producing either without the corresponding secret key contradicts EUF-CMA (Assumption 1), by the VCP non-repudiation reduction transferred in Theorem 1. For location: E_D signs H(ρ_{0:t_D}); placing the credited transition at a position t′≠t_D (to harvest credit on a different, e.g. “productive looking,” segment) requires a valid signature over H(ρ_{0:t′})≠H(ρ_{0:t_D}) (distinct prefixes; equality is a hash collision, negligible), i.e. another forgery. Hence the only way to obtain nonzero credit on (i,j) is a genuine, E-attested terminal transition whose anchor truly lies in (i,j): credit is attributable, non-repudiation-safe, and correctly localized. In particular an agent cannot acquire credit by asserting success without an environment attestation, which is the reward-hacking failure mode sparse outcome reward is prone to [4]. ◻

9.5 Proof of Theorem 4 (Ledger soundness under refinement)

Proof. Let the commitment-annotated trajectory ρ̂ be generated under O_E and later evaluated under O_{dep}≼O_E.

(a) No overturning. Take a commitment C with a training-time verdict verdict_E(C)=Discharged (the Breached case is symmetric). This verdict was produced from O_E(γ,t)=1 with signed π. Consider O_{dep}(γ,t). By definition of ≼, O_{dep}≼O_E means wherever O_{dep} decides an atom it agrees with O_E where the latter decides; equivalently O_{dep} cannot return a value contradicting an O_E decision (it may only return ⊥). Hence O_{dep}(γ,t)∈{1,⊥}, never 0: the training verdict is never flipped. The signed E_D and π are immutable records (Assumption 1), so the historical verdict persists as a non-repudiable fact regardless of O_{dep}.

(b) Graceful degradation. If O_{dep}(γ,t)=⊥ for a commitment not terminalized at training time, the EGCP rule takes the v=⊥ branch, which is by definition the VCP(O_{dep}) behavior: the verdict is whatever VCP’s named-oracle or deferred handling yields, never an inconsistent or fabricated state. Thus under O_{dep} the ledger is exactly a valid VCP(O_{dep}) ledger plus a set of immutable, still-valid environment-attested verdicts from training.

Consistency (monotonicity). Define verdict_O as the verdict under oracle O. The above shows O≼O′ implies: for every C, if verdict_O(C)∈{Discharged,Breached} then verdict_{O′}(C)=verdict_O(C) (decided verdicts are stable upward), and otherwise verdict_{O′}(C) refines ⊥ monotonically. Hence the verdict is monotone under ≼ and a single signed ledger generated at training (large O_E) remains sound when read at deployment (smaller O_{dep}), which is the claim. ◻

References

[1] [Author]. From utterances to obligations: A verifiable commitment layer for agent-to-agent interaction. Companion preprint, 2026.

[2] B. Peng, W. Yao, Q. Wu, H. Cheng, X. Yu, R. Yang, T. Ge, A. Sordoni, X. Yuan, Y. Shen, P. He, T. Zhang, Z. Yu, and J. Gao. Orchard: An open-source agentic modeling framework. arXiv:2605.15040, 2026. Microsoft Research.

[3] C. E. Jimenez, J. Yang, A. Wettig, S. Yao, K. Pei, O. Press, and K. Narasimhan. SWE-bench: Can language models resolve real-world GitHub issues? In International Conference on Learning Representations (ICLR), 2024. arXiv:2310.06770. https://doi.org/10.48550/arXiv.2310.06770.

[4] H. Lightman, V. Kosaraju, Y. Burda, H. Edwards, B. Baker, T. Lee, J. Leike, J. Schulman, I. Sutskever, and K. Cobbe. Let’s verify step by step. arXiv:2305.20050, 2023.

[5] J. Uesato, N. Kushman, R. Kumar, F. Song, N. Siegel, L. Wang, A. Creswell, G. Irving, and I. Higgins. Solving math word problems with process- and outcome-based feedback. arXiv:2211.14275, 2022.

[6] P. F. Christiano, J. Leike, T. Brown, M. Martic, S. Legg, and D. Amodei. Deep reinforcement learning from human preferences. In Advances in Neural Information Processing Systems (NeurIPS), 2017.

[7] M. Minsky. Steps toward artificial intelligence. Proceedings of the IRE, 49(1):8–30, 1961.

[8] R. S. Sutton and A. G. Barto. Reinforcement Learning: An Introduction. MIT Press, 2nd edition, 2018.

[9] M. P. Singh. A social semantics for agent communication languages. In Issues in Agent Communication, LNAI 1916, pages 31–45. Springer, 2000. https://doi.org/10.1007/10722777_3.

[10] J. Zhou and D. Gollmann. A fair non-repudiation protocol. In Proceedings of the 1996 IEEE Symposium on Security and Privacy, pages 55–61, Oakland, CA, 1996.

[11] T. South, S. Marro, T. Hardjono, R. Mahari, C. D. Whitney, D. Greenwood, A. Chan, and A. Pentland. Authenticated delegation and authorized AI agents. arXiv:2501.09674, 2025.

[12] L. Hammond, A. Chan, J. Clifton, et al. Multi-agent risks from advanced AI. Cooperative AI Foundation, Technical Report #1; arXiv:2502.14143, 2025. https://doi.org/10.48550/arXiv.2502.14143.