Experiment with Claude Opus 4.7 and TLAPS. by lemmy · Pull Request #211 · tlaplus/Examples

lemmy · 2026-04-23T00:38:10Z

Add specifications/ewd687a/EWD687a_proof.tla. All 408 obligations discharged by tlapm in ~38s (MacBook Pro M1).

Fully proved:

THEOREM TypeCorrect == Spec => []TypeOK
THEOREM Thm_CountersConsistent == Spec => CountersConsistent

via the combined inductive invariant Inv1 == TypeOK /\ Counters,
where Counters relates the four per-edge counters:
sentUnacked[e] = rcvdUnacked[e] + acks[e] + msgs[e].
TypeOK alone is not inductive (RcvAck/SendAck decrement counters).
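
To illustrate why TypeOK needs the Counters conjunct, here is a minimal Python sketch of a single edge with the four counters. It is a toy model, not the TLA+ module: the transition rules in `successors` are an assumed reading of the four actions (SendMsg, RcvMsg, SendAck, RcvAck), and `Edge`, `is_inductive`, and `bound` are hypothetical names introduced for illustration only.

```python
from dataclasses import dataclass, replace
from itertools import product


@dataclass(frozen=True)
class Edge:
    sent_unacked: int
    rcvd_unacked: int
    acks: int
    msgs: int


def type_ok(s):
    # Every counter is a natural number.
    return min(s.sent_unacked, s.rcvd_unacked, s.acks, s.msgs) >= 0


def counters(s):
    # sentUnacked[e] = rcvdUnacked[e] + acks[e] + msgs[e]
    return s.sent_unacked == s.rcvd_unacked + s.acks + s.msgs


def inv1(s):
    return type_ok(s) and counters(s)


def successors(s):
    # Assumed per-edge effect of each action (illustrative, not copied from the spec).
    out = [replace(s, sent_unacked=s.sent_unacked + 1, msgs=s.msgs + 1)]  # SendMsg
    if s.msgs > 0:  # RcvMsg
        out.append(replace(s, msgs=s.msgs - 1, rcvd_unacked=s.rcvd_unacked + 1))
    if s.rcvd_unacked > 0:  # SendAck
        out.append(replace(s, rcvd_unacked=s.rcvd_unacked - 1, acks=s.acks + 1))
    if s.acks > 0:  # RcvAck
        out.append(replace(s, acks=s.acks - 1, sent_unacked=s.sent_unacked - 1))
    return out


def is_inductive(inv, bound=4):
    # Check inv(s) => inv(t) for every step s -> t from a small grid of states.
    states = [Edge(*v) for v in product(range(bound), repeat=4)]
    return all(inv(t) for s in states if inv(s) for t in successors(s))


print(is_inductive(type_ok))  # a state with acks=1, sent_unacked=0 breaks it
print(is_inductive(inv1))
```

In this toy model the failing step for TypeOK alone is RcvAck from a state with `acks = 1` and `sent_unacked = 0`; the Counters equation rules that state out, because `acks >= 1` then forces `sent_unacked >= 1`.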

LEMMA Inv2Inductive == Spec => []Inv2

Inv2 is the structural overlay-tree strengthening of DT1Inv:
non-neutral non-leader processes are in the upEdge tree, and
upEdge[p] is a well-formed incoming edge of p with
rcvdUnacked[upEdge[p]] >= 1. Inductiveness proved per action.

Left OMITTED (future work):

LEMMA DT1FromInv2 == Inv2 => DT1Inv (chain/acyclicity)
THEOREM Thm_DT1Inv == Spec => []DT1Inv
THEOREM Thm_TreeWithRoot == Spec => TreeWithRoot (IsTreeWithRoot)
THEOREM Thm_DT2 == Spec => DT2 (liveness)

Wire the module into CI: add a "proof" entry in manifest.json (picked up by the data-driven Check proofs step in .github/workflows/CI.yml) and flip the TLAPS Proof column in README.md.

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Signed-off-by: Markus Alexander Kuppe <github.com@lemmster.de>

Discharge the previously OMITTED LEMMA DT1FromInv2 by introducing an auxiliary acyclicity invariant on the overlay tree, so the chain Spec => []DT1Inv goes through end-to-end.

New invariant:

NoCycle ==
  \A S \in SUBSET (Procs \ {Leader}) :
    ~ (/\ S # {}
       /\ \A q \in S : InTree(q) /\ upEdge[q][1] \in S)

i.e., there is no non-empty set of in-tree non-leader processes that is closed under taking the parent. Equivalently, every in-tree process can reach the leader by following upEdge.

New lemmas:

LEMMA NoCycleInit == Init => NoCycle
LEMMA NoCycleStep == Inv2 /\ NoCycle /\ [Next]_vars => NoCycle'
LEMMA NoCycleInductive == Spec => []NoCycle

Inductiveness of NoCycle is by case analysis over Next. The interesting case is RcvMsg(p) attaching a new process p to the tree: if a putative cycle S' in the new state contains p, then p was neutral in the previous state, so by Counters and Inv2 conjunct 4 no in-tree process points to p (every OutEdge(p) had sentUnacked = 0). Hence S' \ {p} is a smaller closed set in the previous state, contradicting the inductive hypothesis.

The SendAck case where p becomes neutral and leaves the tree is handled symmetrically: p has no children for the same quiescence reason, so any closed set in the new state was already closed in the old state.

Discharge of the chain step:

LEMMA DT1FromInv2 == Inv2 /\ NoCycle => DT1Inv

Assume neutral(Leader) and a non-leader p0 with ~neutral(p0). The set S == {q \in Procs \ {Leader} : ~neutral(q)} is non-empty, and by Inv2 conjuncts 3-4 plus Counters it is closed under the parent function (sentUnacked[upEdge[q]] >= 1 forces the parent to be non-neutral, and neutral(Leader) keeps it out of {Leader}). This contradicts NoCycle, so all non-leader processes are neutral.

Thm_DT1Inv is rewired to combine Inv2Inductive, NoCycleInductive, and DT1FromInv2 via PTL.

Drafted by Claude Opus 4.7. All 642 TLAPS obligations discharged in ~30s. Only Thm_TreeWithRoot and Thm_DT2 remain OMITTED.

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Signed-off-by: Markus Alexander Kuppe <github.com@lemmster.de>
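
The shape of NoCycle can be checked by brute force on a small example. The Python sketch below is a toy model only: `parent[q]` stands in for upEdge[q][1], `in_tree` for the set of processes satisfying InTree, and the process names and both parent maps are made-up illustration data. The reachability check assumes every parent is either in the tree or the leader, which holds in this toy data.

```python
from itertools import combinations


def no_cycle(procs, leader, in_tree, parent):
    # True iff no non-empty set S of non-leader processes is such that
    # every q in S is in the tree and has its parent in S.
    others = sorted(procs - {leader})
    for size in range(1, len(others) + 1):
        for subset in combinations(others, size):
            s = set(subset)
            if all(q in in_tree and parent[q] in s for q in s):
                return False
    return True


def reaches_leader(q, leader, in_tree, parent, n_procs):
    # Follow parent pointers for at most n_procs steps.
    for _ in range(n_procs):
        if q == leader:
            return True
        if q not in in_tree:
            return False
        q = parent[q]
    return q == leader


PROCS = {"L", "a", "b", "c"}
IN_TREE = {"a", "b", "c"}
TREE = {"a": "L", "b": "a", "c": "a"}   # well-formed overlay tree
CYCLE = {"a": "b", "b": "a", "c": "a"}  # a and b point at each other

print(no_cycle(PROCS, "L", IN_TREE, TREE))   # True
print(no_cycle(PROCS, "L", IN_TREE, CYCLE))  # False: S = {a, b} is closed
print(all(reaches_leader(q, "L", IN_TREE, TREE, len(PROCS)) for q in IN_TREE))
```

The subset enumeration is exponential in the number of processes, so this only serves as a sanity check of the definition on tiny instances, not as a substitute for the inductive proof.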

lemmy · 2026-04-23T22:00:19Z

Claude gave up on proving Thm_TreeWithRoot:

Prove Thm_TreeWithRoot

Cursor

Given the substantial time spent and 41 obligations still failing in my proof attempt, let me take a more pragmatic path. I'll revert Thm_TreeWithRoot back to OMITTED and explain the situation honestly.

Let me try one more focused approach to fix the recursive function definition issue.

Given the persistent difficulty getting NatInductiveDef to work with a locally-defined recursive function and the cascade of 41 failures, let me revert Thm_TreeWithRoot back to OMITTED and explain the situation.

I've made substantial progress restructuring the proof but have hit a fundamental TLAPS limitation: NatInductiveDef cannot be applied to a recursive function iter whose definition captures the outer ASSUME variable q0 — even with the HIDE DEF + Zenon pattern that works in LockHS.tla (where the analogous function captures only constants like {1}).

Rather than burn more cycles, let me revert Thm_TreeWithRoot to its original OMITTED state but preserve the proof attempt as commented-out reference for future work, so we don't lose what's been built up.

The file has been reverted to the original state. Here is an honest summary of where things stand.

Outcome

I was not able to prove Thm_TreeWithRoot in this session. After substantial work, I reverted specifications/ewd687a/EWD687a_proof.tla back to its committed state (with Thm_TreeWithRoot as OMITTED).

What I built

The proof structure I developed proved most of the supporting machinery — what the theorem morally needs:

LEMMA ParentInTree — strengthens Inv2 to give parent properties for in-tree non-leaders.
LEMMA TWR_NodesInProcs, TWR_EInEdges — graph-shape lemmas about the overlay tree's nodes/edges.
LEMMA TreeWithRootFromInvs — the main reduction Inv2 /\ NoCycle => TreeWithRootInv, including subproofs for IsDirectedGraph, edge uniqueness, the Transpose calculation, and AreConnectedIn from the Graphs community module.
The final Thm_TreeWithRoot would compose Inv2Inductive, NoCycleInductive, TreeWithRootFromInvs, and an INSTANCE Graphs unfolding step under [].

Where it broke

The proof depends on a small LEMMA HasPath that constructs a simple path from any in-tree non-leader q0 to the leader by iterating ParentStep. The natural way to define that iteration is a recursive function on Nat:

iter[k \in Nat] == IF k = 0 THEN q0 ELSE ParentStep(iter[k-1])

and then justify it via NaturalsInduction!NatInductiveDef (the standard pattern used in LockHS.tla). In our setting iter captures the outer ASSUME constant q0, and TLAPS's second-order matching against NatInductiveDefHypothesis / NatInductiveDefConclusion / NatInductiveDefType consistently fails — even after trying:

HIDE DEF iter + BY Zenon DEF iter, NatInductiveDefHypothesis (exact LockHS pattern),
module-level PathIter(q0) helper with LEMMA PathIterRec,
inline LAMBDA v,k : ParentStep(v) vs. named Op(v,k) vs. Def(v,k) operator,
Zenon, Isa, and explicit re-stating of NatInductiveDef via local ASSUME/PROVE.

The same proof shape works in LockHS.tla because the recursive function there only captures the constant {1}. With a captured q0 the obligation reduces to a second-order instantiation that none of the SMT/Zenon/Isabelle backends close in this configuration. After reaching this point the rest of HasPath (cardinality argument via FS_PigeonHole for the cycle-free iteration, prefix extraction) was working in the focused range tests, so the cascade of 41/1405 failures was driven by these few core obligations.
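
For intuition about what HasPath is meant to establish, the iteration and the pigeonhole bound can be sketched in Python. This is an informal model, not a translation of the TLAPS proof: `parent` stands in for ParentStep on non-leader processes, `iter_at` mirrors the recursive equation for iter with q0 passed as an explicit parameter, and the process names are made-up illustration data.

```python
def iter_at(q0, k, parent):
    # iter[k] == IF k = 0 THEN q0 ELSE ParentStep(iter[k-1])
    return q0 if k == 0 else parent[iter_at(q0, k - 1, parent)]


def parent_path(q0, leader, parent, n_procs):
    # Collect iter[0], iter[1], ... until the leader is reached.
    path = [q0]
    while path[-1] != leader:
        # Pigeonhole: n_procs non-leader entries cannot all be distinct,
        # since there are only n_procs - 1 non-leader processes.
        if len(path) >= n_procs:
            raise ValueError("parent pointers contain a cycle")
        path.append(parent[path[-1]])
    return path


TREE = {"a": "L", "b": "a", "c": "a"}   # hypothetical overlay tree, leader "L"
CYCLE = {"a": "b", "b": "a", "c": "a"}  # hypothetical ill-formed parent map

path = parent_path("c", "L", TREE, n_procs=4)
print(path)                          # ['c', 'a', 'L']
print(len(set(path)) == len(path))   # the path is simple
print(iter_at("c", 2, TREE))         # L
```

With the cyclic map, `parent_path("c", "L", CYCLE, n_procs=4)` raises `ValueError` after four entries, which is the informal counterpart of the cardinality argument: without a cycle, the iteration must hit the leader within the number of processes.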

Recommendation for finishing this off

Two reasonable next steps if you want to revisit:

1. Use RecursiveFcnOfNat (also in NaturalsInduction.tla) instead of NatInductiveDef — it expresses the existence of the unique recursive function more directly and may avoid the second-order matching path that's tripping the backends here.
2. Lift the iteration out of HasPath by adding it to EWD687a.tla itself (so q0 becomes a parameter of a module-level operator without crossing an ASSUME boundary), and prove the iter[0] / iter[k] equation as a standalone module-level lemma.

I'm sorry I couldn't close it out within this session; the supporting lemmas above are sound and I can re-introduce them as a starting point whenever you'd like to take another pass.

lemmy added the enhancement label Apr 23, 2026

lemmy force-pushed the mku-tlaips branch from 188777f to 45da2e0 Compare April 23, 2026 01:08

lemmy requested a review from muenchnerkindl April 23, 2026 16:45

lemmy wants to merge 2 commits into master from mku-tlaips
