Experiment with Claude Opus 4.7 and TLAPS. #211

Draft
lemmy wants to merge 2 commits into master from mku-tlaips

lemmy commented Apr 23, 2026

Add specifications/ewd687a/EWD687a_proof.tla. All 408 obligations
discharged by tlapm in ~38s (MacBook Pro M1).

Fully proved:

  THEOREM TypeCorrect            == Spec => []TypeOK
  THEOREM Thm_CountersConsistent == Spec => CountersConsistent

  via the combined inductive invariant Inv1 == TypeOK /\ Counters,
  where Counters relates the four per-edge counters:
    sentUnacked[e] = rcvdUnacked[e] + acks[e] + msgs[e].
  TypeOK alone is not inductive (RcvAck/SendAck decrement counters).
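
A minimal sketch of how the combined invariant described above might be stated and proved in the usual TLAPS style. The quantification over Edges and the OMITTED step proofs are assumptions made for illustration; the actual definitions in EWD687a_proof.tla may differ.

```tla
\* Assumption: Edges is the set of edges declared in EWD687a.tla.
Counters ==
  \A e \in Edges :
    sentUnacked[e] = rcvdUnacked[e] + acks[e] + msgs[e]

Inv1 == TypeOK /\ Counters

\* Standard inductive-invariant skeleton; step proofs elided here.
THEOREM Spec => []Inv1
<1>1. Init => Inv1
  OMITTED
<1>2. Inv1 /\ [Next]_vars => Inv1'
  OMITTED
<1>. QED BY <1>1, <1>2, PTL DEF Spec
```

Step <1>2 is where the strengthening matters: with Counters available as a hypothesis, the decrements in RcvAck/SendAck can be shown to stay within Nat.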

  LEMMA Inv2Inductive == Spec => []Inv2

  Inv2 is the structural overlay-tree strengthening of DT1Inv:
  non-neutral non-leader processes are in the upEdge tree, and
  upEdge[p] is a well-formed incoming edge of p with
  rcvdUnacked[upEdge[p]] >= 1. Inductiveness proved per action.

Left OMITTED (future work):

  LEMMA   DT1FromInv2      == Inv2 => DT1Inv          (chain/acyclicity)
  THEOREM Thm_DT1Inv       == Spec => []DT1Inv
  THEOREM Thm_TreeWithRoot == Spec => TreeWithRoot    (IsTreeWithRoot)
  THEOREM Thm_DT2          == Spec => DT2             (liveness)

Wire the module into CI: add a "proof" entry in manifest.json (picked
up by the data-driven Check proofs step in .github/workflows/CI.yml)
and flip the TLAPS Proof column in README.md.

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Signed-off-by: Markus Alexander Kuppe <github.com@lemmster.de>

Discharge the previously OMITTED LEMMA DT1FromInv2 by introducing an
auxiliary acyclicity invariant on the overlay tree, so the chain
Spec => []DT1Inv goes through end-to-end.

New invariant:

  NoCycle == \A S \in SUBSET (Procs \ {Leader}) :
                ~ (/\ S # {}
                   /\ \A q \in S : InTree(q) /\ upEdge[q][1] \in S)

i.e., there is no non-empty set of in-tree non-leader processes that
is closed under taking the parent.  Equivalently, every in-tree
process can reach the leader by following upEdge.

New lemmas:

  LEMMA NoCycleInit        == Init => NoCycle
  LEMMA NoCycleStep        == Inv2 /\ NoCycle /\ [Next]_vars => NoCycle'
  LEMMA NoCycleInductive   == Spec => []NoCycle

Inductiveness of NoCycle is by case analysis over Next.  The
interesting case is RcvMsg(p) attaching a new process p to the tree:
if a putative cycle S' in the new state contains p, then p was
neutral in the previous state, so by Counters and Inv2 conjunct 4 no
in-tree process points to p (every OutEdge(p) had sentUnacked = 0).
Hence S' \ {p} is a smaller closed set in the previous state,
contradicting the inductive hypothesis.  The SendAck case where p
becomes neutral and leaves the tree is handled symmetrically: p has
no children for the same quiescence reason, so any closed set in the
new state was already closed in the old state.
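
As a rough sketch (not copied from the proof file), the three lemmas listed above would typically be combined as follows. The step layout and the use of DEF Spec are assumptions about how the module is organized.

```tla
LEMMA NoCycleInductive == Spec => []NoCycle
<1>1. Init => NoCycle
  BY NoCycleInit
<1>2. Inv2 /\ NoCycle /\ [Next]_vars => NoCycle'
  BY NoCycleStep
<1>3. Spec => []Inv2
  BY Inv2Inductive
<1>. QED BY <1>1, <1>2, <1>3, PTL DEF Spec
```

Because NoCycleStep assumes Inv2 in the pre-state, the already proved []Inv2 has to be supplied to PTL alongside the initial and step facts.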

Discharge of the chain step:

  LEMMA DT1FromInv2 == Inv2 /\ NoCycle => DT1Inv

Assume neutral(Leader) and a non-leader p0 with ~neutral(p0).  The
set S == {q \in Procs \ {Leader} : ~neutral(q)} is non-empty, and by
Inv2 conjuncts 3-4 plus Counters it is closed under the parent
function (sentUnacked[upEdge[q]] >= 1 forces the parent to be
non-neutral, and neutral(Leader) keeps it out of {Leader}).  This
contradicts NoCycle, so all non-leader processes are neutral.

Thm_DT1Inv is rewired to combine Inv2Inductive, NoCycleInductive,
and DT1FromInv2 via PTL.
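
A sketch of what the rewired theorem might look like, assuming the lemma names given in this message; the exact step structure in the file may differ.

```tla
THEOREM Thm_DT1Inv == Spec => []DT1Inv
<1>1. Spec => []Inv2
  BY Inv2Inductive
<1>2. Spec => []NoCycle
  BY NoCycleInductive
<1>3. Inv2 /\ NoCycle => DT1Inv
  BY DT1FromInv2
<1>. QED BY <1>1, <1>2, <1>3, PTL
```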

Drafted by Claude Opus 4.7.  All 642 TLAPS obligations discharged
in ~30s.  Only Thm_TreeWithRoot and Thm_DT2 remain OMITTED.

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Signed-off-by: Markus Alexander Kuppe <github.com@lemmster.de>

lemmy commented Apr 23, 2026

Claude gave up on proving Thm_TreeWithRoot:

Prove Thm_TreeWithRoot


Cursor

Given the substantial time spent and 41 obligations still failing in my proof attempt, let me take a more pragmatic path. I'll revert Thm_TreeWithRoot back to OMITTED and explain the situation honestly.

Let me try one more focused approach to fix the recursive function definition issue.

Given the persistent difficulty getting NatInductiveDef to work with a locally-defined recursive function and the cascade of 41 failures, let me revert Thm_TreeWithRoot back to OMITTED and explain the situation.

I've made substantial progress restructuring the proof but have hit a fundamental TLAPS limitation: NatInductiveDef cannot be applied to a recursive function iter whose definition captures the outer ASSUME variable q0 — even with the HIDE DEF + Zenon pattern that works in LockHS.tla (where the analogous function captures only constants like {1}).

Rather than burn more cycles, let me revert Thm_TreeWithRoot to its original OMITTED state but preserve the proof attempt as commented-out reference for future work, so we don't lose what's been built up.

The file has been reverted to the original state. Here is an honest summary of where things stand.

Outcome

I was not able to prove Thm_TreeWithRoot in this session. After substantial work, I reverted specifications/ewd687a/EWD687a_proof.tla back to its committed state (with Thm_TreeWithRoot as OMITTED).

What I built

The proof structure I developed proved most of the supporting machinery — what the theorem morally needs:

  • LEMMA ParentInTree — strengthens Inv2 to give parent properties for in-tree non-leaders.
  • LEMMA TWR_NodesInProcs, TWR_EInEdges — graph-shape lemmas about the overlay tree's nodes/edges.
  • LEMMA TreeWithRootFromInvs — the main reduction Inv2 /\ NoCycle => TreeWithRootInv, including subproofs for IsDirectedGraph, edge uniqueness, the Transpose calculation, and AreConnectedIn from the Graphs community module.
  • The final Thm_TreeWithRoot would compose Inv2Inductive, NoCycleInductive, TreeWithRootFromInvs, and an INSTANCE Graphs unfolding step under [].

Where it broke

The proof depends on a small LEMMA HasPath that constructs a simple path from any in-tree non-leader q0 to the leader by iterating ParentStep. The natural way to define that iteration is a recursive function on Nat:

iter[k \in Nat] == IF k = 0 THEN q0 ELSE ParentStep(iter[k-1])

and then justify it via NaturalsInduction!NatInductiveDef (the standard pattern used in LockHS.tla). In our setting iter captures the outer ASSUME constant q0, and TLAPS's second-order matching against NatInductiveDefHypothesis / NatInductiveDefConclusion / NatInductiveDefType consistently fails — even after trying:

  • HIDE DEF iter + BY Zenon DEF iter, NatInductiveDefHypothesis (exact LockHS pattern),
  • module-level PathIter(q0) helper with LEMMA PathIterRec,
  • inline LAMBDA v,k : ParentStep(v) vs. named Op(v,k) vs. Def(v,k) operator,
  • Zenon, Isa, and explicit re-stating of NatInductiveDef via local ASSUME/PROVE.

The same proof shape works in LockHS.tla because the recursive function there only captures the constant {1}. With a captured q0 the obligation reduces to a second-order instantiation that none of the SMT/Zenon/Isabelle backends close in this configuration. After reaching this point the rest of HasPath (cardinality argument via FS_PigeonHole for the cycle-free iteration, prefix extraction) was working in the focused range tests, so the cascade of 41/1405 failures was driven by these few core obligations.

Recommendation for finishing this off

Two reasonable next steps if you want to revisit:

  1. Use RecursiveFcnOfNat (also in NaturalsInduction.tla) instead of NatInductiveDef — it expresses the existence of the unique recursive function more directly and may avoid the second-order matching path that's tripping the backends here.
  2. Lift the iteration out of HasPath by adding it to EWD687a.tla itself (so q0 becomes a parameter of a module-level operator without crossing an ASSUME boundary), and prove the iter[0] / iter[k] equation as a standalone module-level lemma.
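
For illustration only, the second suggestion could take roughly the following shape. ParentStep and PathIter are names from the transcript, but their bodies below are assumptions (the parent of q is taken to be upEdge[q][1], as in NoCycle), and the lemma is left OMITTED because the transcript reports that the NatInductiveDef route did not close for it.

```tla
\* Hypothetical module-level definitions; the bodies are assumptions.
ParentStep(q) == IF q = Leader THEN Leader ELSE upEdge[q][1]

\* Same CHOOSE shape that NatInductiveDefHypothesis expects.
PathIter(q0) ==
  CHOOSE f : f = [k \in Nat |-> IF k = 0 THEN q0 ELSE ParentStep(f[k-1])]

\* The two equations HasPath would rely on, stated once at module level.
LEMMA PathIterRec ==
  ASSUME NEW q0
  PROVE  /\ PathIter(q0)[0] = q0
         /\ \A k \in Nat \ {0} :
              PathIter(q0)[k] = ParentStep(PathIter(q0)[k-1])
OMITTED
```

Stating the equations this way would let HasPath cite PathIterRec directly instead of unfolding a recursive definition inside an ASSUME context.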

I'm sorry I couldn't close it out within this session; the supporting lemmas above are sound and I can re-introduce them as a starting point whenever you'd like to take another pass.
