Summary
Why
Cloud Agent currently treats a completed assistant message.updated event as successful user-turn completion, but Kilo emits those events for intermediate tool-loop steps. That can pin kg agent result, callbacks, and durable session state to an early tool-call response while the agent continues working. The success boundary needs to live with the wrapper that observes Kilo's run lifecycle and post-processing, while the Durable Object remains a durable coordinator rather than reconstructing prompt lifecycle from transient events.
What was done
Keep successful assistant updates as transcript/activity events and reserve prompt failure for explicit assistant errors.
Have the wrapper seal the exact admitted batch after three seconds of stable root idle, block new admission while finalizing, run post-processing, and emit one authoritative complete with sealed message IDs.
Cancel stable-idle finalization only when the root session explicitly resumes via session.turn.open or busy session.status; trailing post-idle events no longer prevent completion.
Persist one run-level finalizing hold in the Durable Object so follow-up work stays pending without retry churn, then settle exact sealed membership and drain held work under a fresh run.
Preserve rolling-deployment compatibility by accepting legacy complete payloads without membership for legacy runs while wrapper version 2.3.0 moves fresh work onto the sealed-batch protocol.
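The sealing rule described above can be sketched as a small state machine. This is a minimal illustration, not the wrapper's actual code: the class name `StableIdleSealer`, the `root.idle` event name, the `busy` flag, and the explicit `nowMs` clock argument are all assumptions made to keep the example self-contained and deterministic.

```typescript
// Hypothetical event shapes; "root.idle" and the `busy` flag are assumed names.
type RootSessionEvent =
  | { type: "root.idle" }
  | { type: "session.turn.open" }
  | { type: "session.status"; busy: boolean }
  | { type: "message.updated"; completed: boolean };

const STABLE_IDLE_MS = 3000; // "three seconds of stable root idle"

class StableIdleSealer {
  private admitted: string[] = [];
  private idleSinceMs: number | null = null;
  private sealed: string[] | null = null;

  /** Admit a message into the current batch; refused once the batch is sealed. */
  admit(messageId: string): boolean {
    if (this.sealed !== null) return false;
    this.admitted.push(messageId);
    return true;
  }

  /** Feed a root-session event together with the time it was observed. */
  observe(ev: RootSessionEvent, nowMs: number): void {
    if (this.sealed !== null) return;
    if (ev.type === "root.idle") {
      // Start the idle candidate only if one is not already running.
      if (this.idleSinceMs === null) this.idleSinceMs = nowMs;
      return;
    }
    // Only an explicit resume cancels the candidate. Completed assistant
    // updates and other trailing events are treated as activity only.
    const resumed =
      ev.type === "session.turn.open" ||
      (ev.type === "session.status" && ev.busy);
    if (resumed) this.idleSinceMs = null;
  }

  /** Seal the exact admitted batch once idle has been stable long enough. */
  trySeal(nowMs: number): string[] | null {
    if (this.sealed !== null) return null; // seal at most once
    if (this.idleSinceMs === null) return null;
    if (nowMs - this.idleSinceMs < STABLE_IDLE_MS) return null;
    this.sealed = this.admitted.slice();
    return this.sealed.slice();
  }
}
```

In this sketch a completed `message.updated` never seals or cancels anything, which mirrors the intent that assistant step boundaries are transcript activity rather than a success signal.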
High-level architecture
sequenceDiagram
participant DO as Session Durable Object
participant Runtime as Agent Runtime
participant Wrapper
participant Kilo
participant Ingest as Ingest Worker
DO->>Runtime: Deliver fenced pending message
Runtime->>Wrapper: Submit prompt or command
Wrapper->>Kilo: Admit work
Kilo-->>Wrapper: Assistant steps and root idle
Note over Wrapper: Stable root idle seals exact admitted membership
Wrapper->>Ingest: wrapper_finalizing(wrapperRunId)
Ingest->>DO: Hold new delivery for current run
Wrapper->>Wrapper: Run post-processing
Wrapper->>Ingest: complete(sealed messageIds)
Ingest->>DO: Settle exact batch and retire run
DO->>Runtime: Drain held work under a fresh run
Architecture decision
Decision: Make fenced wrapper complete with exact sealed batch membership the only normal success boundary, backed by one Durable Object run-level finalizing hold.
Context: Completed assistant messages are model-step boundaries, root idle can be transient between queued prompts, and the Durable Object has no bounded authoritative Kilo query that can reconstruct a prompt's final success after the wrapper is gone.
Rationale: The wrapper directly observes Kilo lifecycle and owns post-processing, so it can seal the admitted run batch without guessing. Exact membership lets the Durable Object settle durable message state while staying focused on coordination, fencing, pending delivery, and failure supervision.
Alternatives considered:
Terminalize on completed assistant updates. This is the source of premature tool-loop completion because one user turn can produce multiple completed assistant steps.
Infer success from raw root idle in the Durable Object. Root idle is transient and would duplicate wrapper lifecycle logic without proving exact batch membership.
Add complete acknowledgements or a terminal outbox. This would add protocol and recovery complexity beyond the current failure-safe requirement; lost complete remains a conservative wrapper failure through existing supervision.
Consequences: Follow-up messages arriving during finalization wait durably for a fresh wrapper run, and rolling deployment retains a narrow legacy-complete fallback. The protocol intentionally favors conservative failure over inferring success when an authoritative complete is lost.
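To make the coordinator side of this decision concrete, here is a rough in-memory model of the run-level finalizing hold and exact-membership settlement. It is a sketch under stated assumptions: `SessionCoordinatorSketch`, its method names other than `markWrapperFinalizing`, the `run-N` ID scheme, and the three message states are hypothetical, and the real Durable Object persists this state rather than keeping it in fields.

```typescript
type MessageState = "pending" | "delivered" | "completed";

interface CompleteOutcome {
  settled: string[];
  drained: string[];
  freshRunId: string;
}

class SessionCoordinatorSketch {
  private states: Record<string, MessageState> = {};
  private order: string[] = [];
  private runSeq = 1;
  private currentRunId = "run-1";
  private finalizingRunId: string | null = null;

  getCurrentRunId(): string {
    return this.currentRunId;
  }

  stateOf(id: string): MessageState | undefined {
    return this.states[id];
  }

  enqueue(id: string): void {
    this.states[id] = "pending";
    this.order.push(id);
  }

  /** Deliver pending messages in FIFO order unless the current run is finalizing. */
  deliverPending(): string[] {
    if (this.finalizingRunId === this.currentRunId) return []; // held, no retry churn
    const delivered = this.order.filter((id) => this.states[id] === "pending");
    for (const id of delivered) this.states[id] = "delivered";
    return delivered;
  }

  /** Record the hold only if the signal comes from the current run. */
  markWrapperFinalizing(runId: string): boolean {
    if (runId !== this.currentRunId) return false;
    this.finalizingRunId = runId;
    return true;
  }

  /** Fenced complete: settle exactly the sealed IDs, retire the run, drain held work. */
  complete(runId: string, sealedIds: string[]): CompleteOutcome | null {
    if (runId !== this.currentRunId) return null; // stale run is ignored
    const settled = sealedIds.filter((id) => this.states[id] === "delivered");
    for (const id of settled) this.states[id] = "completed";
    this.runSeq += 1;
    this.currentRunId = "run-" + this.runSeq;
    this.finalizingRunId = null;
    return { settled, drained: this.deliverPending(), freshRunId: this.currentRunId };
  }
}
```

A follow-up message enqueued after `markWrapperFinalizing` stays `pending` until `complete` retires the run, at which point it is delivered under the fresh run ID.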
Verification
Ran a fake-LLM cold turn and observed terminal complete.
Queued two follow-up turns while busy and observed FIFO completion across the sealed batch boundary.
Ran fake-LLM chunked-streaming and empty-response paths and observed terminal complete.
Ran fake-LLM waiter cleanup and observed zero leaked waiters and live responses.
Visual Changes
N/A
Reviewer Notes
Review the wrapper stable-idle/admission fence together with Durable Object exact-membership settlement and run-level finalizing hold; these form one lifecycle contract.
Legacy complete without messageIds is accepted only for rolling compatibility; fresh work is pushed to wrapper protocol 2.3.0.
The default fake-LLM smoke matrix excludes callback scenarios because callbackTarget uses the internal legacy API; focused callback scenarios remain available manually.
The focused external-kill fake-LLM scenario still fails independently with no reconnect/terminal event and is not addressed by this PR.
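The legacy-compatibility rule in the second note can be pictured as a small decision function. The sketch below is illustrative only: `classifyComplete`, the `sealedBatch` flag on the run record, and the decision shapes are hypothetical names, and how the real code decides whether a run is legacy is not shown in this description.

```typescript
interface WrapperRunInfo {
  wrapperRunId: string;
  sealedBatch: boolean; // assumed flag: true for runs started on the sealed-batch protocol
}

interface CompletePayloadSketch {
  wrapperRunId: string;
  messageIds?: string[]; // absent on legacy complete payloads
}

type CompleteDecision =
  | { kind: "settle-sealed"; messageIds: string[] }
  | { kind: "settle-legacy" }
  | { kind: "reject"; reason: string };

function classifyComplete(
  run: WrapperRunInfo,
  payload: CompletePayloadSketch
): CompleteDecision {
  // Fence: a complete from any other wrapper run is never applied.
  if (payload.wrapperRunId !== run.wrapperRunId) {
    return { kind: "reject", reason: "stale wrapper run" };
  }
  // Exact membership present: settle exactly those messages.
  if (payload.messageIds !== undefined) {
    return { kind: "settle-sealed", messageIds: payload.messageIds.slice() };
  }
  // Membership missing: tolerated only for legacy runs during a rolling deployment.
  if (!run.sealedBatch) {
    return { kind: "settle-legacy" };
  }
  return { kind: "reject", reason: "sealed-batch run requires messageIds" };
}
```

Rejecting a membership-free complete on a sealed-batch run keeps the fallback narrow, in line with the preference for conservative failure over inferred success.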
Incremental review of the new commit finds no bugs, security issues, or logic errors across the full sealed-batch protocol — wrapper state machine, DO coordination, ingest routing, and integration tests all look correct.
Key Design Decisions Verified (informational)
isDeliveryHeld vs isWrapperDeliveryHeld asymmetry (CloudAgentSession.ts:605): The isDeliveryHeld callback passed to createSessionMessageQueue only checks isWrapperRunFinalizing, not lease.state. Physical cleanup blocking is handled by the pre-existing getDeliveryBlock path. isDeliveryHeld is exclusively for the WRAPPER_FINALIZING hold-without-retry semantics. Alarm scheduling correctly uses the full isWrapperDeliveryHeld (both checks). Confirmed by integration test "holds pending delivery through physical cleanup and drains after confirmed absence".
_admissionsBlocked lifecycle (wrapper/src/state.ts): Only cleared in bindSession() when !this.session. clearAllMessages() does not reset it. bindSessionContext detects isFreshRunAfterFinalization and allows a fresh bind which resets it. Validated by "accepts a fresh wrapper run after finalizing clears its session".
stableIdleTimer race (wrapper/src/lifecycle.ts): trySealIdleBatch sets stableIdleTimer = null before any checks, preventing double-arm. stop() calls clearStableIdleCandidate() — no leaks.
SQL query removal of time.completed constraint (session/queries/events.ts): Intentional. Settlement authority moves to settleSealedBatch which uses the sealed messageIds list. Tool-loop fixture tests in tool-loop-terminalization.test.ts validate end-to-end correctness.
settleSealedBatch null return: Only when getMetadata() is null. onTerminalEvent handles this conservatively by stopping and failing all accepted messages.
markWrapperFinalizing read+write: Safe under DO single-threaded execution guarantee, with run-ID guard.
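The timer-handling point in the `stableIdleTimer` note can be illustrated with a small sketch. The method names `trySealIdleBatch`, `clearStableIdleCandidate`, and `stop` come from the note above; everything else (the `TimerApi` interface, `armStableIdleCandidate`, the `canSeal` callback, and the seal counter) is hypothetical scaffolding, with timers injected so the behavior can be exercised without real delays.

```typescript
// Injected timer interface (an assumption) so the sketch does not depend on a runtime.
interface TimerApi {
  set(callback: () => void, delayMs: number): number;
  clear(handle: number): void;
}

class IdleTimerSketch {
  private timers: TimerApi;
  private canSeal: () => boolean;
  private stableIdleTimer: number | null = null;
  private sealCount = 0;

  constructor(timers: TimerApi, canSeal: () => boolean) {
    this.timers = timers;
    this.canSeal = canSeal;
  }

  getSealCount(): number {
    return this.sealCount;
  }

  /** Arm the three-second candidate; a second call while armed is a no-op. */
  armStableIdleCandidate(): void {
    if (this.stableIdleTimer !== null) return;
    this.stableIdleTimer = this.timers.set(() => this.trySealIdleBatch(), 3000);
  }

  /** Cancel a pending candidate, if any. */
  clearStableIdleCandidate(): void {
    if (this.stableIdleTimer === null) return;
    this.timers.clear(this.stableIdleTimer);
    this.stableIdleTimer = null;
  }

  /** Timer callback: drop the handle first so an early return cannot leave a stale handle. */
  private trySealIdleBatch(): void {
    this.stableIdleTimer = null;
    if (!this.canSeal()) return;
    this.sealCount += 1;
  }

  /** Stopping clears any pending candidate so no timer outlives the lifecycle. */
  stop(): void {
    this.clearStableIdleCandidate();
  }
}
```

Because the handle is cleared before any checks, a callback that declines to seal still leaves the object ready to be re-armed on the next idle signal.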
eshurakov changed the title from "fix(cloud-agent-next): settle tool loops on sealed wrapper batches" to "fix(cloud-agent-next): wait for tool loops to finish" on Jun 4, 2026.