ai: contain Kiro ACP stream failures #217
Open
anders-heimer wants to merge 5 commits into
Capture malformed ACP stdout separately from stderr so errors can report both diagnostic streams without leaking secrets. Include redacted JSON-RPC error data in Kiro ACP failures to expose nested provider error details.

Assisted-by: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Anders Heimer <anders.heimer@est.tech>
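To illustrate the redaction idea in this commit, here is a minimal sketch of masking secret-looking fields in a JSON-RPC error object before it reaches an error message. It is not the code from this PR: the key fragments in SENSITIVE_KEY_PARTS and the names redact and describe_rpc_error are assumptions chosen for the example.

```python
import json

# Hypothetical key fragments treated as secret. The actual redaction
# rules used by the PR are not shown in the description.
SENSITIVE_KEY_PARTS = ("token", "secret", "password", "authorization", "api_key")


def redact(value):
    """Return a copy of a JSON-like value with secret-looking fields masked."""
    if isinstance(value, dict):
        return {
            key: "[REDACTED]"
            if any(part in key.lower() for part in SENSITIVE_KEY_PARTS)
            else redact(inner)
            for key, inner in value.items()
        }
    if isinstance(value, list):
        return [redact(inner) for inner in value]
    return value


def describe_rpc_error(error):
    """Format a JSON-RPC error object, including its redacted nested data."""
    data = json.dumps(redact(error.get("data")), sort_keys=True)
    return f"JSON-RPC error {error.get('code')}: {error.get('message')} data={data}"
```

With this shape, a nested provider detail such as a failure reason stays visible in the diagnostic while a field like auth_token is replaced by a placeholder.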
Kiro ACP wraps response-stream and provider failures in JSON-RPC -32603 errors with details in error.data. Classify known transient, rate-limit, provider, and permanent markers so the retry layer can back off appropriately. Treat kiro-cli timeouts as transient typed errors.

Assisted-by: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Anders Heimer <anders.heimer@est.tech>
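A rough sketch of marker-based classification follows, assuming the error arrives as a parsed JSON-RPC error object. The marker strings, the ErrorClass names, and the classify function are hypothetical; the markers the PR actually matches are not listed in the description. The one ordering detail taken from this PR is that rate-limit markers are checked before generic stream markers, so throttling gets the slower backoff.

```python
import json
from enum import Enum


class ErrorClass(Enum):
    PERMANENT = "permanent"
    RATE_LIMIT = "rate_limit"
    PROVIDER = "provider"
    TRANSIENT = "transient"
    UNKNOWN = "unknown"


# Hypothetical marker substrings, checked in order. Rate-limit markers
# come before the generic stream markers on purpose.
MARKERS = (
    (ErrorClass.PERMANENT, ("unauthorized", "invalid api key")),
    (ErrorClass.RATE_LIMIT, ("throttl", "rate limit", "too many requests")),
    (ErrorClass.PROVIDER, ("service unavailable", "overloaded")),
    (ErrorClass.TRANSIENT, ("response stream", "connection reset", "timed out")),
)


def classify(error):
    """Classify a JSON-RPC error object by scanning message and data for markers."""
    haystack = json.dumps([error.get("message"), error.get("data")]).lower()
    for error_class, needles in MARKERS:
        if any(needle in haystack for needle in needles):
            return error_class
    return ErrorClass.UNKNOWN
```

An error whose data mentions both a failed response stream and throttling is classified as RATE_LIMIT under this ordering, not TRANSIENT.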
Track whether ACP session updates indicate possible tool, command, or file mutations before an error is returned. Retryable-looking failures after such updates are kept fatal so the caller does not replay potentially non-idempotent work.

Assisted-by: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Anders Heimer <anders.heimer@est.tech>
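The retry gate described here can be sketched as a small per-turn state object. This is an illustration only: the update kinds in MUTATING_UPDATE_KINDS, the sessionUpdate field name, and the TurnState and may_retry names are assumptions, and the PR may inspect different fields.

```python
# Hypothetical update kinds that suggest a tool, command, or file mutation.
MUTATING_UPDATE_KINDS = frozenset({"tool_call", "tool_call_update"})


class TurnState:
    """Tracks whether the current turn may already have had side effects."""

    def __init__(self):
        self.saw_possible_side_effect = False

    def observe(self, update):
        # "sessionUpdate" is assumed to carry the kind of the ACP update.
        if update.get("sessionUpdate") in MUTATING_UPDATE_KINDS:
            self.saw_possible_side_effect = True


def may_retry(state, error_is_retryable):
    """Allow a retry only if the error is retryable and nothing may have mutated."""
    return error_is_retryable and not state.saw_possible_side_effect
```

The flag is sticky for the turn: once a possibly mutating update is seen, later retryable-looking errors are treated as fatal so the caller never replays non-idempotent work.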
Carry budget-exceeded terminal markers through the review-bin AI stdio protocol so true budget failures keep fail-fast behavior after crossing the process boundary. This is shared protocol plumbing used by the Kiro ACP containment path.

Signed-off-by: Anders Heimer <anders.heimer@est.tech>
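As a sketch of what carrying a terminal marker across a process boundary can look like, the example below serializes an error as one JSON line with a boolean flag and rebuilds a typed error on the receiving side. The wire format, the terminal field, and the class names are hypothetical; the real review-bin stdio protocol is not shown in this PR description.

```python
import json


class RemoteError(Exception):
    """Ordinary remote failure that the caller may retry."""


class TerminalError(Exception):
    """Fail-fast failure, such as an exceeded budget, that must not be retried."""


def encode_error(message, terminal):
    """Serialize an error as one JSON line, keeping the terminal marker."""
    return json.dumps({"type": "error", "message": message, "terminal": terminal})


def decode_error(line):
    """Rebuild a typed error from a JSON line. A missing marker means retryable."""
    payload = json.loads(line)
    if payload.get("terminal", False):
        return TerminalError(payload["message"])
    return RemoteError(payload["message"])
```

Without the flag, the receiving process would see only a generic remote error and could retry a turn that had already run out of budget.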
Bound Kiro ACP stream failures with Kiro-only per-turn retry budgets, a process-local circuit breaker, and ACP-line idle telemetry. Send session/cancel on prompt failures and keep retries blocked after side-effect-looking ACP updates.

Rate-limit markers intentionally take precedence over generic stream failure markers so throttling receives the slower quota backoff instead of the short stream retry delay.

Signed-off-by: Anders Heimer <anders.heimer@est.tech>
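For readers unfamiliar with the pattern, a process-local circuit breaker can be sketched as below. The thresholds, the cooldown behavior, and the CircuitBreaker class are assumptions for illustration, not the values or structure used in this PR. The clock is injectable so the behavior can be checked without sleeping.

```python
import time


class CircuitBreaker:
    """Stops new attempts after repeated failures until a cooldown passes."""

    def __init__(self, failure_threshold=3, cooldown_s=60.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.cooldown_s = cooldown_s
        self.clock = clock
        self.consecutive_failures = 0
        self.opened_at = None  # None means the breaker is closed

    def allow(self):
        """Return True if a new attempt may start."""
        if self.opened_at is None:
            return True
        if self.clock() - self.opened_at >= self.cooldown_s:
            # Cooldown elapsed: close the breaker and start counting afresh.
            self.opened_at = None
            self.consecutive_failures = 0
            return True
        return False

    def record_success(self):
        self.consecutive_failures = 0
        self.opened_at = None

    def record_failure(self):
        self.consecutive_failures += 1
        if self.consecutive_failures >= self.failure_threshold:
            self.opened_at = self.clock()
```

Because the state lives in a plain object, it is shared only within one process, which matches the "process-local" wording above. A per-turn retry budget would sit in front of this check and cap attempts for a single turn.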
Summary
Kiro ACP session/prompt is a long-lived stream. When that stream fails, Sashiko currently retries the same expensive turn without enough provider-specific containment.
This PR adds Kiro-specific handling for those failures, including idle watchdogs and a process-local circuit breaker.
Patch Layout
ai: improve kiro acp diagnostics
Capture malformed ACP stdout separately from stderr and surface redacted JSON-RPC error.data.

ai: classify transient kiro acp errors
Classify known Kiro stream failures, throttling, provider availability failures, and permanent auth/configuration errors.

ai: block kiro retries after side effects
Treat retryable-looking failures as fatal once ACP updates suggest a possible tool, command, or file mutation.

ai: preserve terminal budget errors over stdio
Carry fail-fast budget errors through the review-bin AI stdio protocol so subprocess boundaries do not turn them into ordinary retryable remote errors.

ai: contain kiro acp stream failures
Add Kiro-only retry budgets, same-error streak caps, turn wall-clock limits, ACP idle telemetry, prompt cancellation, request-id extraction, and a process-local circuit breaker.
Notes
The only shared runtime plumbing is the review-bin terminal error marker.
It is generic, but included here because Kiro containment needs terminal
provider failures to survive stdio serialization.
This PR does not include follow-up handling for empty successful Kiro
responses, Kiro response parser quirks, or broader CLI diagnostic logging.