Skip to content

fix: don't cache failed rustc-wrapper output in rustc info cache#17000

Open
coleleavitt wants to merge 1 commit into
rust-lang:masterfrom
coleleavitt:fix/dont-cache-rustc-info-failures
Open

fix: don't cache failed rustc-wrapper output in rustc info cache#17000
coleleavitt wants to merge 1 commit into
rust-lang:masterfrom
coleleavitt:fix/dont-cache-rustc-info-failures

Conversation

@coleleavitt
Copy link
Copy Markdown

Summary

When a rustc_wrapper (e.g., sccache) fails transiently, the failure output gets cached in target/.rustc_info.json and is replayed on every subsequent cargo build. Since the cache fingerprint is derived from binary mtime/size and toolchain metadata — not runtime state — the poisoned cache entry is never invalidated, permanently breaking builds until the user manually deletes the file.

The Bug

Cache::cached_output() unconditionally inserts command output into the cache regardless of exit status. On a cache hit, it replays whatever was stored — including failures. The fingerprint only changes when the rustc/wrapper binary or toolchain changes, so a transient failure (server timeout, permission error, etc.) becomes permanent.

Real-world trigger: sccache server occasionally returns EPERM on startup race conditions. One failure poisons the cache and every build afterward fails with the same error — even after sccache recovers.

The Fix

  • Only cache successful command outputs. Failures are returned as errors but never persisted.
  • On cache load, if a previously-cached failure is found, remove it and retry the command.
  • Skip writing the cache file on Drop when there are no outputs to persist (avoids writing empty/useless cache files).

Testing

  • Updated rustc_info_cache_with_wrappers to verify failures are retried rather than replayed from cache.
  • Added rustc_info_cache_transient_failure_recovery test that simulates a wrapper failing then recovering, verifying the build succeeds on retry without manual intervention.

When a rustc wrapper (e.g., sccache) fails transiently, the failure
output was cached in target/.rustc_info.json and replayed on every
subsequent build. Since the cache fingerprint is based on binary
mtime/size and toolchain metadata — not runtime state — the cached
failure was never invalidated, requiring manual deletion of the file.

This changes cached_output() to only persist successful command results.
Failed outputs are returned as errors but not inserted into the cache,
allowing subsequent builds to retry the command and succeed once the
transient issue resolves.

Also skip writing the cache file on Drop when there are no cached
outputs to persist.
@rustbot rustbot added A-cache-messages Area: caching of compiler messages S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels May 15, 2026
@rustbot
Copy link
Copy Markdown
Collaborator

rustbot commented May 15, 2026

r? @epage

rustbot has assigned @epage.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

Why was this reviewer chosen?

The reviewer was selected based on:

  • Owners of files modified in this PR: @ehuss, @epage, @weihanglo
  • @ehuss, @epage, @weihanglo expanded to ehuss, epage, weihanglo
  • Random selection from ehuss, epage, weihanglo

@epage
Copy link
Copy Markdown
Contributor

epage commented May 15, 2026

This appears to be an attempt to fix #15358, is that right?

If that is the case then this isn't as straightforward as "this shouldn't be done".

Either way, the next step is to bring your use case to that issue or a new issue as our contrib docs ask for us to work on understanding the problem and potential solutions as well as to get buy-in for a solution before moving onto a PR.

@coleleavitt
Copy link
Copy Markdown
Author

You're right — apologies for putting the PR before the design discussion. I've now posted a detailed write-up on #15358 with my concrete use case (sccache transient failures permanently poisoning target/.rustc_info.json) and laid out the two viable approaches (don't cache failures vs. cache-but-don't-use-cached-failures, which is what @tomoverlund proposed in 1a93dc4 above this thread).

Happy to leave this PR open as a reference implementation while #15358 gets triaged, or close it and re-open once/if the issue lands S-accepted — whichever you prefer.

Either way I'll wait for buy-in on direction before pushing anything else here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

A-cache-messages Area: caching of compiler messages S-waiting-on-review Status: Awaiting review from the assignee but also interested parties.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants