Speed up CI test runs (parallelism + caching) #9636
Open
camd wants to merge 2 commits into
Codecov Report
✅ All modified and coverable lines are covered by tests.

@@            Coverage Diff            @@
##           master    #9636     +/-   ##
=========================================
  Coverage   82.97%   82.97%
=========================================
  Files         618      618
  Lines       35798    35798
  Branches     3273     3216     -57
=========================================
  Hits        29705    29705
- Misses       5723     5943    +220
+ Partials      370      150    -220
Summary
Reduce wall-clock time of the CI unit-test jobs by parallelizing execution, giving the heavy jobs more cores, and trimming repeated setup cost. No test files are modified — changes are limited to CI config and test-runner settings.
Measured results
Overall CI wall-clock is gated by the slowest parallel job (`python-tests-general`): 925s → 416s, −55%, ~8.5 min saved per run. All jobs pass. Master numbers are medians over recent runs (low variance).

* the old JS job bundled lint + markdownlint + tests serially.
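As a quick sanity check, the percentage and minutes saved follow directly from the two `python-tests-general` timings quoted above. This is a minimal sketch; the variable names are illustrative and not part of the PR.

```python
# Timings for python-tests-general quoted above, in seconds.
before_s = 925  # median on master
after_s = 416   # this branch

# Absolute and relative wall-clock savings.
saved_s = before_s - after_s
reduction_pct = 100 * saved_s / before_s
saved_min = saved_s / 60

print(f"saved {saved_s}s ({reduction_pct:.0f}% less, about {saved_min:.1f} min per run)")
```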
Cost note: a `large` machine bills ~2× credits/minute, but each job now runs in roughly half the wall-clock, so compared with today's master the 4-vCPU jobs are both faster and slightly cheaper in total credits.

Changes

JavaScript (`javascript-tests`)
- Run on a larger machine (`resource_class: large`) and let Jest's worker pool balance files dynamically (`--maxWorkers=4`). Container sharding was measured and rejected: it partitions files statically by path (18.5s vs 5.1s imbalance when the two heavy perfherder files collided on one shard) and re-pays the ~40s `pnpm install` per container, for a suite whose tests total ~41s. One bigger box wins on both wall-clock and cost, and Jest's in-process pool already balances dynamically.
- A separate `javascript-lint` job, so test results are no longer gated behind `pnpm lint` + markdownlint; the two now run concurrently.

Python (`python-tests-*`)
- Add `pytest-xdist` and run each of the four marker jobs with `-n auto --dist load` (multi-process instead of serial, no extra Docker-stack startups).
- Use `resource_class: large` (4 vCPU) so `-n auto` gets more workers; the default machine class is 2 vCPU.
- `--dist load` (not `loadscope`): profiling the two biggest general-marker files showed `loadscope` pinned the 52s `test_perf_data_adapters.py` to one worker (53.7s on `-n 4`), while `load` spread its 32 independent parametrized tests across workers for a ~2× speedup (26.3s), with identical pass counts. An audit confirmed every module/session-scoped fixture in `tests/` is read-only setup, so per-worker re-creation under `load` is safe.
- `pytest-django` creates a separate test database per worker (`test_treeherder_gw0`, …), so fixtures that assume specific row IDs remain valid.

Caching
- The `setup-pnpm` command caches the pnpm store (keyed on `pnpm-lock.yaml`).
- The `install-tox` command caches the `pip`/`tox` install across the four Python jobs.

Verification
- `pytest -n auto --dist load` pass counts match the serial run, with per-worker databases created as expected.
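To illustrate why the distribution mode matters, here is a toy model of the two strategies. It is not pytest-xdist's actual scheduler: `makespan` is a hypothetical helper that hands each unit of work to the least-loaded worker. It assumes the 52s file splits evenly across its 32 parametrized tests, and it ignores worker startup, fixture setup, and per-worker database creation. The absolute numbers therefore will not match the measured 53.7s and 26.3s; only the direction does.

```python
import heapq


def makespan(units, workers):
    """Toy greedy scheduler: each unit goes to the currently least-loaded worker.

    Returns the finishing time of the busiest worker.
    """
    loads = [0.0] * workers
    heapq.heapify(loads)
    for duration in units:
        # Pop the least-loaded worker, add the unit, push it back.
        heapq.heappush(loads, heapq.heappop(loads) + duration)
    return max(loads)


# Assumption: the 52s file is 32 tests of equal duration.
tests = [52.0 / 32] * 32

# Grouped by file (loadscope-like): the whole file is one indivisible unit.
grouped = makespan([sum(tests)], workers=4)

# Per-test distribution (load-like): every test is its own unit.
spread = makespan(tests, workers=4)

print(f"grouped by file: {grouped:.1f}s, spread per test: {spread:.1f}s")
```

The same reasoning applies to the rejected Jest container sharding: when heavy files are assigned statically, one worker can end up holding most of the work while the others sit idle.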