[Perf] MATERIALIZE my_follows CTE in GetTracks/GetPlaylists#792
Merged
raymondjacobson merged 1 commit into main on May 8, 2026
Without MATERIALIZED, Postgres inlines the CTE and re-runs the follows JOIN aggregate_user + ORDER BY follower_count once per SubPlan invocation in the followee_reposts/followee_favorites projections. For users with many follows this dominates the query.

Verified on the production read replica with user 20 (1752 follows) and a 10-track id list:

Before: 368 ms exec, 86,213 shared buffer hits
After: 80 ms exec, 45,298 shared buffer hits (~4.6x)

GetTracks runs ~268M times in pg_stat_statements (the most-called personalization query in the API), so the savings compound.

Adds a regression test covering has_current_user_*, followee_reposts, and followee_favorites since none existed before.
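The mechanism described above can be pictured with a toy model. The sketch below is not the planner and not code from this repository: `followsCTE`, `countInlined`, and `countMaterialized` are hypothetical names, and the assumption that each of the two projections invokes its SubPlan once per track is an illustration only. It counts how often the CTE body would run in each case.

```go
package main

import "fmt"

// followsCTE is a hypothetical stand-in for the my_follows CTE body.
// Each call to build represents one execution of
// "follows JOIN aggregate_user + ORDER BY follower_count".
type followsCTE struct {
	evaluations int
}

func (c *followsCTE) build() []int {
	c.evaluations++
	return []int{101, 102, 103} // made-up followee ids
}

// countInlined models the inlined plan: every SubPlan invocation
// re-runs the CTE body.
func countInlined(tracks, projections int) int {
	cte := &followsCTE{}
	for t := 0; t < tracks; t++ {
		for p := 0; p < projections; p++ {
			_ = cte.build()
		}
	}
	return cte.evaluations
}

// countMaterialized models the MATERIALIZED plan: the body runs once
// and every SubPlan invocation reads the stored rows.
func countMaterialized(tracks, projections int) int {
	cte := &followsCTE{}
	rows := cte.build()
	for t := 0; t < tracks; t++ {
		for p := 0; p < projections; p++ {
			_ = rows // reuse, no re-evaluation
		}
	}
	return cte.evaluations
}

func main() {
	// 10 tracks, 2 projections (followee_reposts, followee_favorites).
	fmt.Println("inlined:", countInlined(10, 2))
	fmt.Println("materialized:", countMaterialized(10, 2))
}
```

The model counts CTE evaluations only. The measured end-to-end gain (~4.6x) is smaller than the evaluation-count ratio because the rest of the query still has to run.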
raymondjacobson added a commit that referenced this pull request on May 8, 2026
## Summary

Lower `LIMIT 5000` to `LIMIT 200` in the `my_follows` CTE used by `GetTracks` and `GetPlaylists`. The CTE feeds two consumers (`followee_reposts`, `followee_favorites`) that each emit at most 3 rows ordered by `follower_count DESC`, so 5000 was almost always materializing more than the LATERAL ever consumed.

## Impact

Verified on the prod read replica with user 20 (1752 follows), 10-track id list, three warm runs each (with #792 / `MATERIALIZED` applied):

| LIMIT | runs (ms) | mean |
|---|---|---|
| 5000 | 86, 97, 92 | 92 ms |
| 200 | 43, 31, 23 | **32 ms** (~2.9×) |

Stacks with #792. `GetTracks` and `GetPlaylists` together represent ~310M calls / 4.5B ms in `pg_stat_statements`.

## Risk

- A user whose only follower-of-X reposter is ranked >200 by `follower_count` will no longer surface in `followee_reposts` / `followee_favorites`. Acceptable trade-off: those low-fanout reposts are already dominated by the top-200 in the rendered top-3 social proof.
- No correctness change for the >99% of users with <200 follows. Existing `TestGetTrackPersonalization` (added in #792) and the rest of the suite cover the personalization shape.

## Test plan

- [x] `go test -count=1 ./api/...` (full suite, all green)
- [x] EXPLAIN ANALYZE on read replica shows ~2.9× warm-cache speedup
- [x] Local server hits `/v1/tracks/trending?user_id=Wem1e` (Phuture, 1752 follows): 400-600ms warm

## Stacks on

- #792

🤖 Generated with [Claude Code](https://claude.com/claude-code)
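The trade-off in the Risk section can be sketched in a few lines. This is a simplified model, not the query itself: `topFollowees` is a hypothetical helper, followee ids are made up and set equal to their rank by `follower_count`, and the reposter sets are invented to show the two cases.

```go
package main

import "fmt"

// topFollowees mimics the shape of followee_reposts / followee_favorites:
// ranked holds the current user's followees ordered by follower_count DESC,
// limit plays the role of the my_follows LIMIT, and k is the number of
// rows surfaced (3 in the description above).
func topFollowees(ranked []int, reposted map[int]bool, limit, k int) []int {
	if limit > len(ranked) {
		limit = len(ranked)
	}
	out := []int{}
	for _, id := range ranked[:limit] {
		if reposted[id] {
			out = append(out, id)
			if len(out) == k {
				break
			}
		}
	}
	return out
}

func main() {
	// 1752 hypothetical followees; id == rank for readability.
	ranked := make([]int, 1752)
	for i := range ranked {
		ranked[i] = i + 1
	}

	// Case 1: several reposters inside the top 200. Same top-3 either way.
	popular := map[int]bool{5: true, 40: true, 150: true, 900: true}
	fmt.Println(topFollowees(ranked, popular, 5000, 3), topFollowees(ranked, popular, 200, 3))

	// Case 2: the only reposter is ranked 250. It disappears under LIMIT 200.
	longTail := map[int]bool{250: true}
	fmt.Println(topFollowees(ranked, longTail, 5000, 3), topFollowees(ranked, longTail, 200, 3))
}
```

Only the second case changes the result, which matches the first Risk bullet: the lower limit matters only when no reposter sits inside the top 200.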
## Summary

- Add `MATERIALIZED` to the `my_follows` CTE in `get_tracks.sql` and `get_playlists.sql`. Without it, Postgres inlines the CTE and re-runs the `follows JOIN aggregate_user + ORDER BY follower_count` once per SubPlan invocation in the `followee_reposts` / `followee_favorites` projections.
- Add a regression test covering `has_current_user_*`, `followee_reposts`, and `followee_favorites` (no prior coverage).

## Why

`GetTracks` is the most-called query in the API by total time: 268M calls / 3.1B ms in `pg_stat_statements`. It runs once per request that returns track data (trending, feed, search results, my-favorites, …), and it's where personalized fields are computed.

## Impact

Verified on the prod read replica with user 20 (1752 follows) and a 10-track id list:

| | exec time | shared buffer hits |
|---|---|---|
| Before | 368 ms | 86,213 |
| After | 80 ms (~4.6x) | 45,298 |

`GetPlaylists` shares the same CTE and gets the same fix.

## Risk

`MATERIALIZED` is a planner directive, so query results are unchanged. The existing test suite passes (`go test -count=1 ./api/...`), and the new `TestGetTrackPersonalization` confirms the personalization fields populate correctly with `?user_id=`.

## Test plan

- [x] `go test -count=1 ./api/...` (full suite, all green)
- [x] `/v1/tracks/trending?user_id=Wem1e` (Phuture, 1752 follows): 500ms warm vs 480ms unauth, basically free personalization
- [x] `TestGetTrackPersonalization` exercises both me-perspective flags and followee-perspective arrays

🤖 Generated with Claude Code
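As a quick sanity check on the figures quoted in this description, the sketch below redoes the arithmetic. The helper names `meanMsPerCall` and `overheadPct` are hypothetical; the inputs are the numbers stated above, and the derived values are illustrative rather than additional measurements.

```go
package main

import "fmt"

// meanMsPerCall divides total execution time by the number of calls.
func meanMsPerCall(totalMs, calls float64) float64 {
	return totalMs / calls
}

// overheadPct reports how much slower `with` is than `without`, in percent.
func overheadPct(with, without float64) float64 {
	return (with - without) / without * 100
}

func main() {
	// 3.1B ms over 268M calls, from pg_stat_statements.
	fmt.Printf("GetTracks mean: %.1f ms/call\n", meanMsPerCall(3.1e9, 268e6))

	// 368 ms before vs 80 ms after, from the read-replica check.
	fmt.Printf("speedup: %.1fx\n", 368.0/80.0)

	// 500 ms personalized vs 480 ms unauthenticated, from the test plan.
	fmt.Printf("personalization overhead: %.1f%%\n", overheadPct(500, 480))
}
```

The mean works out to roughly 11.6 ms per call, the speedup to 4.6x, and the personalization overhead to about 4%, consistent with the "basically free" remark in the test plan.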