Fix/batch inprogress mutations #1431
Open
ruslanen wants to merge 2 commits into
…bles

GetInProgressMutations was called once per table during `create`. Each query against system.mutations enumerates every table on the server, so the per-table WHERE database/table filter does not bound the work: cost is O(total tables) per call * N calls = O(N^2). On installations with many tables this query family dominates `create` wall-clock (observed ~240ms/call across tens of thousands of tables).

Fetch the whole in-progress mutation set once per backup via a new GetInProgressMutationsBatch (single system.mutations scan, same WHERE is_done=0 filter) and look it up per table from an in-memory map keyed by "database.table". Behavior is unchanged (same per-table Mutations in TableMetadata); only the query count changes (N -> 1).

Measured on a 55k-table / 93.6GiB local backup: create ~35min -> ~230s.
Extract groupMutationsByTable (pure, no I/O) from GetInProgressMutationsBatch and add unit tests: two tables (one with two mutations) bucket to the correct database.table with no cross-table leakage, plus an empty-input case.
Batch system.mutations lookup to avoid O(N²) on installations with many tables
Problem
During `create`, in-progress mutations are fetched once per table via `GetInProgressMutations(ctx, database, table)`.
`system.mutations` is a virtual table: every query against it enumerates all tables on the server, so the per-table `WHERE database/table` filter does not bound the work that ClickHouse actually does. With N tables in the backup we issue N such queries, and each one costs ~O(total tables) → overall O(N²).
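To make the quadratic growth concrete, here is a rough cost model. It assumes that every `system.mutations` query touches every table on the server and that all tables on the server are part of the backup (so N equals the total table count); `scanCost` is a hypothetical helper used only for this illustration. It counts enumerated table entries, not wall-clock time.

```go
package main

import "fmt"

// scanCost is a hypothetical model: the number of table entries enumerated
// when `queries` queries each walk all `totalTables` tables on the server.
func scanCost(queries, totalTables int64) int64 {
	return queries * totalTables
}

func main() {
	const tables int64 = 55000 // table count of the measured backup

	perTable := scanCost(tables, tables) // one query per table: N * N
	batched := scanCost(1, tables)       // one query per backup: 1 * N

	fmt.Printf("per-table: %d, batched: %d, ratio: %dx\n",
		perTable, batched, perTable/batched)
	// prints "per-table: 3025000000, batched: 55000, ratio: 55000x"
}
```

The measured end-to-end speedup is much smaller than this ratio, since `create` does plenty of work besides querying mutations.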
On installations with many tables this single query family dominates `create` wall-clock. Observed on a real cluster: ~240 ms per call across tens of thousands of tables, collapsing `create` throughput from ~250 tables/s to ~26 tables/s.
Fix
Fetch the whole in-progress mutation set once per backup with a single scan.
`GetInProgressMutationsBatch` returns a `map["database.table"][]MutationMetadata`; the per-table code path now does an in-memory map lookup instead of a query. Query count drops from N → 1.
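The grouping and lookup described above could look roughly like the sketch below. It is not the code from this PR: the struct fields are assumptions (trimmed, hypothetical stand-ins for the real `inProgressMutationRow` and `MutationMetadata` types), and the rows are hard-coded instead of coming from a `system.mutations` scan.

```go
package main

import "fmt"

// MutationMetadata is a trimmed, hypothetical stand-in for the real type.
type MutationMetadata struct {
	MutationID string
	Command    string
}

// inProgressMutationRow models one row of the server-wide scan.
// The field names are assumptions made for this sketch.
type inProgressMutationRow struct {
	Database   string
	Table      string
	MutationID string
	Command    string
}

// groupMutationsByTable buckets rows by "database.table", preserving the
// order in which rows arrive within each bucket. It performs no I/O.
func groupMutationsByTable(rows []inProgressMutationRow) map[string][]MutationMetadata {
	grouped := make(map[string][]MutationMetadata)
	for _, r := range rows {
		key := r.Database + "." + r.Table
		grouped[key] = append(grouped[key], MutationMetadata{
			MutationID: r.MutationID,
			Command:    r.Command,
		})
	}
	return grouped
}

func main() {
	// Stand-in for the result of a single scan with WHERE is_done=0.
	rows := []inProgressMutationRow{
		{"db1", "events", "mutation_1.txt", "DELETE WHERE id = 1"},
		{"db1", "users", "mutation_7.txt", "UPDATE name = '' WHERE id = 2"},
		{"db1", "events", "mutation_2.txt", "DELETE WHERE id = 3"},
	}
	byTable := groupMutationsByTable(rows)

	// Per-table path: a map lookup replaces the per-table query.
	// A table with no in-progress mutations yields a nil slice.
	fmt.Println(len(byTable["db1.events"]), len(byTable["db1.users"]), len(byTable["db2.orders"]))
	// prints "2 1 0"
}
```

Because a missing key returns a nil slice, the per-table code needs no special case for tables without in-progress mutations.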
Behavior unchanged
- Same `WHERE is_done=0` filter as before.
- Same per-table `Mutations` written into `TableMetadata`.
- Same conditions for fetching mutations (`BackupMutations` enabled and not schema/rbac/configs/named-collections-only).
Tests
Added `TestGroupMutationsByTable` (+ empty-input case) covering the pure `groupMutationsByTable` helper: rows from one server-wide scan are bucketed to the correct `database.table` with no cross-table leakage and in stable order.
Impact (measured)
55k-table / 93.6 GiB local backup: `create` ~35 min → ~230 s.
Files
- `pkg/clickhouse/clickhouse.go`: `GetInProgressMutationsBatch`, `inProgressMutationRow`, `groupMutationsByTable`
- `pkg/backup/create.go`: single batch call, per-table map lookup
- `pkg/clickhouse/clickhouse_test.go`: unit tests