Improve clickhouse query in metric alert evaluator#8131
Conversation
There was a problem hiding this comment.
Code Review
This pull request introduces target-keyed materialized views (operations_minutely_by_target and operations_hourly_by_target) to optimize unfiltered metric-alert queries by avoiding full table scans, and updates the evaluator to route queries to these rollups when no filter conditions are present. Feedback on these changes highlights a critical issue in the migration: using SummingMergeTree with AggregateFunction columns will corrupt the average and quantile metrics during merges; the engine should be changed to AggregatingMergeTree with SimpleAggregateFunction for the sum columns. Additionally, a defensive check is recommended in the evaluator to ensure the rollup table is only selected when there are indeed no filter conditions.
Important
The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.
🚀 Snapshot Release (
|
| Package | Version | Info |
|---|---|---|
@graphql-hive/apollo |
0.48.1-alpha-20260610132951-a65882318f57792c812f186e092d051b1fd46862 |
npm ↗︎ unpkg ↗︎ |
@graphql-hive/cli |
0.60.1-alpha-20260610132951-a65882318f57792c812f186e092d051b1fd46862 |
npm ↗︎ unpkg ↗︎ |
@graphql-hive/core |
0.21.1-alpha-20260610132951-a65882318f57792c812f186e092d051b1fd46862 |
npm ↗︎ unpkg ↗︎ |
@graphql-hive/envelop |
0.40.6-alpha-20260610132951-a65882318f57792c812f186e092d051b1fd46862 |
npm ↗︎ unpkg ↗︎ |
@graphql-hive/yoga |
0.48.1-alpha-20260610132951-a65882318f57792c812f186e092d051b1fd46862 |
npm ↗︎ unpkg ↗︎ |
hive |
11.3.0-alpha-20260610132951-a65882318f57792c812f186e092d051b1fd46862 |
npm ↗︎ unpkg ↗︎ |
|
🐋 This PR was built and pushed to the following Docker images: Targets: Platforms: Image Tag: |
jdolle
left a comment
There was a problem hiding this comment.
Let's get another set of eyes on this but it's looking good to me.
n1ru4l
left a comment
There was a problem hiding this comment.
Looks good with me. Added two comments
This PR speeds up the metric-alert evaluator's ClickHouse query for unfiltered rules by reading from new target-keyed rollups instead of the full per-hash/client tables.
The evaluator's window query sums across all operations for a target over a time range. The existing
operations_minutely/operations_hourlytables are ordered (target, hash, client_name, client_version, timestamp) (timestamp is last) andoperations_minutelyisPARTITION BY tuple(), so that query can't prune by time and ends up scanning the target's entire slice (up to 24h of per-hash/client rows for a 5-minute alert).This adds two tables fed by materialized views,
operations_minutely_by_targetandoperations_hourly_by_target, that re-aggregate default.operations keyed on (target, timestamp) only, partitioned by hour (minutely) / by day (hourly), ordered (target, timestamp).Routing is keyed on
filterConditions.length === 0...rules with a saved filter keep using the existing tables, where the hash/client predicate already exploits the sort-key prefix and the rollups (which drop those dimensions) wouldn't be usable.