Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

frontend: More versatile configuration for subquery spin-off #10696

Closed

Conversation

julienduchesne
Copy link
Member

@julienduchesne julienduchesne commented Feb 19, 2025

Expands on #10460

Testing has shown nice results for the feature. In our main Mimir cell, the overall runtime of all instant queries containing the "subquery pattern" ([<range>:(<step>)?]) has gone down by half, and the feature allows for queries further into the past by sharding by time

Currently, this is only configurable with -query-frontend.spin-off-instant-subqueries-to-url + a required per-tenant instant_queries_with_subquery_spin_off opt-in (list of regexp)

This PR makes the configs more consistent and allows for a global enable. Here are the new configs:

  • -query-frontend.instant-subquery-spin-off.enabled. False by default, set to true to enable for all tenants. This is overridden with instant_subquery_spin_off_enabled on tenants. There is an extra list of regexp that can be given per-tenant as well: instant_subquery_spin_off_enabled_regexp. This can be useful to roll out the feature one query at a time
  • -query-frontend.instant-subquery-spin-off.min-range-duration. Defaults to 12h. This is overridden with instant_subquery_spin_off_min_range_duration on tenants
  • -query-frontend.instant-subquery-spin-off.url. The frontend config part of the feature. Required

Checklist

  • Tests updated.
  • Documentation added.
  • CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX].
  • about-versioning.md updated with experimental features.

@julienduchesne julienduchesne force-pushed the julienduchesne/subquery-spin-off-more-flags branch from 4c61c6a to f5c92e5 Compare February 19, 2025 18:28
Copy link
Contributor

github-actions bot commented Feb 19, 2025

@julienduchesne julienduchesne force-pushed the julienduchesne/subquery-spin-off-more-flags branch from f5c92e5 to 6c162c8 Compare February 19, 2025 18:33
@julienduchesne julienduchesne marked this pull request as ready for review February 19, 2025 18:36
@julienduchesne julienduchesne requested review from tacole02 and a team as code owners February 19, 2025 18:36
@julienduchesne julienduchesne force-pushed the julienduchesne/subquery-spin-off-more-flags branch 3 times, most recently from 2af004d to d99aee2 Compare February 19, 2025 19:03
Expands on #10460

Testing has shown nice results for the feature. In our main Mimir cell, the overall runtime of all instant queries containing the "subquery pattern" (`[<range>:(<step>)?]`) has gone down by half, and the feature allows for queries further into the past by sharding by time

Currently, this is only configurable with `-query-frontend.spin-off-instant-subqueries-to-url` + a required per-tenant `instant_queries_with_subquery_spin_off`` opt-in (list of regexp)

This PR makes the configs more consistent and allows for a global enable. Here are the new configs:
- `-query-frontend.instant-subquery-spin-off.enable-by-default`. False by default, set to true to enable for all tenants. This is overriden with `instant_subquery_spin_off_enabled_regexp` on tenants
- `-query-frontend.instant-subquery-spin-off.min-range-duration`. Defaults to 12h. This is overriden with `instant_subquery_spin_off_min_range_duration` on tenants
- `-query-frontend.instant-subquery-spin-off.url`. The frontend config part of the feature. Required
@julienduchesne julienduchesne force-pushed the julienduchesne/subquery-spin-off-more-flags branch from d99aee2 to 2790bf8 Compare February 19, 2025 19:04
Copy link
Contributor

@dimitarvdimitrov dimitarvdimitrov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the changes look ok, but it's not super clear if you intend to keep these configs long-term or they are for some validaiton testing now. Maybe you can include this in the PR description.

I don't think they are of great use from an operator's perspective and I'm assuming you're adding them so you can progressively roll this out.

},
{
"kind": "field",
"name": "instant_subquery_spin_off_url",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i haven't been involved so far, but IIRC this extra flag is just for the current testing, right? not planned to be kept long-term?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's needed. The feature is about spinning off subqueries as actual range queries. We do that by sending an entirely new request to a given URL. Mimir doesn't really know the URL of the frontend service, as Mimir can run in various set-ups

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh i didn't know we go over the HTTP boundary again. Can't we just send it to localhost? What is this pointed to otherwise?

MaxFutureQueryWindow model.Duration `yaml:"max_future_query_window" json:"max_future_query_window" category:"experimental"`

// Subquery Spin-off
InstantSubquerySpinOffEnabledRegexp []string `yaml:"instant_subquery_spin_off_enabled_regexp" json:"instant_subquery_spin_off_enabled_regexp" doc:"nocli|description=List of regular expression patterns matching instant queries. Subqueries within those instant queries will be spun off as range queries to optimize their performance." category:"experimental"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would you need this as a global setting as well?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The fact that it's a slice doesn't lend itself well to a CLI arg. I think the current model of being able to enable/disable per-tenant + this extra per-tenant override that isn't available as a flag is enough configurability

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@julienduchesne
Copy link
Member Author

I don't think they are of great use from an operator's perspective and I'm assuming you're adding them so you can progressively roll this out.

Yes, these new flags will allow me to roll this out at large. The feature is still being tested. When things are settled, I will create a configure page, like this one for example: https://github.com/grafana/mimir/blob/main/docs/sources/mimir/configure/configure-blocked-queries.md

@julienduchesne julienduchesne requested a review from a team February 19, 2025 20:51
Copy link
Contributor

@tacole02 tacole02 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! One minor question!

@@ -46,7 +46,7 @@
* [ENHANCEMENT] Distributor, querier, ingester and store-gateway: Add support for `limit` parameter for label names and values requests. #10410
* [ENHANCEMENT] Query-frontend: Allow adjustment of queries looking into the future to a maximum duration with experimental `-query-frontend.max-future-query-window` flag. #10547
* [ENHANCEMENT] Ruler: Adds support for filtering results from rule status endpoint by `file[]`, `rule_group[]` and `rule_name[]`. #10589
* [ENHANCEMENT] Query-frontend: Add option to "spin off" subqueries as actual range queries, so that they benefit from query acceleration techniques such as sharding, splitting and caching. To enable this, set the `-query-frontend.spin-off-instant-subqueries-to-url=<url>` option on the frontend and the `instant_queries_with_subquery_spin_off` per-tenant override with regular expressions matching the queries to enable. #10460 #10603 #10621
* [ENHANCEMENT] Query-frontend: Add option to "spin off" subqueries as actual range queries, so that they benefit from query acceleration techniques such as sharding, splitting and caching. To enable this, set the `-query-frontend.instant-subquery-spin-off.url=<url>` option on the frontend and the `instant_subquery_spin_off_enabled_regexp` per-tenant override with regular expressions matching the queries to enable. #10460 #10603 #10621 #10696
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
* [ENHANCEMENT] Query-frontend: Add option to "spin off" subqueries as actual range queries, so that they benefit from query acceleration techniques such as sharding, splitting and caching. To enable this, set the `-query-frontend.instant-subquery-spin-off.url=<url>` option on the frontend and the `instant_subquery_spin_off_enabled_regexp` per-tenant override with regular expressions matching the queries to enable. #10460 #10603 #10621 #10696
* [ENHANCEMENT] Query-frontend: Add option to "spin off" subqueries as actual range queries, so that they benefit from query acceleration techniques such as sharding, splitting, and caching. To enable this feature, set the `-query-frontend.instant-subquery-spin-off.url=<url>` option on the frontend and the `instant_subquery_spin_off_enabled_regexp` per-tenant override with regular expressions matching the queries you want to enable. #10460 #10603 #10621 #10696

@@ -215,7 +215,7 @@ The following features are currently experimental:
- Server-side write timeout for responses to active series requests (`-query-frontend.active-series-write-timeout`)
- Caching of non-transient error responses (`-query-frontend.cache-errors`, `-query-frontend.results-cache-ttl-for-errors`)
- Blocking HTTP requests on a per-tenant basis (configured with the `blocked_requests` limit)
- Spinning off (as actual range queries) subqueries from instant queries (`-query-frontend.spin-off-instant-subqueries-to-url` and the `instant_queries_with_subquery_spin_off` per-tenant limit)
- Spinning off (as actual range queries) subqueries from instant queries (`-query-frontend.instant-subquery-spin-off.url` and the `instant_subquery_spin_off_enabled_regexp` per-tenant limit)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
- Spinning off (as actual range queries) subqueries from instant queries (`-query-frontend.instant-subquery-spin-off.url` and the `instant_subquery_spin_off_enabled_regexp` per-tenant limit)
- Spinning off subqueries from instant queries as actual range queries (`-query-frontend.instant-subquery-spin-off.url` and the `instant_subquery_spin_off_enabled_regexp` per-tenant limit)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried to eliminate the double parenthesis for clarity. Let me know if this works!

@dimitarvdimitrov
Copy link
Contributor

I don't think they are of great use from an operator's perspective and I'm assuming you're adding them so you can progressively roll this out.

Yes, these new flags will allow me to roll this out at large. The feature is still being tested. When things are settled, I will create a configure page, like this one for example: https://github.com/grafana/mimir/blob/main/docs/sources/mimir/configure/configure-blocked-queries.md

I was more concerned about all the slices and min-durations and especially the URL. Would be nice if this feature just works and doesn't need any fiddling with flags

@julienduchesne
Copy link
Member Author

I will test out using localhost in our cells. I've given it more thought and I think what you're saying makes sense. We could get away with only the regexp (one flag/user limit)

@julienduchesne
Copy link
Member Author

Superseded by #10703. Thanks for the review @dimitarvdimitrov. Let's go with the simpler config

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants