ruler: Make alertmanager client and all related config per-tenant configurable #10816

Open
wants to merge 15 commits into
base: main
from

@alexweav alexweav commented Mar 5, 2025

What this PR does

This PR adds a new per-tenant configuration block that lets you change the alertmanager URL and all related client options for individual tenants, so that different ruler tenants can send alerts to different alertmanagers.

There are a few deployment topologies where this might be desirable:

  • Moving a tenant from one Mimir cell to another. If the new cell is pointed at a different alertmanager (or has its own alertmanager deployed internally), you no longer have to migrate the tenant's alertmanager configuration in lock-step.
  • If you have multiple alertmanager environments with different auth parameters or URLs, you don't need to run separate Mimir cells to map each tenant to each alertmanager.

The new config is a sub-block of the tenant overrides. An example might look like:

user1:
  ruler_alertmanager_client_config:
    alertmanager_url: http://custom-url-for-this-tenant:8080

If a tenant does not supply any tenant-specific config, it falls back to the global ruler-level configurations given by the existing parameters.
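
To illustrate the fallback, here is a sketch of an overrides file with two tenants. It assumes the usual top-level `overrides:` key of Mimir's runtime configuration. The second tenant name, `user2`, and the value `8` are placeholders, and only the `alertmanager_url` option from the example above is shown.

```yaml
overrides:
  # user1 sets the new block, so its alerts go to the tenant-specific URL.
  user1:
    ruler_alertmanager_client_config:
      alertmanager_url: http://custom-url-for-this-tenant:8080
  # user2 (placeholder tenant) overrides an unrelated limit and omits the
  # block, so its alerts go to the global ruler-level alertmanager.
  user2:
    ruler_max_independent_rule_evaluation_concurrency_per_tenant: 8
```

Under the behavior described above, only `user1` would be routed away from the ruler-wide default.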

Note: This PR is part 1 of the change that proposes and applies the configuration. It does not yet implement hot-reload for this config block; you need to restart the ruler for changes to this config to take effect. I plan to support hot-reload, but I decided to post it in a separate PR to keep the diff reviewable.

Which issue(s) this PR fixes or relates to

n/a

Checklist

  • Tests updated.
  • Documentation added.
  • CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX].
  • about-versioning.md updated with experimental features.

@alexweav alexweav force-pushed the alexweav/tenant-notifier-config branch from 398822f to 4a278d0 March 6, 2025 16:32

github-actions bot commented Mar 6, 2025

@alexweav alexweav marked this pull request as ready for review March 6, 2025 21:42
@alexweav alexweav requested review from a team and tacole02 as code owners March 6, 2025 21:42

@tacole02 tacole02 left a comment

Thank you!

@@ -64,6 +64,7 @@
* [ENHANCEMENT] All: Add `cortex_client_request_invalid_cluster_validation_labels_total` metrics, that is used by Mimir's gRPC clients to track invalid cluster validations. #10767
* [ENHANCEMENT] Ingester client: Add support to configure cluster validation for ingester clients. Failed cluster validations are tracked by `cortex_client_request_invalid_cluster_validation_labels_total` with label `client=ingester`. #10767
* [ENHANCEMENT] Add experimental metric `cortex_distributor_dropped_native_histograms_total` to measure native histograms silently dropped when native histograms are disabled for a tenant. #10760
* [ENHANCEMENT] Add tenant configuration block `ruler_alertmanager_client_config` which allows the Ruler's Alertmanager client options to be specified on a per-tenant basis. #10816

Suggested change
- * [ENHANCEMENT] Add tenant configuration block `ruler_alertmanager_client_config` which allows the Ruler's Alertmanager client options to be specified on a per-tenant basis. #10816
+ * [ENHANCEMENT] Add tenant configuration block `ruler_alertmanager_client_config`, which allows you to specify the ruler's Alertmanager client options on a per-tenant basis. #10816

@@ -3881,6 +3881,84 @@ The `limits` block configures default and per-tenant limits imposed by component
# CLI flag: -ruler.max-independent-rule-evaluation-concurrency-per-tenant
[ruler_max_independent_rule_evaluation_concurrency_per_tenant: <int> | default = 4]

# Per-tenant alertmanager client configuration. If not supplied, the tenant's
# notifications will be sent to the ruler-wide default.

Suggested change
- # notifications will be sent to the ruler-wide default.
+ # notifications are sent to the ruler-wide default.

@@ -3881,6 +3881,84 @@ The `limits` block configures default and per-tenant limits imposed by component
# CLI flag: -ruler.max-independent-rule-evaluation-concurrency-per-tenant
[ruler_max_independent_rule_evaluation_concurrency_per_tenant: <int> | default = 4]

# Per-tenant alertmanager client configuration. If not supplied, the tenant's

Suggested change
- # Per-tenant alertmanager client configuration. If not supplied, the tenant's
+ # Per-tenant Alertmanager client configuration. If not supplied, the tenant's
