- Description: An attacker sends forged metric data to Cortex, impersonating a legitimate source (e.g., a monitored application). The attacker crafts HTTP requests with manipulated
X-Scope-OrgID
headers (if multi-tenancy is enabled) or other identifying information.- Impact:
- Incorrect alerts are triggered or suppressed.
- Dashboards display misleading information.
- Autoscaling decisions are based on false data.
- Capacity planning is inaccurate.
- Affected Component:
Distributor
,Ingester
,remote_write
API endpoint. - Risk Severity: High
- Mitigation Strategies:
- Authentication: Require authentication for all metric submissions (e.g., API keys, mTLS).
- Authorization: Enforce authorization policies (only authorized sources can submit metrics for specific tenants/namespaces).
- Input Validation: Validate the format and content of incoming metric data.
- Rate Limiting: Implement rate limiting per source.
- Network Segmentation: Isolate the ingestion path.
- Anomaly Detection: Monitor for unusual patterns in metric submissions.
- Impact:
- Description: A vulnerability or misconfiguration in Cortex's multi-tenancy implementation allows one tenant to access data belonging to another. This could be due to bugs in query processing, incorrect
X-Scope-OrgID
handling, or shared resources not properly partitioned.- Impact:
- Confidentiality breach: Sensitive data exposed.
- Compliance violations (GDPR, HIPAA).
- Reputational damage.
- Affected Component:
Querier
,Query Frontend
,Store Gateway
, any component handling multi-tenant data; functions related to query parsing, execution, and result filtering. - Risk Severity: Critical
- Mitigation Strategies:
- Strict Tenant Isolation: Enforce isolation at all layers (query processing, storage, caching).
- Thorough Testing: Extensive testing, including fuzzing and penetration testing.
- Code Reviews: Review code related to multi-tenancy.
- Least Privilege: Components have only necessary permissions.
- Regular Audits: Audit multi-tenancy configuration and implementation.
- Formal Verification (where feasible): Consider formal verification.
- Impact:
- Description: An attacker floods Cortex with metric data, exceeding the capacity of ingesters or distributors. This can be done via many requests, large samples, or exploiting a vulnerability.
- Impact:
- Metric data loss.
- Service degradation.
- Outage.
- Affected Component:
Distributor
,Ingester
, network infrastructure. - Risk Severity: High
- Mitigation Strategies:
- Rate Limiting: Strict rate limiting per tenant and source.
- Resource Quotas: Enforce quotas (CPU, memory, storage) per tenant.
- Horizontal Scaling: Multiple instances of distributors and ingesters.
- Load Shedding: Gracefully shed load when overloaded.
- Input Validation: Limit size and complexity of metric data.
- Monitoring: Monitor resource usage; alerts for high utilization.
- Circuit Breakers: Prevent cascading failures.
- Impact:
- Description: An attacker crafts complex queries that consume excessive resources on queriers or the query-frontend. This could involve large time ranges, many series, complex regex, or expensive computations.
- Impact:
- Slow query performance.
- Querier/query-frontend crashes.
- Denial of service for other users.
- Affected Component:
Querier
,Query Frontend
,Store Gateway
. - Risk Severity: High
- Mitigation Strategies:
- Query Limits: Limits on time range, series count, complexity.
- Query Timeout: Timeouts for queries.
- Resource Quotas: Limit resources a single query can consume.
- Query Analysis: Identify and reject malicious queries.
- Caching: Cache frequent query results.
- Horizontal Scaling: Multiple instances of queriers/query-frontends.
- Monitoring: Monitor query performance and resource usage.
- Impact:
- Description: An attacker gains unauthorized access to the Cortex configuration store (etcd, Consul) and modifies the configuration, potentially changing alerting rules, disabling security, modifying storage, or lowering rate limits.
- Impact:
- Alerts suppressed/misdirected.
- Security controls bypassed.
- Data loss/corruption.
- Denial of service.
- Affected Component: All Cortex components; configuration loading and validation mechanisms.
- Risk Severity: Critical
- Mitigation Strategies:
- Secure Configuration Store: Protect with strong authentication/authorization.
- Access Control: Restrict access to authorized users/services.
- Auditing: Log all configuration changes.
- Configuration Validation: Validate before applying.
- Regular Backups: Back up the configuration.
- Integrity Checks: Detect unauthorized modifications.
- Principle of Least Privilege: Grant only necessary permissions.
- Impact:
Threat: Rule Modification (Tampering)
- Description: An attacker gains access to modify alert rules stored in the Ruler, either through direct storage access or by exploiting a vulnerability in the Ruler's API.
- Impact:
- Suppression of legitimate alerts.
- Generation of false alerts.
- Redirection of alerts.
- Affected Component:
Ruler
, rule storage and management functions. - Risk Severity: High
- Mitigation Strategies:
- Authentication and Authorization: Secure the Ruler's API.
- Access Control: Restrict access to Ruler's storage.
- Auditing: Log all rule changes.
- Input Validation: Validate rule syntax and content.
- Rule Versioning: Allow rollback to previous versions.
- Integrity Checks: Verify integrity of stored rules.
- Impact: