Attack Surface: Unauthorized Log Ingestion
- Description: Attackers can send arbitrary log data to the Loki push API without proper authentication or authorization.
- How Loki Contributes: Loki's push API (
/loki/api/v1/push
) is the primary entry point for log data. If not secured, it's an open door. - Example: An attacker discovers the Loki endpoint and uses
curl
or a script to send fake log entries, claiming a successful database backup when none occurred. - Impact: Data poisoning, denial of service (DoS), masking of real attacks, potential triggering of false alerts.
- Risk Severity: Critical
- Mitigation Strategies:
- Strong Authentication: Implement mandatory authentication for all clients pushing logs. Use API keys, JWTs (with tenant/stream claims), or mutual TLS. Do not rely solely on network-level restrictions.
- Fine-Grained Authorization: Define authorization rules that restrict which clients can push to specific streams or tenants. Use Loki's built-in features or integrate with an external authorization service (e.g., OPA).
- Regular Audits: Periodically review authentication and authorization configurations.
Attack Surface: Log Data Injection/Manipulation (Focus on DoS)
- Description: A compromised or malicious client sends crafted log entries, specifically focusing on oversized payloads to cause resource exhaustion. (Narrowed scope for direct Loki impact).
- How Loki Contributes: Loki's ingester must process and store all received data. Extremely large log entries can overwhelm it.
- Example: An attacker sends log entries with megabytes of data in each entry, causing the Loki ingester to run out of memory or disk space.
- Impact: Denial of service (DoS).
- Risk Severity: High
- Mitigation Strategies:
- Rate Limiting: Implement strict rate limits on the push API, both globally and per client/tenant.
- Payload Size Limits: Enforce maximum payload size limits on the push API. This is a direct configuration option within Loki.
- Resource Limits: Configure appropriate CPU, memory, and disk space limits on the Loki ingester components.
Attack Surface: Denial of Service (DoS) on Ingestion
- Description: Attackers flood the Loki push API with requests, overwhelming the ingester and preventing legitimate log processing.
- How Loki Contributes: The ingester component is responsible for receiving and processing log data. It's a potential bottleneck.
- Example: An attacker uses a botnet to send a massive number of log entries to the Loki push API, causing the ingester to crash or become unresponsive.
- Impact: Loss of log data, disruption of monitoring and alerting, potential impact on applications relying on log data.
- Risk Severity: High
- Mitigation Strategies:
- Rate Limiting: (As above) - Crucial for preventing DoS.
- Resource Limits: Configure appropriate CPU, memory, and connection limits on the Loki ingester components.
- Horizontal Scaling: Deploy multiple Loki ingester instances behind a load balancer to distribute the load.
- Backpressure: Implement backpressure mechanisms.
- Queueing: Use a message queue (e.g., Kafka) in front of the ingester (though this is indirectly related to Loki).
Attack Surface: Unauthorized Data Access (Query API)
- Description: Attackers gain access to log data they shouldn't see via the Loki query API.
- How Loki Contributes: Loki's query API (
/loki/api/v1/query
,/loki/api/v1/query_range
, etc.) provides read access to stored log data. - Example: An attacker discovers the Loki query endpoint and, without authentication, retrieves sensitive log data from other tenants in a multi-tenant environment.
- Impact: Data breaches, privacy violations, potential exposure of sensitive information.
- Risk Severity: Critical
- Mitigation Strategies:
- Strong Authentication: Enforce mandatory authentication for all query requests.
- Fine-Grained Authorization: Implement authorization rules. Leverage Loki's multi-tenancy features and consider external authorization services.
- Audit Logging: Enable audit logging for all query API requests.
Attack Surface: Denial of Service (DoS) via Expensive Queries
- Description: Attackers craft complex or resource-intensive LogQL queries to overwhelm the querier.
- How Loki Contributes: The querier component is responsible for executing LogQL queries. Complex queries can consume significant resources.
- Example: An attacker submits a query with a very large time range, a complex regular expression, and high-cardinality label lookups, causing the querier to consume all available CPU and memory.
- Impact: Denial of service, slow query performance, potential impact on other users.
- Risk Severity: High
- Mitigation Strategies:
- Query Timeouts: Implement strict timeouts for all LogQL queries.
- Time Range Limits: Limit the maximum time range that can be queried in a single request.
- Resource Quotas: Implement resource quotas (CPU, memory) for queries, potentially on a per-user or per-tenant basis.
- Monitoring: Monitor query performance and identify slow or resource-intensive queries.
Attack Surface: Configuration Exposure (Directly Affecting Loki)
- Description: Sensitive information in the Loki configuration file (e.g., storage credentials, API keys used by Loki) is exposed.
- How Loki Contributes: The configuration file controls Loki's behavior and contains sensitive information directly used for Loki's operation.
- Example: An attacker gains access to the server and finds the
loki-config.yaml
file, containing unencrypted authentication details for the Loki API itself. - Impact: Compromise of the Loki service, unauthorized access to log data.
- Risk Severity: Critical
- Mitigation Strategies:
- Secure Storage: Store the configuration file securely with restricted access.
- Secrets Management: Avoid hardcoding sensitive credentials. Use environment variables or a secrets management system.
- Regular Review: Periodically review the configuration file.
Attack Surface: Tenant Isolation Failure (Multi-tenancy)
- Description: In a multi-tenant Loki deployment, one tenant gains access to the log data of another tenant.
- How Loki Contributes: Loki provides multi-tenancy features to isolate log data. Misconfiguration can break this isolation.
- Example: A misconfigured
auth_enabled
setting or incorrect tenant ID header allows a user from one tenant to query logs from another tenant. - Impact: Data breaches, privacy violations, loss of trust.
- Risk Severity: Critical
- Mitigation Strategies:
- Thorough Testing: Rigorously test the tenant isolation mechanisms.
- Regular Audits: Periodically audit the multi-tenancy configuration.
- Strong Authentication/Authorization: Enforce strong authentication and authorization.
- Monitoring: Monitor for any unusual cross-tenant activity.