Skip to content

Latest commit

 

History

History
161 lines (142 loc) · 14.6 KB

File metadata and controls

161 lines (142 loc) · 14.6 KB

Threat Model Analysis for apache/airflow

  • Threat: DAG Code Injection via Airflow Features (e.g., Connections, Variables)

    • Description: An attacker leverages Airflow's own features, such as Connections or Variables, to inject malicious code. For example, they might store a malicious script in a Variable intended to be used as a file path, or inject code into a Connection string that is later used by an operator to execute commands. This differs from the previous Git-based injection because it exploits intended Airflow functionality.
    • Impact:
      • Complete compromise of the Airflow environment and potentially connected systems.
      • Data breaches, data loss, data corruption.
      • System downtime and disruption of business processes.
      • Reputational damage.
    • Affected Airflow Component: DAG files (Python scripts), DagBag class, Operators that use Connections or Variables (especially custom operators), the process that parses and loads DAGs, and the rendering of templates within DAGs.
    • Risk Severity: Critical
    • Mitigation Strategies:
      • Input Validation:
        • Strictly validate all user-provided input to Connections and Variables. Do not assume that data stored in these features is safe.
        • Use allow-lists rather than deny-lists for validation. Define exactly what is permitted, and reject anything else.
        • Consider the context in which the data will be used. For example, if a Variable is supposed to be a file path, validate that it is a valid and safe file path.
      • Operator Security:
        • When developing custom operators, be extremely careful about how you use Connections and Variables. Avoid directly executing code from these sources.
        • Use parameterized queries or prepared statements when interacting with databases or other systems.
        • Sanitize any data retrieved from Connections or Variables before using it in commands or scripts.
      • Template Security:
        • Be aware of the potential for template injection vulnerabilities when using Jinja2 templating in DAGs.
        • Sanitize user-provided data before rendering it in templates.
        • Consider using a sandboxed environment for rendering templates.
      • Least Privilege:
        • Ensure that Airflow workers run with the minimum necessary privileges. Avoid running them as root.
      • Secrets Backends: Prefer using external secrets backends (Vault, AWS Secrets Manager, etc.) over Airflow Variables for highly sensitive data.
  • Threat: Secrets Exposure via Environment Variables (within Airflow's control)

    • Description: While the general threat of environment variable exposure exists, this focuses on Airflow's role. If Airflow operators (especially custom operators) are designed to read secrets from environment variables without proper safeguards, an attacker who compromises a worker (even with limited privileges) can access those secrets. This is distinct from a general system compromise; it's about how Airflow uses the environment.
    • Impact:
      • Unauthorized access to sensitive data and systems.
      • Data breaches.
      • Potential for lateral movement within the network.
    • Affected Airflow Component: Worker nodes; specifically, the environment in which tasks are executed. Custom operators that interact with environment variables are the primary concern. The subprocess module (if used insecurely) is a high-risk area.
    • Risk Severity: High
    • Mitigation Strategies:
      • Secrets Management (Airflow-Specific):
        • Strongly prefer using Airflow's built-in secrets management features (Connections, Variables, and especially Secrets Backends) over environment variables for storing secrets.
        • Educate developers on the risks of using environment variables for secrets within Airflow.
      • Operator Security:
        • Never design custom operators to directly read secrets from environment variables without a very strong justification and robust security measures.
        • If environment variables must be used (e.g., for legacy compatibility), ensure they are:
          • Accessed only through secure, well-vetted helper functions.
          • Never logged or exposed in task output.
          • Used in a way that minimizes the risk of exposure (e.g., short-lived processes).
      • Airflow Configuration: Review airflow.cfg and ensure that no sensitive information is inadvertently exposed in environment variables set by Airflow itself.
  • Threat: Denial of Service via DAG Bomb (Exploiting Airflow's Scheduling)

    • Description: An attacker submits a DAG specifically designed to overwhelm Airflow's scheduler or workers. This isn't just about general resource exhaustion; it's about exploiting Airflow's scheduling logic. Examples include:
      • DAGs with extremely high concurrency settings (max_active_runs, max_active_tasks).
      • DAGs that create a massive number of tasks dynamically.
      • DAGs that trigger external systems in a way that causes a cascading failure.
      • DAGs that abuse ExternalTaskSensor or similar operators to create dependency loops or wait indefinitely.
    • Impact:
      • Airflow scheduler or worker nodes become unresponsive or crash.
      • Legitimate DAGs cannot be scheduled or executed.
      • System downtime and disruption of business processes.
    • Affected Airflow Component: Scheduler, Workers, Metadata Database. The scheduling loop, task execution engine, database connection pool, and inter-process communication mechanisms are all potential targets.
    • Risk Severity: High
    • Mitigation Strategies:
      • DAG-Level Controls (Enforced):
        • Enforce strict limits on max_active_runs and max_active_tasks at the Airflow configuration level, preventing DAG authors from overriding them to dangerously high values.
        • Implement a review process for any DAG that attempts to use high concurrency settings.
      • Resource Limits (Airflow-Specific):
        • Configure appropriate resource limits (CPU, memory) for Airflow processes (scheduler, workers) using settings like worker_concurrency, parallelism, and resource requests/limits in Kubernetes. This is about limiting Airflow's own resource consumption, not just the tasks it runs.
      • Task Timeouts:
        • Enforce reasonable timeouts for all tasks, preventing them from running indefinitely and consuming resources.
      • Dynamic Task Generation Limits:
        • Implement safeguards to prevent DAGs from dynamically generating an unbounded number of tasks. This might involve custom code or configuration to limit the scale of dynamic task creation.
      • Operator Review:
        • Carefully review custom operators for potential resource exhaustion vulnerabilities.
      • Monitoring and Alerting (Airflow-Specific Metrics):
        • Monitor Airflow-specific metrics, such as the number of queued tasks, running tasks, scheduler heartbeat, and database connection pool usage.
        • Set up alerts for anomalies that could indicate a DAG bomb attack.
  • Threat: Unauthorized DAG Triggering via Airflow API (Exploiting Airflow's Authentication/Authorization)

    • Description: An attacker exploits weaknesses in Airflow's own API authentication or authorization mechanisms to trigger DAGs they should not have access to. This is distinct from a general API security issue; it's about how Airflow implements and enforces access control.
    • Impact:
      • Execution of unauthorized workflows.
      • Potential for data breaches, data corruption, or system disruption, depending on the actions performed by the triggered DAGs.
    • Affected Airflow Component: Airflow REST API, specifically the endpoints related to DAG runs and task instances. The authentication and authorization logic within Airflow (e.g., the auth module, RBAC configuration) is the critical area.
    • Risk Severity: High
    • Mitigation Strategies:
      • Strong Authentication (Airflow-Specific):
        • Use a robust authentication backend for the Airflow API (e.g., LDAP, OAuth, or a well-configured custom backend). Avoid weak or default authentication methods.
        • Enforce strong password policies.
        • Consider multi-factor authentication (MFA) for API access, especially for privileged users.
      • RBAC (Airflow-Specific):
        • Implement fine-grained Role-Based Access Control (RBAC) within Airflow to restrict API access based on user roles and permissions.
        • Regularly review and audit RBAC configurations to ensure they adhere to the principle of least privilege.
        • Test RBAC thoroughly to ensure it is enforced correctly.
      • API Key Management (Airflow-Specific):
        • If using API keys, manage them securely:
          • Generate unique keys for each user or application.
          • Store keys securely (e.g., using a secrets management system).
          • Regularly rotate keys.
          • Implement mechanisms to revoke compromised keys.
      • Auditing (Airflow-Specific):
        • Enable detailed audit logging for all API requests, including information about the user, the endpoint accessed, and the result. Ensure this logging is integrated with Airflow's logging system.
  • Threat: Unpatched Airflow Vulnerability Exploitation (Directly Affecting Airflow)

    • Description: An attacker exploits a known vulnerability within Airflow itself (or a core dependency that Airflow directly uses and exposes) to gain unauthorized access or control. This is about vulnerabilities in Airflow's codebase, not the surrounding infrastructure.
    • Impact:
      • Varies depending on the specific vulnerability, but could range from information disclosure to complete system compromise.
    • Affected Airflow Component: Potentially any component of Airflow, depending on the vulnerability. This could include the webserver, scheduler, worker, database interactions, specific operators, or the API.
    • Risk Severity: Critical (if a known, exploitable vulnerability exists and is unpatched)
    • Mitigation Strategies:
      • Patch Management (Prioritized for Airflow):
        • Prioritize patching Airflow and its core dependencies. This is more critical than patching general system libraries (though those should also be patched).
        • Establish a rapid response process for critical Airflow security updates.
      • Vulnerability Scanning (Focused on Airflow):
        • Use vulnerability scanners that specifically target Airflow and its known vulnerabilities.
      • Security Monitoring (Airflow-Specific):
        • Actively monitor Airflow security advisories and CVE databases for vulnerabilities that directly affect Airflow.
        • Subscribe to Airflow security mailing lists and community forums.
  • Threat: Metadata Database Compromise (Through Airflow Misconfiguration)

    • Description: An attacker gains access to the Airflow metadata database because of a misconfiguration within Airflow itself. This is distinct from a general database attack. Examples include:
      • Airflow configured with a weak or default database password.
      • Airflow configured to connect to a database without encryption.
      • Airflow's database connection string exposed through a vulnerability in a custom operator or logging.
      • Airflow configured with overly permissive database user privileges.
    • Impact:
      • Complete control over the Airflow environment.
      • Ability to modify DAGs, trigger tasks, and access sensitive data.
      • Data breaches, data loss, data corruption.
      • System downtime.
    • Affected Airflow Component: Metadata Database (e.g., PostgreSQL, MySQL). The database connection configuration in airflow.cfg and the ORM used by Airflow (SQLAlchemy) are relevant. The security of custom operators that interact with the database is also critical.
    • Risk Severity: Critical
    • Mitigation Strategies:
      • Secure Database Configuration (within airflow.cfg):
        • Use a strong, unique, and randomly generated password for the Airflow database user. Never use a default password.
        • Configure Airflow to connect to the database using encryption (TLS/SSL).
        • Ensure the database connection string is not exposed in logs or error messages.
      • Least Privilege (Database User):
        • Grant the Airflow database user only the minimum necessary privileges within the database. Avoid granting overly permissive roles (e.g., superuser).
      • Operator Security (Database Interactions):
        • If custom operators interact with the metadata database (or any database), ensure they use parameterized queries or prepared statements to prevent SQL injection.
        • Avoid exposing database connection details within custom operators.
      • Regular Audits (Airflow Configuration):
        • Regularly audit the airflow.cfg file and any environment variables related to database connectivity to ensure they are secure.