Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Bug] state:modified detection incorrect when config() block generated by macro with dynamic sql_header #11150

Open
2 tasks done
pbotros opened this issue Dec 13, 2024 · 2 comments
Labels
bug Something isn't working sql_header Issues related to `sql_header` config state Stateful selection (state:modified, defer)

Comments

@pbotros
Copy link

pbotros commented Dec 13, 2024

Is this a new bug in dbt-core?

  • I believe this is a new bug in dbt-core
  • I have searched the existing issues, and I could not find an existing issue for this bug

Current Behavior

IF:

  • a model's content, including its config() block, is generated via a macro
  • and that macro produces a model with set_sql_header generating a dynamic header based on config or the model
  • and dbt is invoked with a different target with the --state flag,

then state:modified detection always returns that the model has changed. See "Steps to Reproduce" for a minimal reproducible example.

Expected Behavior

state:modified should not detect a change when no model sources or macros involved have changed.

Steps To Reproduce

$> cat models/foo.sql
{{ generate_foo() }}
$> cat macros/generate_foo.sql
{%- macro generate_foo() -%}

{{ config(
    materialized='table'
) }}

{% call set_sql_header(config) %}
declare some_var timestamp default (
    select MAX(some_col) from {{ this }}
);
{% endcall %}

select 1 as some_col

{%- endmacro -%}
$> rm -rf target{,_old} && dbt --target dev compile && mv target{,_old} && dbt --target prod ls --select state:modified --state target_old
17:53:01  Running with dbt=1.9.0
17:53:02  Registered adapter: bigquery=1.9.0
...
<project_name>.foo

This occurs even with state_modified_compare_more_unrendered_values set as true:

flags:
  # See https://docs.getdbt.com/reference/node-selection/state-comparison-caveats#false-positives
  state_modified_compare_more_unrendered_values: True 

The bug does not occur if set_sql_header in the macro has a constant body (e.g. select TIMESTAMP("1970-01-01")). The bug also does not occur if the model is not generated entirely via macro.

The bug can be avoided if a dummy config() statement is put in the model source:

$> cat models/foo_shim.sql
{{ config(materialized='whatever') }}
{{ generate_foo() }}

and foo_shim does not show as modified when running the same above statement.

Relevant log output

See above.

Environment

- OS: Mac 14.7.1
- Python: 3.12.1
- dbt: 1.9.0

Which database adapter are you using with dbt?

bigquery

Additional Context

No response

@pbotros pbotros added bug Something isn't working triage labels Dec 13, 2024
@dbeatty10 dbeatty10 added state Stateful selection (state:modified, defer) sql_header Issues related to `sql_header` config labels Dec 13, 2024
@dbeatty10 dbeatty10 changed the title [Bug] state:modified detection incorrect when config() block generated by macro with dynamic sql_header [Bug] state:modified detection incorrect when config() block generated by macro with dynamic sql_header Dec 13, 2024
@dbeatty10
Copy link
Contributor

Thanks such a nice write-up @pbotros 🏆

I see what you are saying about state:modified in this case.

I saw the same issue as you when I tried this with dbt-duckdb.

The workarounds also worked for me with one lone exception: the config shim trick didn't work for me. But maybe it does specifically for dbt-bigquery -- I didn't try it out since it didn't affect the key thing you are raising here.

@dbeatty10 dbeatty10 removed the triage label Dec 14, 2024
@christopherbertrand
Copy link

fwiw, we can repro the above issue with dbt-redshift

We're also running into this when we use the pre_hook and post_hook configs in a snapshot where the schema is set by the generate_schema_name_for_env macro:

$> cat snapshots/foo_snap.sql
{% snapshot foo_snap %}

{{ config(
    pre_hook="set enable_case_sensitive_identifier to true;",
    post_hook="reset enable_case_sensitive_identifier;",
) }}

{{
  config(
    target_schema = generate_schema_name_for_env("snapshots"),
    strategy="check",
    unique_key="id",
    check_cols=["data"],
  )
}}

select 1 as id, 1 as data

{% endsnapshot %}
$> rm -rf target{,_old} && dbt --target dev compile && mv target{,_old} && dbt --target prod ls --select state:modified --state target_old
21:15:27  Running with dbt=1.8.9
21:15:27  Registered adapter: redshift=1.8.1
...
<project_name>.foo_snap.foo_snap

The config shim trick didn't work for us either

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working sql_header Issues related to `sql_header` config state Stateful selection (state:modified, defer)
Projects
None yet
Development

No branches or pull requests

3 participants