[Feature] Support "relative" config for `begin` on microbatch models #11270
Comments
Hi! Thanks so much for opening this feature request. If you want to do a relative date for |
@graciegoheen not fully related, but is there a way to declare a relative begin AND have it not mark the model as modified? I want to be able to run microbatch models with variable configs based on the target; however, that means any time I compare dev to prod, the microbatch model is marked as modified. Example
Would also be great to be able to set larger batch sizes on full refresh |
Thanks for the pointer, @graciegoheen ! I didn't realize modules were available in |
Great question @gtynan - we did some improvements to
Note: You'll need to set the behavior flag to `True`:
flags:
  state_modified_compare_more_unrendered_values: True
Let me know if that works for you! |
I'm going to close out this issue as "not planned" - I'll also reach out to our docs team to see if it makes sense to add a callout to our docs page for the begin config about how to use the |
this pr adds a callout to the `begin` doc to inform users that they can use modules to set relative dates for `begin` config raised [internally](https://dbt-labs.slack.com/archives/C07NBMC7XPT/p1739584186963079) and in core issue dbt-labs/dbt-core#11270 (comment)
Thanks again @bdewilde - we updated our docs page for the begin config so that people have an example of how to do this :) |
That's perfect, thanks again @graciegoheen ! |
Is this your first time submitting a feature request?
Describe the feature
Currently, the `begin` config on microbatch incremental models is a fixed timestamp value that indicates the earliest point in time from which the data is needed or relevant. It's currently required, though there's declared interest in making it optional in the future. I'd like to propose a third case: specifying `begin` as a relative time (e.g. `"INTERVAL '1 year'"`) whose value is computed dynamically when the model is run.

This is useful because models are sometimes only relevant over a rolling window in time, specified by some sort of lookback (not to be confused with the batching config of the same name) relative to some reference time (typically "now"). In cases of a full refresh, it would be convenient to have the model start from the desired timestamp, rather than having to manually change the config every time.
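To make the rolling-window idea concrete, here is a minimal Python sketch of how a relative `begin` could be resolved at run time. The function name `relative_begin`, its parameters, and the choice to truncate to midnight UTC are all hypothetical and not part of dbt; they only illustrate the computation being requested.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def relative_begin(lookback_days: int, now: Optional[datetime] = None) -> str:
    """Hypothetical helper: resolve a relative `begin` to a fixed timestamp.

    Returns an ISO-8601 string for midnight UTC, `lookback_days` before
    the reference time (defaults to the current time in UTC).
    """
    reference = now if now is not None else datetime.now(timezone.utc)
    start = reference - timedelta(days=lookback_days)
    # Truncate to the start of the day so the value lines up with daily batches.
    start = start.replace(hour=0, minute=0, second=0, microsecond=0)
    return start.isoformat()


if __name__ == "__main__":
    # A one-year rolling window relative to "now".
    print(relative_begin(365))
```

Since the comments in this thread note that `modules` is available when setting `begin`, the same `datetime` arithmetic could presumably be written inline in a model's config, along the lines of `begin=(modules.datetime.datetime.now() - modules.datetime.timedelta(days=365)).isoformat()`. That expression is an untested assumption here, so check it against the `begin` docs page before relying on it.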
Describe alternatives you've considered
The simplest alternative is to manually update the `begin` config before doing a full refresh for a microbatch incremental model. In my microbatch models, I've also added a condition in the query's `WHERE` clause that filters records by their event time column if they're less than a dynamically computed lookback timestamp, which is always more recent than the model's configured "begin". That works in the sense that the resulting data is what I want; however, iterating over lots of batches with zero rows is inefficient and seems a bit pointless. Finally, one could just use a different (not microbatch) incremental strategy, though this negates all the benefits of the new strategy.
Who will this benefit?
Folks that have large time-based models that only need to be populated over a rolling window in time (I have many!), who'd like microbatch to make running these models even easier.
Are you interested in contributing this feature?
No response
Anything else?
No response