extra flag to make catchup=False mean "first run is next scheduled" #45777
Replies: 5 comments
-
Thanks for opening your first issue here! Be sure to follow the issue template! If you are willing to raise a PR to address this issue, please do so; there is no need to wait for approval.
-
Related: https://lists.apache.org/thread/j2n8h308ffq46sx0vcfnl61snh7tyjlo The current behavior is consistent with how data pipelines work. Generally speaking, this behavior is less suitable for other patterns. The backfill pain will be resolved with AIP-78 (Scheduler-managed backfill); the rest will be handled in Airflow 3.1+.
-
As the thread and the stale/closed PRs mention, there are many cases where the workaround with start_date is undesirable or even impossible. I'm not sure why neither of those PRs was accepted; a lot of the opposition seemed to be along the lines of "eh, I don't want to have to look at another argument in the config". Very glad I didn't waste time writing up a PR!
-
Yes, that's a very good reason in fact. Adding more confusion and options is not a desirable "product" property, sometimes even at the expense of not handling all cases. You can have a product with a million configurable parameters that is useless and far too generic. So "I do not want to have yet another knob to turn" is quite a good reason for not accepting it from a product point of view, even if individual cases are not happy. Generally it's impossible to make everyone happy; some people will still be somewhat unhappy.
This looks like an important change in behaviour that also might impact some of the other discussions we have about Airflow 3, namely a lot of discussions about https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-83+amendment+to+support+classic+Airflow+authoring+style and resulting in https://cwiki.apache.org/confluence/display/AIRFLOW/Option+2+clarification+doc+WIP . While not 100% related, this PR and the backfill change mentioned by Elad are very much related to the catchup behaviour and show how you should approach such discussions.
My suggestion, @seth-olsen: if you feel very strongly about this one, start a discussion on the devlist and put forth your arguments. Analyse all the past discussions that @eladkal so helpfully provided, read them in detail, analyse why things were rejected, try to understand others' arguments (even if you do not agree with them, trying to understand what others are saying is a good idea), and come up with a concrete proposal for how you think your case should be addressed, and justify it.
Eventually everything we do is consensus driven (i.e. we want to get to consensus where generally we agree on a direction), but if we cannot reach consensus, the last resort is voting: https://www.apache.org/foundation/voting.html
Note that, similarly to what Daniel did, when presenting your proposal you should consider all the cases and combinations in your proposal, what it means, what consequences it has when introduced, what it means for backwards compatibility, etc. Just thorough thinking followed by discussion, reaching consensus and, if that is not possible, defining the outcome and calling for a vote. Your case is way simpler than what Daniel discussed, but the mechanism is very similar. So I propose you start a
-
For now, I am converting this to a discussion, as it is certainly not an issue or a simple feature where it is clear whether we should pursue it.
-
Description
Currently, catchup=False means "first run after turning on will be the latest scheduled run" and there is no way to produce the behavior where turning on a DAG means "make the first run after turning on occur at the next schedule time instead of the previous one".
Proposal: make a new flag (call it something to the effect of no_catchup_means_no_past_runs) which, when set to True, means that turning on a DAG that has the setting catchup=False will result in the first run being the one on the schedule that occurs next in time (rather than the one that occurs in the most recent past).
Use case/motivation
The use case is whenever you want to turn on a DAG without it running right away.
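The difference between the two behaviors can be sketched in plain Python. This is only a simplified model of a fixed-interval schedule, not Airflow's scheduler code; the function names and the `anchor` parameter are made up for illustration, and `no_catchup_means_no_past_runs` is the hypothetical flag from the proposal, not an existing Airflow argument.

```python
from datetime import datetime, timedelta


def previous_tick(anchor: datetime, interval: timedelta, now: datetime) -> datetime:
    """Most recent schedule time at or before `now` (assumes now >= anchor)."""
    elapsed_intervals = (now - anchor) // interval  # timedelta // timedelta gives an int
    return anchor + elapsed_intervals * interval


def first_scheduled_slot(
    anchor: datetime,
    interval: timedelta,
    turned_on_at: datetime,
    no_catchup_means_no_past_runs: bool = False,  # hypothetical flag from the proposal
) -> datetime:
    """Schedule slot of the first run after a catchup=False DAG is turned on."""
    latest_past = previous_tick(anchor, interval, turned_on_at)
    if no_catchup_means_no_past_runs:
        # Proposed behavior: skip the past slot and wait for the next one.
        return latest_past + interval
    # Current behavior as described above: the latest past slot runs right away.
    return latest_past


anchor = datetime(2024, 1, 1)                 # first slot of a daily schedule
turned_on_at = datetime(2024, 1, 10, 15, 0)   # DAG is turned on mid-afternoon

current = first_scheduled_slot(anchor, timedelta(days=1), turned_on_at)
proposed = first_scheduled_slot(
    anchor, timedelta(days=1), turned_on_at, no_catchup_means_no_past_runs=True
)
print(current)   # 2024-01-10 00:00:00
print(proposed)  # 2024-01-11 00:00:00
```

In this model the current behavior picks the midnight slot that has already passed, so the DAG runs as soon as it is turned on, while the proposed flag would defer the first run to the following midnight.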
Related issues
No response
Are you willing to submit a PR?
Code of Conduct