I've created an issue for this here: #15401. We are currently focused on improving our overall settings experience, so this fits nicely as something we want to consider!
Hi,
I have a flow that orchestrates several subflows. My subflows have tasks that each return large dataset objects (AnnData objects), which can be several GB in size.
I want to persist results from these subflows, so I enabled `PREFECT_RESULTS_PERSIST_BY_DEFAULT=true`. However, I was caught off guard by what enabling persistence actually does: I discovered that, when enabled via the environment variable, my flows started running out of memory when the tasks tried to persist these large objects to S3. I had expected this option to affect only flows. I can now see it being useful for tasks as well, but I do not want it automatically enabled for every task across all my subflows, because those tasks need to return these large objects.
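For context, the subflows look roughly like this (a simplified sketch; the task, flow, and path names are placeholders):

```python
from prefect import flow, task
import anndata as ad


@task
def load_dataset(path: str) -> ad.AnnData:
    # Returns a multi-GB AnnData object. With
    # PREFECT_RESULTS_PERSIST_BY_DEFAULT=true set, Prefect also tries to
    # serialize this return value and upload it to the S3 result storage,
    # which is where the memory usage blows up.
    return ad.read_h5ad(path)


@flow
def preprocess_subflow(path: str) -> ad.AnnData:
    return load_dataset(path)
```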
I still want to persist the state of my subflows so that I can retry and pick up the parent flow from the failed subflow instead of starting from the beginning. Rather than going through every task in my flows and explicitly disabling this option (see the sketch below), it would be great if there were additional settings to control or override the behavior for tasks versus flows. Would it be possible to add something like separate flow-level and task-level persistence settings, to further control the behavior via the environment or config?
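For reference, the per-task opt-out I would rather avoid repeating across all subflows looks like this (illustrative task name only):

```python
from prefect import task


@task(persist_result=False)
def load_dataset(path: str):
    # Explicitly opting this task out of result persistence works, but it
    # has to be repeated on every task that returns a large object, in
    # every subflow.
    ...
```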
Or perhaps I am overlooking something: is it possible to have only the parent flow persist the results of the subflow deployments, without enabling persistence inside the subflow deployments themselves?
My parent flow code resembles the following (simplified here, with placeholder deployment names):
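```python
from prefect import flow
from prefect.deployments import run_deployment


@flow
def parent_flow(dataset_path: str):
    # Each subflow runs as its own deployment. What I want persisted is the
    # state/results of these subflow runs, so a retry of the parent can skip
    # the subflows that already finished and resume from the failed one.
    preprocess_run = run_deployment(
        name="preprocess/preprocess-prod",
        parameters={"path": dataset_path},
    )
    run_deployment(
        name="analyze/analyze-prod",
        parameters={"upstream_run_id": str(preprocess_run.id)},
    )
```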