Rewrite execution of microbatch models to avoid blocking the main thread #11332

QMalcolm · 2025-02-24T20:29:17Z

Resolves #11243
Resolves #11306

Problem

There are two problems:

Executing microbatch model batches concurrently would block the main thread from scheduling other nodes as described in [Bug] Microbatch models shouldn't block the main thread in multi-threaded dbt runs. #11243
In certain scenarios, a microbatch model would hang indefinitely as described in [Bug] Microbatch can cause dbt runs to hang #11306

Solution

Checklist

I have read the contributing guide and understand what's expected of me.
I have run this code in development, and it appears to resolve the stated issue.
This PR includes tests, or tests are not required or relevant for this PR.
This PR has no interface changes (e.g., macros, CLI, logs, JSON artifacts, config files, adapter interface, etc.) or this PR has already received feedback and approval from Product or DX.
This PR includes type annotations for new and modified functions.

…stration to a runner

We're working to ensure the orchestration of microbatch batches doesn't block the main thread. This will require disentangling much of the logic that currently exists in run.py. As such, it made sense to "quickly" stub out a guide of what needs to be done.

The `MicrobatchBatchRunner` will be for running individual batches, whereas the `MicrobatchModelRunner` will handle the orchestration of the batches to be run for a given model.
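As a rough illustration of that split, here is a minimal sketch. The classes below are simplified stand-ins that only borrow the names from this PR; the `BatchResult` type, the constructor arguments, and the `run`/`execute` bodies are assumptions made for the example, not dbt's actual implementation.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class BatchResult:
    """Hypothetical result type used only by this sketch."""

    batch_id: int
    status: str


class MicrobatchBatchRunner:
    """Stand-in: responsible for running exactly one batch."""

    def __init__(self, batch_id: int) -> None:
        self.batch_id = batch_id

    def run(self) -> BatchResult:
        # A real batch runner would execute the batch's SQL here.
        return BatchResult(batch_id=self.batch_id, status="success")


class MicrobatchModelRunner:
    """Stand-in: decides which batches to run and collects their results."""

    def __init__(self, batch_ids: List[int]) -> None:
        self.batch_ids = batch_ids

    def execute(self) -> List[BatchResult]:
        # The model runner runs no SQL itself; it delegates each batch.
        return [MicrobatchBatchRunner(batch_id).run() for batch_id in self.batch_ids]


results = MicrobatchModelRunner([0, 1, 2]).execute()
```

The point of the shape is that all per-batch work lives in one class, so the orchestrating class can later hand batches to a thread pool without changing how a single batch is executed.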

…Runner` directly

Previously `handle_job_queue` treated `MicrobatchModelRunner` as a special case, and delegated to `handle_microbatch_model` to orchestrate the batches instead of delegating to the `MicrobatchModelRunner` directly. Now that the `MicrobatchModelRunner` will be handling batch orchestration, we can appropriately delegate to it, and remove the special casing.

The function won't work as is, but I felt it better to straight copy, commit, and then modify it to work in the runner context iteratively.

…nner`

codecov · 2025-02-24T20:35:33Z

Codecov Report

Attention: Patch coverage is 89.55224% with 21 lines in your changes missing coverage. Please review.

Project coverage is 88.86%. Comparing base (f7c4c3c) to head (ef461da).
Report is 1 commits behind head on main.

Additional details and impacted files

@@            Coverage Diff             @@
##             main   #11332      +/-   ##
==========================================
- Coverage   88.97%   88.86%   -0.12%     
==========================================
  Files         189      190       +1     
  Lines       24182    24197      +15     
==========================================
- Hits        21517    21503      -14     
- Misses       2665     2694      +29

Flag	Coverage Δ
integration	`86.07% <87.06%> (-0.22%)`	⬇️
unit	`62.59% <31.34%> (+0.03%)`	⬆️

Flags with carried forward coverage won't be shown.

Components	Coverage Δ
Unit Tests	`62.59% <31.34%> (+0.03%)`	⬆️
Integration Tests	`86.07% <87.06%> (-0.22%)`	⬇️

We don't need these functions in `MicrobatchModelRunner` because the versions of these methods inherited from `ModelRunner` will work for our needs. Of note, we can probably also remove the need for these functions in `MicrobatchBatchRunner` by renaming `print_batch_start_line` and `print_batch_result_line` to the method names that the `ModelRunner` methods call.
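The renaming idea can be sketched as follows. This is an illustration only: the hook names `print_start_line` and `print_result_line`, and the way `before_execute` and `after_execute` call them, are assumptions for the example rather than a description of dbt's `ModelRunner`.

```python
from typing import List

# Collects "printed" lines so the behaviour is easy to inspect.
log: List[str] = []


class ModelRunner:
    """Stand-in base class whose lifecycle methods call overridable hooks."""

    def before_execute(self) -> None:
        self.print_start_line()

    def after_execute(self) -> None:
        self.print_result_line()

    def print_start_line(self) -> None:
        log.append("model start")

    def print_result_line(self) -> None:
        log.append("model result")


class MicrobatchBatchRunner(ModelRunner):
    """Stand-in subclass: overrides only the hooks, not the lifecycle methods."""

    def print_start_line(self) -> None:
        log.append("batch start")

    def print_result_line(self) -> None:
        log.append("batch result")


runner = MicrobatchBatchRunner()
runner.before_execute()
runner.after_execute()
```

Because the subclass uses the hook names the base class already calls, it does not need its own `before_execute` or `after_execute`.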

…unner`

…unner`

The `MicrobatchModelRunner.compile` does nothing because `MicrobatchModelRunner` only orchestrates the batches of the model to run, and doesn't actually run the SQL of the model. Thus compilation is unnecessary in `MicrobatchModelRunner`.

Of note, implementing `on_skip` for `MicrobatchModelRunner` is unnecessary because the inherited `on_skip` suffices.

Previously `build_jinja_context_for_batch` was an instance-specific method of `MicrobatchBuilder`. An issue with this is that, with the now existent split of `MicrobatchModelRunner` and `MicrobatchBatchRunner`, we'd either need to pass the `MicrobatchBuilder` from the `MicrobatchModelRunner` to the `MicrobatchBatchRunner`, or instantiate a new `MicrobatchBuilder` in every `MicrobatchBatchRunner`. The issue with the former is that the passed-in `MicrobatchBuilder` wouldn't have the `compiled_code` on the `model`. We could instead do the latter, but instantiating a new `MicrobatchBuilder` for every batch seems unnecessary when the method can easily become a static method.
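A minimal sketch of the static-method approach is shown below. The parameters (`model_name`, `start`, `end`) and the returned dictionary keys are hypothetical and chosen only for illustration; the real method's signature and return value may differ.

```python
from datetime import datetime
from typing import Any, Dict


class MicrobatchBuilder:
    """Stand-in builder; only the static method matters for this sketch."""

    def __init__(self, model_name: str) -> None:
        self.model_name = model_name

    @staticmethod
    def build_jinja_context_for_batch(
        model_name: str, start: datetime, end: datetime
    ) -> Dict[str, Any]:
        # Everything the method needs is passed in, so no builder instance
        # (and no instance state) is required to call it.
        return {"model": model_name, "batch_start": start, "batch_end": end}


# A batch runner can call the method without constructing a builder.
context = MicrobatchBuilder.build_jinja_context_for_batch(
    "events", datetime(2025, 1, 1), datetime(2025, 1, 2)
)
```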

…rially

With the orchestration of batches moved onto a runner, the `MicrobatchModelRunner`, sending a `KeyboardInterrupt` to the process no longer stopped things. This is because we previously relied on closing all active adapter connections to stop currently executing tasks. However, the `MicrobatchModelRunner` doesn't have any active data warehouse connections itself, as adapter connections for batches are opened by the `MicrobatchBatchRunner`. Because of this, closing connections would cancel a running batch, but then the next batch would be submitted (and open a new connection). To stop this from happening, we needed a way to stop new batches from being submitted. To do this, we created a new `DbtThreadPool` which tracks whether or not it's been closed. If it's closed, then `_submit_batch` skips the batch entirely.

NOTE: This only works if the batches are running serially. It does not work if the batches are being run concurrently, as the orchestrator submits all of the batches immediately, so checking in `_submit_batch` is ineffective. We'll address this in the next commit.

…rently when interrupted

In the previous commit we made it such that microbatch model execution could be halted when batches were being executed serially. However, that work did not make microbatch model execution shut down when executing batches concurrently. This change fixes that issue.

Additionally, we deleted a test. Unfortunately, it is no longer possible to reliably test KeyboardInterrupts of microbatch models, as we don't have a way to consistently fire a keyboard interrupt at the right time in our testing environment. The test that existed would hang indefinitely, as a keyboard interrupt was being raised on a thread that was not the main thread (which is impossible in the real world, as keyboard interrupts are always fired from the main thread).
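The commit message does not spell out how the concurrent case is handled, so the following is only one possible approach, sketched under assumptions: because every batch is already submitted, the "has the pool been closed?" check moves from submission time to the moment a batch starts executing. A `threading.Event` stands in for the pool's closed flag, and a single worker thread is used purely to make the ordering deterministic.

```python
import threading
from multiprocessing.pool import ThreadPool

# Stand-in for "the thread pool has been closed" (an assumption of this sketch).
pool_closed = threading.Event()
# Helper events that pin down the ordering for the demonstration.
first_batch_started = threading.Event()
release_first_batch = threading.Event()


def run_batch(batch_id: int) -> str:
    # The check happens when the batch begins executing, not when it is
    # submitted, so batches that were queued up front can still be skipped.
    if pool_closed.is_set():
        return "skipped"
    if batch_id == 0:
        first_batch_started.set()
        release_first_batch.wait(timeout=5)
    return "ran"


pool = ThreadPool(1)
# All batches are submitted immediately, as in the concurrent case.
pending = [pool.apply_async(run_batch, (batch_id,)) for batch_id in range(3)]

first_batch_started.wait(timeout=5)
pool_closed.set()  # simulate the interrupt handling closing the pool
release_first_batch.set()

statuses = [result.get(timeout=5) for result in pending]
pool.close()
pool.join()
```

In this sketch the batch that was already running finishes, while the batches still waiting in the queue see the flag and return without doing any work.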

…execute`

The lines for tracking/printing at the end of `MicrobatchModelRunner.execute` are not necessary because the `after_execute` inherited from `ModelRunner` does both of these things. Thus the lines at the end of `MicrobatchModelRunner.execute` were duplicative.

The `MicrobatchBatchRunner` never uses `describe_node` as it instead uses `describe_batch`. Thus, `describe_node` serves no purpose.

…ementation

Removing this special logic is safe, and the test `TestMicrobatchModelSkipped` confirms this.

QMalcolm · 2025-03-03T19:27:32Z

core/dbt/graph/thread_pool.py

+from multiprocessing.pool import ThreadPool
+
+
+class DbtThreadPool(ThreadPool):


We created DbtThreadPool so that we can have visibility on whether .close() has been called on the pool. This class is now used instead of ThreadPool.
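The diff above is truncated at the class line. A minimal sketch of a pool with this kind of visibility might look like the following; the `closed` attribute and the `is_closed` method are assumed names for illustration and may not match what the PR actually adds.

```python
from multiprocessing.pool import ThreadPool


class DbtThreadPool(ThreadPool):
    """Sketch: a ThreadPool that records whether close() has been called."""

    def __init__(self, *args, **kwargs) -> None:
        super().__init__(*args, **kwargs)
        self.closed = False  # assumed attribute name

    def close(self) -> None:
        # Record the close before delegating to the normal ThreadPool behaviour.
        self.closed = True
        super().close()

    def is_closed(self) -> bool:  # assumed method name
        return self.closed


pool = DbtThreadPool(2)
was_closed_before = pool.is_closed()
pool.close()
pool.join()
was_closed_after = pool.is_closed()
```

Code that holds a reference to the pool can then consult `is_closed()` before submitting or starting more work.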

core/dbt/task/run.py

…icrobatchModelRunner`

github-actions · 2025-03-03T21:21:58Z

The backport to 1.9.latest failed:

The process '/usr/bin/git' failed with exit code 1

To backport manually, run these commands in your terminal:

# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add .worktrees/backport-1.9.latest 1.9.latest
# Navigate to the new working tree
cd .worktrees/backport-1.9.latest
# Create a new branch
git switch --create backport-11332-to-1.9.latest
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 94b6ae13b3c9bf1ae231d0bdc4b81c9d8cf712c0
# Push it to GitHub
git push --set-upstream origin backport-11332-to-1.9.latest
# Go back to the original working tree
cd ../..
# Delete the working tree
git worktree remove .worktrees/backport-1.9.latest

Then, create a pull request where the base branch is 1.9.latest and the compare/head branch is backport-11332-to-1.9.latest.

…ead (#11332)

* Push orchestration of batches previously in the `RunTask` into `MicrobatchModelRunner`
* Split `MicrobatchModelRunner` into two separate runners. `MicrobatchModelRunner` is now an orchestrator of `MicrobatchBatchRunner`s, the latter being what handles actual batch execution
* Introduce new `DbtThreadPool` that knows if it's been closed
* Enable `MicrobatchModelRunner` to shut down gracefully when it detects the thread pool has been closed

…ead (#11332) (#11349)

* Push orchestration of batches previously in the `RunTask` into `MicrobatchModelRunner`
* Split `MicrobatchModelRunner` into two separate runners. `MicrobatchModelRunner` is now an orchestrator of `MicrobatchBatchRunner`s, the latter being what handles actual batch execution
* Introduce new `DbtThreadPool` that knows if it's been closed
* Enable `MicrobatchModelRunner` to shut down gracefully when it detects the thread pool has been closed

Co-authored-by: Michelle Ark <[email protected]>

QMalcolm added 14 commits February 24, 2025 14:40

Stub out new MicrobatchModelRunner and MicrobatchBatchRunner.

90d53c2

The `MicrobatchBatchRunner` will be for running individual batches, whereas the `MicrobatchModelRunner` will handle the orchestration of the batches to be run for a given model.

Straight copy handle_microbatch_model into MicrobatchModelRunner

85b4f23

The function won't work as is, but I felt it better to straight copy, commit, and then modify it to work in the runner context iteratively.

Convert handle_microbatch_model into runner execute type method

d23095d

Begin getting batches in MicrobatchModelRunner

5279140

Begin getting _has_relation in execute method of `MicrobatchModelRu…

e45c2b1

…nner`

Begin initializing empty RunResult for MicrobatchModelRunner

c4d1b1c

Make RunTask and ThreadPool available in MicrobatchModelRunner

374a9bd

Make merge_batch_results to MicrobatchModelRunner

a2331df

Begin instantiating MicrobatchBatchRunner during _submit_batch

ce9333c

Move should_run_in_parallel to MicrobatchBatchRunner

8397264

Move describe_batch to MicrobatchBatchRunner

046e08a

Move batch print eventing methods to MicrobatchBatchRunner

86a8439

QMalcolm added the Skip Changelog Skips GHA to check for changelog file label Feb 24, 2025

cla-bot bot added the cla:yes label Feb 24, 2025

dbt-labs deleted a comment from github-actions bot Feb 24, 2025

QMalcolm added 11 commits February 24, 2025 15:45

Move describe_node to MicrobatchBatchRunner and `MicrobatchModelR…

4bc33fb

…unner`

Implement on_skip for MicrobatchBatchRunner

1a8fbf1

Of note, implementing `on_skip` for `MicrobatchModelRunner` is unnecessary because the inherited `on_skip` suffices.

Move batch execution logic from old runner into MicrobatchBatchRunner

590fbb0

Ensure first batch is run as full refresh when relevant

5c5558e

Update BuildTask to use MicrobatchModelRunner directly

e60ceb6

Fix mocking of should_run_in_parallel in microbatch tests

ce0d4c1

Fix unit tests for build_jinja_context_for_batch

7ec1ce5

Fix unit tests for test_should_run_in_parallel

35bc7ce

QMalcolm force-pushed the qmalcolm--11243-stop-microbatch-from-blocking-main-thread branch from 3e857dd to 35bc7ce Compare February 26, 2025 19:28

QMalcolm added 9 commits March 2, 2025 20:34

Remove describe_node method from MicrobatchBatchRunner

1ba67a9

The `MicrobatchBatchRunner` never uses `describe_node` as it instead uses `describe_batch`. Thus, `describe_node` serves no purpose.

Rename print line methods in MicrobatchBatchRunner to simplify impl…

20a1446

…ementation

Abstract initialization of MicrobatchBuilder into utility method

c81010e

Remove special skip logic in MicrobatchModelRunner.execute

c8b78f1

Removing this special logic is safe, and the test `TestMicrobatchModelSkipped` confirms this.

Remove TODO statement which is no longer relevant

303cdf7

Add changie doc

486f351

QMalcolm marked this pull request as ready for review March 3, 2025 19:19

QMalcolm requested a review from a team as a code owner March 3, 2025 19:19

QMalcolm removed the Skip Changelog Skips GHA to check for changelog file label Mar 3, 2025

QMalcolm commented Mar 3, 2025

MichelleArk reviewed Mar 3, 2025

core/dbt/task/run.py (outdated, resolved)

MichelleArk approved these changes Mar 3, 2025

QMalcolm added 2 commits March 3, 2025 14:26

Use setters for parent_task and pool of MicrobatchModelRunner

a32a7ae

Add comments around the necessity of _parent_task and _pool on `M…

ef461da

…icrobatchModelRunner`

QMalcolm merged commit 94b6ae1 into main Mar 3, 2025
55 of 57 checks passed

QMalcolm deleted the qmalcolm--11243-stop-microbatch-from-blocking-main-thread branch March 3, 2025 21:21

QMalcolm added the backport 1.9.latest label Mar 3, 2025

QMalcolm mentioned this pull request Mar 3, 2025

Backport 11332 to 1.9.latest #11349

Merged
