Add pipeline parallel support to TransformersModel #12832

Merged: 38 commits, Mar 25, 2025

Commits
04a93de  Add pipeline parallel support to `TransformersModel` (hmellor, Feb 6, 2025)
099b3e0  Simplify weight loading (hmellor, Feb 6, 2025)
8b1ea1d  Only allocate tensors for current pipeline stage (hmellor, Feb 6, 2025)
ec626e8  Don't set buffers to empty right after initialising them (hmellor, Feb 6, 2025)
8eafc6d  Update to use `_pp_plan` and `_tp_plan` (hmellor, Feb 7, 2025)
87c395c  Respond to comment (hmellor, Feb 7, 2025)
3498758  Add docstring to `init_buffers` (hmellor, Feb 7, 2025)
e652d50  Update some comments (hmellor, Feb 7, 2025)
e756f27  Merge branch 'main' into pipeline-parallel (hmellor, Feb 12, 2025)
95cdfcf  Rename inner tensor parallel method (hmellor, Feb 13, 2025)
14e84e8  Add test (hmellor, Feb 13, 2025)
ce42695  Merge branch 'main' into pipeline-parallel (hmellor, Feb 13, 2025)
a591f44  Use fallback model for tests instead (hmellor, Feb 14, 2025)
fd0445a  Reinstate weight renaming in `load_weights` (hmellor, Feb 14, 2025)
a1ec6a2  Merge branch 'main' into pipeline-parallel (hmellor, Feb 14, 2025)
3b56faf  Add comment to if block in model loading (hmellor, Feb 14, 2025)
58257b7  Add better error message when custom model is used without `--trust-r…` (hmellor, Feb 14, 2025)
bb11e9d  fix transformers dynamic module resolve with mp (Isotr0py, Feb 17, 2025)
0cac219  better annotation (Isotr0py, Feb 17, 2025)
694ce2b  cleanup (Isotr0py, Feb 17, 2025)
9263a11  add comment back (Isotr0py, Feb 17, 2025)
6e446a5  Merge branch 'main' into pipeline-parallel (hmellor, Feb 17, 2025)
9c27108  Merge branch 'fix-transformers-tp' into pipeline-parallel (hmellor, Feb 17, 2025)
11b1626  Add comment to test so it doesn't get removed (hmellor, Feb 17, 2025)
ae3f42d  Update Transformers pin (hmellor, Feb 17, 2025)
dae0554  Merge branch 'main' into pipeline-parallel (hmellor, Feb 18, 2025)
e620939  Fix `create_attention_instances` args (hmellor, Feb 18, 2025)
766d489  Merge branch 'main' into pipeline-parallel (hmellor, Feb 18, 2025)
46db9b5  Merge branch 'main' into pipeline-parallel (hmellor, Feb 19, 2025)
b0c20eb  Merge branch 'main' into pipeline-parallel (hmellor, Feb 20, 2025)
4f414be  Merge branch 'main' into pipeline-parallel (hmellor, Feb 21, 2025)
9f60911  Merge branch 'main' into pipeline-parallel (hmellor, Feb 24, 2025)
3e476ee  Merge branch 'main' into pipeline-parallel (hmellor, Feb 26, 2025)
54f9928  Merge branch 'main' into pipeline-parallel (hmellor, Mar 19, 2025)
3ef3fe2  Stop passing `intermediate_tensors` to the HF model (hmellor, Mar 21, 2025)
cadfbd2  Revert requirements changes (hmellor, Mar 24, 2025)
b84a845  Disable test (hmellor, Mar 24, 2025)
0216271  Update docs (hmellor, Mar 24, 2025)
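Several of the commits above ("Add pipeline parallel support to `TransformersModel`", "Only allocate tensors for current pipeline stage") revolve around the core idea of pipeline parallelism: each rank owns only a contiguous slice of the model's layers. A minimal illustrative sketch of such a split, not the PR's actual code (`stage_layer_range` is a hypothetical helper):

```python
def stage_layer_range(num_layers: int, pp_rank: int, pp_size: int) -> range:
    """Contiguous slice of decoder layers owned by one pipeline stage.

    Layers are split as evenly as possible; when the split is uneven,
    earlier stages take one extra layer each.
    """
    base, rem = divmod(num_layers, pp_size)
    start = pp_rank * base + min(pp_rank, rem)
    end = start + base + (1 if pp_rank < rem else 0)
    return range(start, end)

# Example: 10 layers over 3 stages -> stages own [0..4), [4..7), [7..10)
for rank in range(3):
    print(rank, list(stage_layer_range(10, rank, 3)))
```

A stage would then allocate parameters and buffers only for the layers in its own range, which is what "Only allocate tensors for current pipeline stage" refers to.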
2 changes: 1 addition & 1 deletion docs/source/models/supported_models.md
@@ -73,7 +73,7 @@ The Transformers fallback explicitly supports the following features:

- <project:#quantization-index> (except GGUF)
- <project:#lora-adapter>
- - <project:#distributed-serving> (pipeline parallel coming soon <gh-pr:12832>!)
+ - <project:#distributed-serving> (requires `transformers>=4.49.0`)

#### Remote code

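The docs change above says distributed serving works through the Transformers fallback once `transformers>=4.49.0` is installed. A possible invocation under that assumption (the model name is only an example; flag names come from vLLM's CLI, so check `vllm serve --help` for your version):

```shell
# Illustrative: serve a model through the Transformers model implementation
# across two pipeline stages (assumes transformers>=4.49.0 is installed).
vllm serve meta-llama/Llama-3.2-1B-Instruct \
    --model-impl transformers \
    --pipeline-parallel-size 2
```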
2 changes: 1 addition & 1 deletion requirements/test.in
@@ -30,7 +30,7 @@ matplotlib # required for qwen-vl test
mistral_common[opencv] >= 1.5.4 # required for pixtral test
datamodel_code_generator # required for minicpm3 test
lm-eval[api]==0.4.4 # required for model evaluation test
transformers==4.48.2
transformers==4.48.2
# quantization
bitsandbytes>=0.45.3
buildkite-test-collector==0.1.9
3 changes: 3 additions & 0 deletions tests/distributed/test_pipeline_parallel.py
@@ -175,6 +175,8 @@ def iter_params(self, model_id: str):
"inceptionai/jais-13b-chat": PPTestSettings.fast(),
"ai21labs/Jamba-tiny-dev": PPTestSettings.fast(),
"meta-llama/Llama-3.2-1B-Instruct": PPTestSettings.detailed(),
+ # Tests TransformersModel
+ "ArthurZ/Ilama-3.2-1B": PPTestSettings.fast(),
"openbmb/MiniCPM-2B-sft-bf16": PPTestSettings.fast(),
"openbmb/MiniCPM3-4B": PPTestSettings.fast(),
# Uses Llama
@@ -243,6 +245,7 @@ def iter_params(self, model_id: str):
# [LANGUAGE GENERATION]
"microsoft/Phi-3.5-MoE-instruct",
"meta-llama/Llama-3.2-1B-Instruct",
+ # "ArthurZ/Ilama-3.2-1B", NOTE: Uncomment after #13905
"ibm/PowerLM-3b",
# [LANGUAGE EMBEDDING]
"intfloat/e5-mistral-7b-instruct",