[LLVMGPUVectorDistribute] VectorDistribution support for unaligned shapes #20144

Draft · wants to merge 11 commits into main
Conversation

@Groverkss (Contributor) commented on Mar 3, 2025

This PR adds support for performing statically tiled codegen on dynamic shapes in the vector distribute pipeline. In short, it honors lowering configs on dynamic shapes by using masking.
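As a rough illustration of the masking idea (a minimal sketch, not IR taken from this PR; the function name, the 64x128 tile size, and the tensor type are made up for the example), a statically sized vector tile can be read from a dynamically sized tensor by masking off the rows past the dynamic extent instead of requiring the extent to be a multiple of the tile size:

```mlir
// Sketch: read one static 64x128 tile from a tensor whose leading dimension
// is dynamic, masking the rows that fall past the dynamic extent.
func.func @masked_tile_read(%src: tensor<?x128xf32>) -> vector<64x128xf32> {
  %c0 = arith.constant 0 : index
  %c128 = arith.constant 128 : index
  %pad = arith.constant 0.0 : f32
  // Number of valid rows in this (possibly partial) tile.
  %rows = tensor.dim %src, %c0 : tensor<?x128xf32>
  // Mask off rows beyond the dynamic extent; the static columns are fully valid.
  %mask = vector.create_mask %rows, %c128 : vector<64x128xi1>
  // Masked read still produces a full static 64x128 vector; masked-off lanes
  // take the padding value.
  %tile = vector.transfer_read %src[%c0, %c0], %pad, %mask
            {in_bounds = [true, true]}
            : tensor<?x128xf32>, vector<64x128xf32>
  return %tile : vector<64x128xf32>
}
```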

Some side-effect changes:

  • Currently, the block dynamic dimensions pass changes the dimensionality of the generics without projecting the lowering config that was provided higher up in the pipeline. Moreover, the need for that pass is reduced with the changes here, since we can now tile the dynamic dimension directly -- unless I'm missing something.

This builds on the following PRs -- hence keeping it in draft:

Future work:

Original author: @manupak

manupak and others added 11 commits on February 26, 2025:

* Also, keep it disabled by default until lowering config projection is fixed.
* Enable masking in generic vectorization.
* Add two runs of resolve type to fold tensor.dim in rank-reducing type.
* Masked compute.
* Masked cases.
* Only enable masking in vectorization in vector distribute.
* Add code not to run on ops where the lowering config is set.

Signed-off-by: Manupa Karunaratne <[email protected]>
@AmosLewis (Contributor)

Just tested the llama3 benchmark at input sequence lengths 128/2048 after locally rebasing this PR on 3.3.0rc20250310. It improves performance by around 1 ms:
128: 37.2 ms -> 36.4 ms
2048: 174 ms -> 173 ms

@Groverkss (Contributor, Author) commented Mar 10, 2025

> Just tested the llama3 benchmark at input sequence lengths 128/2048 after locally rebasing this PR on 3.3.0rc20250310. It improves performance by around 1 ms: 128: 37.2 ms -> 36.4 ms. 2048: 174 ms -> 173 ms.

This should have no effect on any benchmarks...
