[LLVMGPUVectorDistribute] VectorDistribution support for unaligned shapes #20144
base: main
Also, keeping it disabled by default until lowering config projection is fixed.
* enable masking in generic vectorization
* add two runs of resolve type to fold tensor.dim in rank reducing type.
Signed-off-by: Manupa Karunaratne <[email protected]>
masked compute. Signed-off-by: Manupa Karunaratne <[email protected]>
masked cases. Signed-off-by: Manupa Karunaratne <[email protected]>
* only enable masking in vectorization in vector distribute Signed-off-by: Manupa Karunaratne <[email protected]>
PR. Signed-off-by: Manupa Karunaratne <[email protected]>
and add code not to run on ops where lowering config is set. Signed-off-by: Manupa Karunaratne <[email protected]>
Signed-off-by: Manupa Karunaratne <[email protected]>
I just tested the llama3 benchmark with input lengths 128/2048 by locally rebasing this PR on 3.3.0rc20250310. Performance improved by around 1 ms.
This should have no effect on any benchmarks...
This PR adds support for statically tiled codegen on dynamic shapes in the vector distribute pipeline.
In effect, the pipeline can now honor lowering configs on dynamic shapes by using masking.
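
For readers unfamiliar with the idea, here is a rough conceptual sketch of what "static tiles plus masking" means on a dynamically sized buffer. It is plain Python and is not the IREE implementation; the function name `masked_tiled_scale` and the parameters `tile_size` and `pad` are hypothetical and exist only for illustration. The tile size stays fixed (as a lowering config would dictate), and a per-lane mask guards the reads and writes of the last, partial tile.

```python
def masked_tiled_scale(src, factor, tile_size=4, pad=0):
    """Scale every element of `src` using fixed-size tiles and a lane mask.

    Hypothetical illustration only: `tile_size` stands in for a static
    tile size, and len(src) stands in for a dynamic (unaligned) extent.
    """
    n = len(src)                      # dynamic extent, known only at run time
    dst = [None] * n
    num_tiles = (n + tile_size - 1) // tile_size  # ceil-divide by the static tile

    for t in range(num_tiles):
        offset = t * tile_size
        # A lane is active only if it falls inside the dynamic bound.
        mask = [offset + lane < n for lane in range(tile_size)]
        # Masked read: inactive lanes receive the padding value.
        vec = [src[offset + lane] if mask[lane] else pad
               for lane in range(tile_size)]
        # The compute always runs on a full, statically sized vector.
        out = [x * factor for x in vec]
        # Masked write: only active lanes are stored back.
        for lane in range(tile_size):
            if mask[lane]:
                dst[offset + lane] = out[lane]
    return dst
```

With `tile_size=4` and six input elements, the second tile has two active lanes and two masked-off lanes, so no out-of-bounds access happens even though the compute shape never changes.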
Some side-effect changes:
This builds on the following PRs -- hence putting to draft:
future work:
Original Author: @manupak