Brgemm register tiling for bf16 type #1005

Merged: 17 commits, Feb 19, 2025

Conversation

@arun-thmn arun-thmn commented Feb 3, 2025

This PR extends the brgemm register tiling pass to support the bf16 type. The changes:

  1. Template the existing pass so that it runs on linalg.batch_reduce_matmul for fp32 and on linalg.generic for vnni opt bf16.
  2. Add test cases for the bf16 type.
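
For readers unfamiliar with the operation being tiled, the following is a minimal sketch of a batch-reduce matmul split into small tiles. It is plain Python for illustration only: the function name `brgemm_tiled` and the tile sizes are hypothetical, and the actual pass operates on MLIR ops rather than on Python lists.

```python
def brgemm_tiled(A, B, C, tile_m=2, tile_n=2, tile_k=1):
    """Accumulate C[m][n] += sum over b, k of A[b][m][k] * B[b][k][n], tile by tile.

    A is [batch][M][K], B is [batch][K][N], C is [M][N] and is updated in place.
    The tile sizes are illustrative assumptions and must divide M, N and K.
    """
    batch, M, K, N = len(A), len(A[0]), len(A[0][0]), len(B[0][0])
    assert M % tile_m == 0 and N % tile_n == 0 and K % tile_k == 0
    for m0 in range(0, M, tile_m):              # parallel loop over row tiles
        for n0 in range(0, N, tile_n):          # parallel loop over column tiles
            for b in range(batch):              # reduction over the batch
                for k0 in range(0, K, tile_k):  # reduction over K, tiled
                    # Inner tile: on real hardware this is the part that is
                    # meant to fit in registers.
                    for m in range(m0, m0 + tile_m):
                        for n in range(n0, n0 + tile_n):
                            for k in range(k0, k0 + tile_k):
                                C[m][n] += A[b][m][k] * B[b][k][n]
    return C


A = [[[1, 2], [3, 4]], [[5, 6], [7, 8]]]
B = [[[1, 0], [0, 1]], [[2, 0], [0, 2]]]
C = [[0, 0], [0, 0]]
brgemm_tiled(A, B, C)
print(C)  # [[11, 14], [17, 20]]
```

The loop order shown here (parallel tiles outside, reductions inside) is one possible choice, not necessarily the order the pass produces.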

@arun-thmn arun-thmn added the benchmark-full Benchmark all targets label Feb 3, 2025
@arun-thmn arun-thmn marked this pull request as ready for review February 3, 2025 03:38
@arun-thmn
Contributor Author

@rengolin Please review this PR for bf16 register tile support. I have rewritten the tiling pass with new logic (templates and more checks) to tile both fp32 and bf16 (vnni). If you have time, please review it as a new pass: I wrote the existing fp32 tiling right after joining Intel, with less understanding of the concepts.

@arun-thmn arun-thmn added benchmark-full Benchmark all targets and removed benchmark-full Benchmark all targets labels Feb 3, 2025
Contributor

@rengolin rengolin left a comment

I think we need to iterate more on the upstream story later, but if this shows some results, then I'm happy to merge this as soon as @adam-smnk and @rolfmorel are happy with it.

@arun-thmn
Contributor Author

arun-thmn commented Feb 19, 2025

@adam-smnk and @rolfmorel, I have updated the PR. We now use the indexing maps and iterator types to choose the tile sizes and the tile interchange options. Please have a look.
Also, this approach can now support the tensor type as well. I plan to do that in a new PR, since I need to add a few more conditions and alter the existing validations.
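
To illustrate the idea of deriving tile sizes and an interchange from maps and iterator types, here is a hypothetical sketch in plain Python. It is not the logic of the pass in this PR: the function `choose_tiling`, the representation of maps as tuples of loop indices, the register tile values, and the convention that the first reduction loop is the batch are all assumptions made for the example.

```python
def choose_tiling(iterator_types, map_a, map_b, map_c, reg_tiles):
    """Return (tile_sizes, interchange) for a batch-reduce-matmul-like op.

    iterator_types: "parallel" or "reduction" for each loop dimension.
    map_a, map_b, map_c: tuples of the loop indices each operand uses.
    reg_tiles: dict with hypothetical tile sizes for "m", "n" and "k".
    """
    tile_sizes = [0] * len(iterator_types)
    parallel, reduction = [], []
    for dim, kind in enumerate(iterator_types):
        if kind == "parallel":
            # A parallel loop shared by A and C is M; shared by B and C is N.
            if dim in map_a and dim in map_c:
                tile_sizes[dim] = reg_tiles["m"]
            elif dim in map_b and dim in map_c:
                tile_sizes[dim] = reg_tiles["n"]
            else:
                raise ValueError(f"unsupported parallel loop {dim}")
            parallel.append(dim)
        elif kind == "reduction":
            # A reduction loop must index both inputs and not the output.
            if dim not in map_a or dim not in map_b or dim in map_c:
                raise ValueError(f"unsupported reduction loop {dim}")
            reduction.append(dim)
        else:
            raise ValueError(f"unknown iterator type {kind!r}")
    if len(reduction) != 2:
        raise ValueError("expected exactly a batch and a K reduction loop")
    # Assumption: the first reduction loop is the batch, the second is K.
    batch_dim, k_dim = reduction
    tile_sizes[batch_dim] = 1
    tile_sizes[k_dim] = reg_tiles["k"]
    # Place the parallel loops outermost and the reductions innermost.
    interchange = parallel + reduction
    return tile_sizes, interchange


# Loops are (batch, m, n, k); A uses (batch, m, k), B uses (batch, k, n), C uses (m, n).
tiles, order = choose_tiling(
    ["reduction", "parallel", "parallel", "reduction"],
    map_a=(0, 1, 3), map_b=(0, 3, 2), map_c=(1, 2),
    reg_tiles={"m": 8, "n": 32, "k": 1},
)
print(tiles, order)  # [1, 8, 32, 1] [1, 2, 0, 3]
```

Reading the roles of the loops from the maps, rather than hard-coding positions, is what lets one routine handle ops with different loop orders.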

Contributor

@adam-smnk adam-smnk left a comment

Looks pretty good. Just a few minor comments.

Contributor

@adam-smnk adam-smnk left a comment

Looks good 👍
Thanks for bearing with all the iterations

@arun-thmn
Contributor Author

arun-thmn commented Feb 19, 2025

Thanks @adam-smnk and @rolfmorel, I learned a lot from your comments. I never did this kind of polishing during my college days.

@arun-thmn arun-thmn merged commit f8d8a16 into libxsmm:main Feb 19, 2025
14 checks passed
arun-thmn added a commit that referenced this pull request Feb 20, 2025
This patch extends PR #1005 to add register tiling support for the `tensor` type. It also adds new unit test cases.
Labels
benchmark-full Benchmark all targets

4 participants