Hi, I believe I found a potential bug in the _attn_fwd_tma function. Specifically, it concerns how the offset_y variable is computed.
Here is the relevant part of the code:
    start_m = tl.program_id(0)
    off_hz = tl.program_id(1)
    off_z = off_hz // H
    off_h = off_hz % H
    offset_y = off_z + off_h * N_CTX
The calculation of offset_y is a bit confusing. Based on the current implementation, it is computed as:
offset_y = off_z + off_h * N_CTX
However, I believe it should be:
offset_y = off_hz * N_CTX
Could you clarify if this is a bug or explain the reasoning behind the current implementation? Thanks for your help!
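To make the concern concrete, here is a minimal sketch in plain Python (no Triton required) that evaluates both formulas for every value of `off_hz`. It assumes the Q/K/V tensors are viewed as a flattened 2D array of shape `(Z * H * N_CTX, HEAD_DIM)`, so that each (batch, head) pair should own a distinct block of `N_CTX` rows; that layout is my assumption about the kernel, and the helper names (`row_offsets`, `current_formula`, `proposed_formula`) are hypothetical.

```python
# Plain-Python sketch comparing the two offset formulas.
# Assumption: tensors are flattened to (Z * H * N_CTX, HEAD_DIM), so each
# (batch, head) pair should start at a distinct multiple of N_CTX.

def current_formula(off_z, off_h, off_hz, n_ctx):
    # Formula as it appears in the kernel today.
    return off_z + off_h * n_ctx

def proposed_formula(off_z, off_h, off_hz, n_ctx):
    # Formula suggested in this issue.
    return off_hz * n_ctx

def row_offsets(Z, H, n_ctx, formula):
    """Evaluate `formula` for every program id along axis 1 (off_hz)."""
    offsets = []
    for off_hz in range(Z * H):
        off_z = off_hz // H
        off_h = off_hz % H
        offsets.append(formula(off_z, off_h, off_hz, n_ctx))
    return offsets

Z, H, N_CTX = 2, 3, 4  # small illustrative sizes
cur = row_offsets(Z, H, N_CTX, current_formula)
prop = row_offsets(Z, H, N_CTX, proposed_formula)

print("current: ", cur)   # [0, 4, 8, 1, 5, 9]
print("proposed:", prop)  # [0, 4, 8, 12, 16, 20]
```

Under that assumption, the current formula places the second batch only one row after the first (offsets 1, 5, 9), so its `N_CTX`-row blocks overlap those of the first batch, while `off_hz * N_CTX` gives each (batch, head) pair its own block. Since `off_hz = off_z * H + off_h`, the proposed formula is the same as `off_z * (N_CTX * H) + off_h * N_CTX`.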
Environment details: the latest version of Triton