Thank you for your response. This issue has been resolved.
However, I noticed that the implementation of NKI matrix multiplication is still slower than torch.matmul(A.T, B).
The related discussion link is as follows.
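For context, here is a rough host-side sketch of how that timing comparison might be reproduced. The import path, kernel entry-point name, and matrix sizes are assumptions (not taken from the report), and the first baremetal call may include compilation, so this is not a rigorous Neuron benchmark:

```python
# Rough host-side sketch of the timing comparison described above; names,
# sizes, and the import path are assumptions, not taken from the report.
# The first baremetal call may include compilation time, so this is not a
# rigorous Neuron benchmark (no warm-up, no device-side timing).
import time

import numpy as np
import torch

from matmul import matmul as nki_matmul  # assumed entry point of contributed/matmul.py

K, M, N = 1024, 1024, 1024
A = np.random.rand(K, M).astype(np.float32)  # A is K x M, so A.T is M x K
B = np.random.rand(K, N).astype(np.float32)

t0 = time.perf_counter()
Z_nki = nki_matmul(A, B)  # NKI kernel computing A.T @ B
t1 = time.perf_counter()

t2 = time.perf_counter()
Z_ref = torch.matmul(torch.from_numpy(A).T, torch.from_numpy(B))
t3 = time.perf_counter()

print(f"NKI kernel:   {t1 - t0:.4f} s")
print(f"torch.matmul: {t3 - t2:.4f} s")
```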
Describe the bug
When I run the code in contributed/matmul.py, I encounter the following error:
TypeError: Cannot update immutable parameter Z_DRAM
This is due to using Z as a parameter to matmul, which is decorated with nki.baremetal().
Is there any way to fix this error?
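One possible workaround, sketched below under the assumption that newer NKI releases expect kernels to allocate their output in HBM (via nl.ndarray with buffer=nl.shared_hbm) and return it, rather than write into a DRAM tensor passed in as a parameter; the tiled compute body from contributed/matmul.py is elided:

```python
# Sketch only: assumes the newer NKI convention where the kernel allocates
# and returns its output instead of storing into a passed-in (immutable)
# DRAM parameter. The tiled loads/multiplies from contributed/matmul.py
# are unchanged and elided here.
import neuronxcc.nki as nki
import neuronxcc.nki.language as nl

@nki.baremetal()
def matmul(A_DRAM, B_DRAM):
    # Output shape; indices assume the kernel computes A.T @ B,
    # as suggested by the discussion above.
    M = A_DRAM.shape[1]
    N = B_DRAM.shape[1]

    # Allocate Z inside the kernel instead of receiving Z_DRAM as a parameter.
    Z_DRAM = nl.ndarray((M, N), dtype=A_DRAM.dtype, buffer=nl.shared_hbm)

    # ... same tiled compute as the original kernel, still ending with
    # nl.store(Z_DRAM[m_start:m_end, n_start:n_end], value=Z_SBUF[m1]) ...

    return Z_DRAM
```

With this structure the kernel no longer writes to one of its parameters, which should avoid the immutable-parameter error; the call site would presumably change from matmul(A, B, Z) to Z = matmul(A, B).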
Expected Behavior
The optimized matrix multiplication kernel works correctly.
Current Behavior
Cannot update immutable parameter Z_DRAM
when invoking
nl.store(Z_DRAM[m_start:m_end, n_start:n_end], value=Z_SBUF[m1])
Reproduction Steps
python contributed/matmul.py
Regression Issue
Possible Solution
No response
Additional Information/Context
No response
neuronx-cc version used
NeuronX Compiler version 2.16.372.0+4a9b2326
Framework(s) and their versions used (JAX, PyTorch, etc.)
Python version 3.10.12, HWM version 2.16.0.372+4a9b2326, NumPy version 1.25.2