Activity
add correct MFU - fwd only
remove MFU as it's likely wrong
remove cast_dtype_for_dot as it slows down a lot; fixes for a100; add…
fp32 precision pass all tests except gpt test
not perfect match but close