Are you willing to contribute it (Yes/No): Maybe :)
Describe the feature and the current behavior/state.
We are looking into using the WebGPU backend for inference and see a decent improvement (~5-10%) over WebGL for our models, but it is much smaller than we expected.
One potential way to speed up inference would be to use the fp16 data type instead of fp32 for tensors. The WebGL backend already supports fp16, which we use. WebGPU also supports fp16, at least on Chrome desktop (https://chromestatus.com/feature/5180552617656320).
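As a sketch of how a backend might detect fp16 support, assuming the standard WebGPU optional feature name `shader-f16` (the feature tracked in the Chrome status link above) — this is illustrative, not the tfjs implementation:

```javascript
// Sketch: request a WebGPU device with fp16 shader support when available.
// 'shader-f16' is the optional WebGPU feature that enables f16 types in WGSL.
async function requestDeviceWithF16() {
  const adapter = await globalThis.navigator?.gpu?.requestAdapter();
  if (!adapter) return null; // WebGPU not available in this environment
  const hasF16 = adapter.features.has('shader-f16');
  // Only ask for the feature if the adapter actually offers it; otherwise
  // requestDevice() would reject.
  return adapter.requestDevice({
    requiredFeatures: hasF16 ? ['shader-f16'] : [],
  });
}
```

A backend could fall back to fp32 kernels whenever the returned device lacks the feature.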
Ideally we would like to use F32_F16 precision as defined in TFLite to get the best tradeoff between precision loss and performance.
Will this change the current API? How?
An environment flag to set precision (similar to the WebGL backend's flags) would be ideal for ease of integration.
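To illustrate the flag pattern, a minimal self-contained sketch of an environment registry like `tf.env()`. The flag name `WEBGPU_FORCE_F16` is hypothetical; the existing WebGL analog in tfjs is `WEBGL_FORCE_F16_TEXTURES`:

```javascript
// Minimal sketch of a tf.env()-style flag registry gating tensor precision.
class Environment {
  constructor() { this.flags = {}; }
  set(name, value) { this.flags[name] = value; }
  get(name) { return this.flags[name]; }
}

const env = new Environment();
// Hypothetical WebGPU flag, mirroring the real WEBGL_FORCE_F16_TEXTURES.
env.set('WEBGPU_FORCE_F16', true);

// A kernel could then choose its storage type per-flag:
const dtype = env.get('WEBGPU_FORCE_F16') ? 'float16' : 'float32';
```

Consumers would set the flag once at startup, before the backend initializes, just as with the WebGL flags today.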
Who will benefit with this feature?
All consumers of the WebGPU backend.