WASM32-WASI NN support #2520
Comments
Yes, I think it'd be useful. So basically it would mean implementing a new backend that uses the wasi-nn APIs and compiling the code to wasm32? For reference, wasi-nn: https://github.com/WebAssembly/wasi-nn/tree/main
Yes, you should be able to add the target wasm32-wasip1 or wasm32-wasip2 and get a wasm binary that you can run on, for instance, WasmEdge.
What are the benefits over building the ndarray backend or the wgpu backend for wasm? Just trying to get the benefits of targeting
Great question. wasm32-unknown-unknown is mainly used inside browsers, where access to the file system is very restricted. WASI provides standardised interfaces, defined in WIT. Some are for IO, key-value storage, and cryptography, and one is especially interesting for AI: wasi-nn (neural networks). This interface gives access to ML accelerators and runtimes, e.g. OpenVINO, CUDA, ..., which can make the code run substantially faster. WASI is made for device and cloud deployments, whereas wasm32-unknown-unknown is mainly for browsers. Hopefully that makes sense now. You might want to support wasip2 directly, although wasi-nn is still pending.
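To make the target distinction above concrete, here is a minimal sketch of how a crate could tell at compile time which kind of target it is being built for. It relies only on the standard `target_arch` and `target_os` cfg values; the `WasmFlavour` enum and `wasm_flavour` function are hypothetical names used for illustration, not part of Burn or wasi-nn.

```rust
/// Coarse classification of the target this crate was compiled for.
/// (Hypothetical type, for illustration only.)
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum WasmFlavour {
    /// wasm32 with a WASI system interface (e.g. wasm32-wasip1, wasm32-wasip2).
    Wasi,
    /// wasm32 without an OS interface (wasm32-unknown-unknown, typically browsers).
    BrowserLike,
    /// Any non-WebAssembly target, such as a native desktop or server build.
    NotWasm,
}

/// Reports the flavour using compile-time cfg checks.
pub fn wasm_flavour() -> WasmFlavour {
    if cfg!(all(target_arch = "wasm32", target_os = "wasi")) {
        WasmFlavour::Wasi
    } else if cfg!(all(target_arch = "wasm32", target_os = "unknown")) {
        WasmFlavour::BrowserLike
    } else {
        WasmFlavour::NotWasm
    }
}
```

Calling `wasm_flavour()` from a binary built with `cargo build --target wasm32-wasip1` should yield `WasmFlavour::Wasi`, while a native build yields `WasmFlavour::NotWasm`.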
We're unlikely to create another backend using an external execution engine, as most of our efforts are focused on developing our own optimizations, kernels, and compiler tools. If there is an API to support GPU execution (SPIRV, CUDA, Metal, etc.) then we could target those in CubeCL.
It would be worth coming back to this when
I think you might be under-estimating the potential of One of the most important parts of the deploy-to-wasi-platform story is that Even now, platforms like Fermyon's Spin (cloud platform & the SpinKube K8s plugin), WasmCloud, WasmEdge, etc. offer much more attractive deployment targets for latency-sensitive applications like recommendations & personalization systems for eCommerce, which obviously has huge market potential. LLM inference could also benefit from the latency & cost reductions that So, please keep a close eye on the development of |
Feature description
There are several requests for WASM support (web workers, ...). Please add WASM32-WASI support with the NN WIT interface.
Feature motivation
Solutions like WasmEdge allow for LLM inference (https://wasmedge.org/docs/develop/rust/wasinn/llm_inference/), so it would be excellent if any Burn LLM model could be compiled to WASM32-WASI with NN WIT support and run in serverless, edge, and other constrained environments.
(Optional) Suggest a Solution
Add a wasi-nn Cargo feature that lets a cargo build target wasm32-wasip2 and use the NN WIT interface.
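As a rough sketch of what such a feature gate might look like, the snippet below picks an inference path at compile time. Everything here is an assumption for illustration: the `wasi-nn` feature would have to be declared in the crate's Cargo.toml, and `InferencePath` and `inference_path` are hypothetical names, not existing Burn APIs.

```rust
/// Hypothetical inference paths a crate could choose between.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum InferencePath {
    /// Delegate to the host through the wasi-nn interface.
    WasiNn,
    /// Fall back to a portable in-module backend (for example, a CPU backend).
    Portable,
}

/// Chooses the wasi-nn path only when the (assumed) `wasi-nn` Cargo feature
/// is enabled AND the build targets a WASI platform.
pub fn inference_path() -> InferencePath {
    if cfg!(all(feature = "wasi-nn", target_os = "wasi")) {
        InferencePath::WasiNn
    } else {
        InferencePath::Portable
    }
}
```

Under these assumptions, a build such as `cargo build --target wasm32-wasip2 --features wasi-nn` would take the `WasiNn` path, and every other build would fall back to `Portable`.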