Issue Description

Bark inference appears to be broken with CUDA on both Windows and Linux. However, it works fine with CPU (manual override) and with Vulkan.

Steps to Reproduce

pip install nexaai --extra-index-url https://github.nexa.ai/whl/cu124
nexa run bark-small:fp16

Observe the errors (initialization failure on Windows, inference failure on Linux).

On Windows (reformatted for clarity):

Error running ggml inference: Failed to load shared library 'H:\Applications\Nexa\venv\lib\site-packages\nexa\gguf\lib\llama\llama.dll':
Could not find module 'H:\Applications\Nexa\venv\lib\site-packages\nexa\gguf\lib\llama\llama.dll' (or one of its dependencies).
Try using the full path with constructor syntax.

On Linux:

Enter text to generate audio: Hello there!
⠋ bark_tokenize_input: prompt: 'Hello there!'
bark_tokenize_input: number of tokens in prompt = 513, first 8 tokens: 41226 21404 10154 129595 129595 129595 129595 129595
bark_forward_text_encoder: compute buffer size: 7.54 MB
ggml_backend_cuda_graph_compute: error: op not supported node_5 (SET)
GGML_ASSERT: /home/runner/work/nexa-sdk/nexa-sdk/dependency/bark.cpp/encodec.cpp/ggml/src/ggml-cuda.cu:7724: ok
Aborted (core dumped)
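The two failures differ in kind: on Linux, the ggml CUDA backend vendored through bark.cpp/encodec.cpp aborts because the compute graph contains an op (SET) it cannot execute on CUDA, while on Windows the process fails earlier, at DLL load time. The Windows loader error can be reproduced outside the SDK with a minimal check. This is a diagnostic sketch, not SDK code; the DLL path is copied from the error above and should be adjusted to the local venv:

import ctypes
import os

# Path copied from the error message above; Windows-only (ctypes.WinDLL).
DLL = r"H:\Applications\Nexa\venv\lib\site-packages\nexa\gguf\lib\llama\llama.dll"

try:
    ctypes.WinDLL(DLL)
    print("llama.dll loaded fine")
except OSError as exc:
    # "Could not find module ... (or one of its dependencies)" usually means a
    # dependent DLL (e.g. a CUDA runtime library) is missing from the DLL
    # search path, not that llama.dll itself is absent.
    # os.add_dll_directory(r"C:\path\to\cuda\bin")  # extend the search path to test candidates
    print(f"load failed: {exc}")

If the bare load fails but succeeds after os.add_dll_directory() is pointed at a directory holding the CUDA runtime DLLs, the problem is the DLL search path rather than a corrupt wheel.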
Workaround for CPU inference on Linux (slow): use the manual CPU override mentioned above.
Workaround for GPU inference on Windows: install the Vulkan build instead:
pip install nexaai --extra-index-url https://github.nexa.ai/whl/vulkan
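One note on switching wheels: if the cu124 build is already installed, pip may consider the requirement satisfied and leave the CUDA binaries in place, so uninstall first (or pass --force-reinstall) to make sure the Vulkan build actually replaces it. These are standard pip commands:

pip uninstall -y nexaai
pip install nexaai --extra-index-url https://github.nexa.ai/whl/vulkan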
With both workarounds, however, there is background noise in the generated file. A sample can be reviewed here.
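To make the noise observation measurable, a rough check like the following compares the overall RMS of the clip against its quietest 100 ms window; a high floor in the quiet window is the audible hiss. The filename and the 16-bit PCM assumption are guesses; point it at whatever file nexa actually wrote:

import wave
import numpy as np

# Filename is an assumption; use the path nexa reports after generation.
with wave.open("bark_output.wav", "rb") as w:
    assert w.getsampwidth() == 2, "this sketch assumes 16-bit PCM"
    rate = w.getframerate()
    samples = np.frombuffer(w.readframes(w.getnframes()), dtype=np.int16)

def rms(x: np.ndarray) -> float:
    return float(np.sqrt(np.mean(x.astype(np.float64) ** 2)))

# The quietest 100 ms window approximates the noise floor.
win = int(rate * 0.1)
floor = min(rms(samples[i:i + win]) for i in range(0, len(samples) - win, win))
print(f"overall RMS: {rms(samples):.1f}  quietest 100 ms RMS: {floor:.1f}")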
OS
Ubuntu 22.04; Windows 11 Pro
Python Version
3.10
Nexa SDK Version
0.0.9.9
GPU (if using one)
RTX 3060