Issue Description

Bark inference appears to be broken with CUDA on both Windows and Linux. However, it works fine with CPU (manual override) and with Vulkan.

Steps to Reproduce

pip install nexaai --extra-index-url https://github.nexa.ai/whl/cu124
nexa run bark-small:fp16

Observe the errors (initialization failure on Windows, inference failure on Linux).

On Windows (reformatted for clarity):

Error running ggml inference: Failed to load shared library 'H:\Applications\Nexa\venv\lib\site-packages\nexa\gguf\lib\llama\llama.dll':
Could not find module 'H:\Applications\Nexa\venv\lib\site-packages\nexa\gguf\lib\llama\llama.dll' (or one of its dependencies).
Try using the full path with constructor syntax.

On Linux:

Enter text to generate audio: Hello there!
⠋ bark_tokenize_input: prompt: 'Hello there!'
bark_tokenize_input: number of tokens in prompt = 513, first 8 tokens: 41226 21404 10154 129595 129595 129595 129595 129595
bark_forward_text_encoder: compute buffer size: 7.54 MB
ggml_backend_cuda_graph_compute: error: op not supported node_5 (SET)
GGML_ASSERT: /home/runner/work/nexa-sdk/nexa-sdk/dependency/bark.cpp/encodec.cpp/ggml/src/ggml-cuda.cu:7724: ok
Aborted (core dumped)
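The two failures differ in kind: on Linux, the ggml CUDA backend vendored through bark.cpp/encodec.cpp aborts because the compute graph contains an op (SET) it cannot execute on CUDA, while on Windows the process fails earlier, at DLL load time. The Windows loader error can be reproduced outside the SDK with a minimal check. This is a diagnostic sketch, not SDK code; the DLL path is copied from the error above and should be adjusted to the local venv:

import ctypes
import os

# Path copied from the error message above; Windows-only (ctypes.WinDLL).
DLL = r"H:\Applications\Nexa\venv\lib\site-packages\nexa\gguf\lib\llama\llama.dll"

try:
    ctypes.WinDLL(DLL)
    print("llama.dll loaded fine")
except OSError as exc:
    # "Could not find module ... (or one of its dependencies)" usually means a
    # dependent DLL (e.g. a CUDA runtime library) is missing from the DLL
    # search path, not that llama.dll itself is absent.
    # os.add_dll_directory(r"C:\path\to\cuda\bin")  # extend the search path to test candidates
    print(f"load failed: {exc}")

If the bare load fails but succeeds after os.add_dll_directory() is pointed at a directory holding the CUDA runtime DLLs, the problem is the DLL search path rather than a corrupt wheel.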
Workaround for CPU inference on Linux (slow): use the manual CPU override mentioned above.
Workaround for GPU inference on Windows: install the Vulkan build instead:
pip install nexaai --extra-index-url https://github.nexa.ai/whl/vulkan
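One note on switching wheels: if the cu124 build is already installed, pip may consider the requirement satisfied and leave the CUDA binaries in place, so uninstall first (or pass --force-reinstall) to make sure the Vulkan build actually replaces it. These are standard pip commands:

pip uninstall -y nexaai
pip install nexaai --extra-index-url https://github.nexa.ai/whl/vulkan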
With both workarounds, however, there is background noise in the generated file. A sample can be reviewed here.
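To make the noise observation measurable, a rough check like the following compares the overall RMS of the clip against its quietest 100 ms window; a high floor in the quiet window is the audible hiss. The filename and the 16-bit PCM assumption are guesses; point it at whatever file nexa actually wrote:

import wave
import numpy as np

# Filename is an assumption; use the path nexa reports after generation.
with wave.open("bark_output.wav", "rb") as w:
    assert w.getsampwidth() == 2, "this sketch assumes 16-bit PCM"
    rate = w.getframerate()
    samples = np.frombuffer(w.readframes(w.getnframes()), dtype=np.int16)

def rms(x: np.ndarray) -> float:
    return float(np.sqrt(np.mean(x.astype(np.float64) ** 2)))

# The quietest 100 ms window approximates the noise floor.
win = int(rate * 0.1)
floor = min(rms(samples[i:i + win]) for i in range(0, len(samples) - win, win))
print(f"overall RMS: {rms(samples):.1f}  quietest 100 ms RMS: {floor:.1f}")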
OS
Ubuntu 22.04; Windows 11 Pro
Python Version
3.10
Nexa SDK Version
0.0.9.9
GPU (if using one)
RTX 3060