
[BUG] Bark.cpp inference broken with CUDA #386

Open
iwr-redmond opened this issue Feb 11, 2025 · 0 comments
Labels
🐞 bug Something isn't working

Issue Description

Bark inference appears to be broken with CUDA on both Windows and Linux. However, it works fine with CPU (manual override) and Vulkan.

Steps to Reproduce

  1. pip install nexaai --extra-index-url https://github.nexa.ai/whl/cu124
  2. nexa run bark-small:fp16
  3. Observe the errors (at initialization on Windows, at inference time on Linux)

On Windows (reformatted for clarity):

Error running ggml inference: Failed to load shared library 'H:\Applications\Nexa\venv\lib\site-packages\nexa\gguf\lib\llama\llama.dll':
Could not find module 'H:\Applications\Nexa\venv\lib\site-packages\nexa\gguf\lib\llama\llama.dll' (or one of its dependencies).
Try using the full path with constructor syntax.
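On Windows, "or one of its dependencies" usually means a transitive DLL (for example a CUDA runtime DLL) is missing from the search path, not the named file itself. As a diagnostic (this snippet is a generic sketch, not part of the Nexa SDK; the path is whatever your error message reports), the library can be loaded directly with ctypes to surface the loader's actual error:

```python
import ctypes
import sys

def try_load(lib_path: str) -> str:
    """Attempt to load a shared library; report success or the loader's error."""
    try:
        ctypes.CDLL(lib_path)
        return f"loaded OK: {lib_path}"
    except OSError as exc:
        # On Windows this surfaces the same "Could not find module" text,
        # and tools like Dependencies/dumpbin can then identify the missing DLL.
        return f"load failed: {exc}"

if __name__ == "__main__":
    # Pass the path from the error message on your system.
    print(try_load(sys.argv[1] if len(sys.argv) > 1 else "llama.dll"))
```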

On Linux:

Enter text to generate audio: Hello there!
⠋ bark_tokenize_input: prompt: 'Hello there!'
bark_tokenize_input: number of tokens in prompt = 513, first 8 tokens: 41226 21404 10154 129595 129595 129595 129595 129595 

bark_forward_text_encoder: compute buffer size: 7.54 MB

ggml_backend_cuda_graph_compute: error: op not supported node_5 (SET)
GGML_ASSERT: /home/runner/work/nexa-sdk/nexa-sdk/dependency/bark.cpp/encodec.cpp/ggml/src/ggml-cuda.cu:7724: ok
Aborted (core dumped)

Workaround for CPU inference on Linux (slow):

  1. Open nexa/gguf/nexa_inference_tts.py#L52
  2. Force CPU mode:
-self.device = device
+self.device = "cpu"
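An alternative that avoids editing the installed package may be to hide the GPU from the CUDA runtime before anything CUDA-related is imported. `CUDA_VISIBLE_DEVICES` is honoured by the CUDA runtime itself; whether the Nexa SDK then falls back cleanly to CPU is an assumption I have not verified:

```python
import os

# Must run before importing nexaai (or any CUDA-using library): with no
# visible devices, the CUDA runtime reports zero GPUs to the backend.
os.environ["CUDA_VISIBLE_DEVICES"] = ""
```

The equivalent from a shell would be `CUDA_VISIBLE_DEVICES="" nexa run bark-small:fp16`.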

Workaround for GPU inference on Windows:

  1. Remove the venv containing nexaai (uninstalling nexaai itself is insufficient)
  2. Install the Vulkan build: pip install nexaai --extra-index-url https://github.nexa.ai/whl/vulkan

In both cases, however, there is background noise in the generated file. A sample can be reviewed here.

OS

Ubuntu 22.04; Windows 11 Pro

Python Version

3.10

Nexa SDK Version

0.0.9.9

GPU (if using one)

RTX 3060
