
CPU usage is very low, some bottleneck? #897

Open
6 tasks done
VanceVagell opened this issue Feb 22, 2025 · 2 comments
Labels
bug Something isn't working

Comments


VanceVagell commented Feb 22, 2025

Self Checks

  • This template is only for bug reports. For questions, please visit Discussions.
  • I have thoroughly reviewed the project documentation (installation, training, inference) but couldn't find information to solve my problem.
  • I have searched for existing issues, including closed ones.
  • I confirm that I am using English to submit this report (I have read and agree to the Language Policy).
  • [FOR CHINESE USERS] Please submit issues in English, or they will be closed. Thank you! :)
  • Please do not modify this template, and fill in all required fields.

Cloud or Self Hosted

Self Hosted (Source)

Environment Details

I'm running a CPU-only inference setup on a 32-core / 64-thread EPYC server running Debian Linux. I know it will be slower than GPU; that's not the issue.

During inference (fish_speech/models/text2semantic/inference.py), it does indeed use the 32 cores, because I can see the child processes popping up in htop. However, these child processes each use only 4-8% of a CPU while running, and this step takes extremely long for anything but very short text (a few words).

There seems to be some kind of bottleneck that's causing the CPUs to run only sporadically. This system has more than enough memory (1 TB) to hold everything in RAM, so I don't think caching is the issue.

Any ideas what could be wrong here? I'd like to see these child processes running at full capacity.

Steps to Reproduce

Here's the command line I'm running:

cd /home/<user>/fish-speech && conda run -n fish-speech \
    python fish_speech/models/text2semantic/inference.py \
    --text "<text I want rendered>" \
    --prompt-text "<text from my short example .wav>" \
    --prompt-tokens "/home/<user>/fish-speech/fake.npy" \
    --checkpoint-path "/home/<user>/fish-speech/checkpoints/fish-speech-1.5" \
    --num-samples 1 \
    --no-compile \
    --device cpu \
    --half \
    --output-dir /home/<user>/out |& tee -a fish-speech.log

Attached is a screenshot of what htop looks like while this is running.

[screenshot: htop during inference]

✔️ Expected Behavior

CPU usage on sub-processes should be more like 80-100% while running.

❌ Actual Behavior

CPU usage on sub-processes is usually single-digits %.

@VanceVagell VanceVagell added the bug Something isn't working label Feb 22, 2025
VanceVagell (Author) commented:

One thing I notice in htop is that all the sub-processes are in the "S" (interruptible sleep) state most of the time, only occasionally changing to the "R" (running) state. So it seems like the threads are waiting around for something.

I notice that they all seem to switch to running at the same time, though that might just be the htop refresh interval. The vast majority of the time they are in "S", occasionally popping into "R", and they almost all show the same state, either "S" or "R", at any given moment.

Stardust-minus (Member) commented:

You should try to let torch use all the CPU cores.
Please Google for the specific methods.
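(The advice above can be sketched as follows. This is an illustrative snippet, not the project's code; 32 is an assumed core count, and the environment variables must be set before torch is first imported:)

```python
import os

# Assumption: 32 physical cores. Setting these before importing torch lets
# OpenMP/MKL size their thread pools accordingly.
os.environ.setdefault("OMP_NUM_THREADS", "32")
os.environ.setdefault("MKL_NUM_THREADS", "32")

import torch

# Explicitly pin PyTorch's intra-op thread pool to the same count.
torch.set_num_threads(int(os.environ["OMP_NUM_THREADS"]))
print("intra-op threads:", torch.get_num_threads())
```

Equivalently, the environment variables can be exported in the shell before launching inference.py.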
