
CPU usage is very low, some bottleneck? #897

Open
6 tasks done
VanceVagell opened this issue Feb 22, 2025 · 2 comments
Labels
bug Something isn't working

Comments


VanceVagell commented Feb 22, 2025

Self Checks

  • This template is only for bug reports. For questions, please visit Discussions.
  • I have thoroughly reviewed the project documentation (installation, training, inference) but couldn't find information to solve my problem.
  • I have searched for existing issues, including closed ones.
  • I confirm that I am using English to submit this report (I have read and agree to the Language Policy).
  • [FOR CHINESE USERS] Please submit issues in English, or they will be closed. Thank you! :)
  • Please do not modify this template, and fill in all required fields.

Cloud or Self Hosted

Self Hosted (Source)

Environment Details

I'm running a CPU-only inference setup on a 32-core / 64-thread EPYC server running Debian Linux. I know it will be slower than GPU; that's not the issue.

During inference (fish_speech/models/text2semantic/inference.py), it does indeed use the 32 cores, because I can see the child processes popping up in htop. However, these child processes each use only 4-8% of a CPU while running, and this step takes extremely long for anything but very short text (a few words).

There seems to be some kind of bottleneck that's causing the CPUs to run only sporadically. This system has more than enough memory (1 TB) to hold everything in RAM, so I don't think caching is the issue.

Any ideas what could be wrong here? I'd like to see these child processes running at full capacity.

Steps to Reproduce

Here's the command line I'm running:

cd /home/<user>/fish-speech && conda run -n fish-speech \
    python fish_speech/models/text2semantic/inference.py \
    --text "<text I want rendered>" \
    --prompt-text "<text from my short example .wav>" \
    --prompt-tokens "/home/<user>/fish-speech/fake.npy" \
    --checkpoint-path "/home/<user>/fish-speech/checkpoints/fish-speech-1.5" \
    --num-samples 1 \
    --no-compile \
    --device cpu \
    --half \
    --output-dir /home/<user>/out |& tee -a fish-speech.log

Attached is a screenshot of what htop looks like while this is running.

[screenshot: htop during inference]

✔️ Expected Behavior

CPU usage on sub-processes should be more like 80-100% while running.

❌ Actual Behavior

CPU usage on sub-processes is usually single-digits %.

@VanceVagell VanceVagell added the bug Something isn't working label Feb 22, 2025
VanceVagell (Author) commented:

One thing I notice in htop is that all the sub-processes are in the "S" (interruptible sleep) state most of the time, only occasionally changing to the "R" (running) state. So it seems like the threads are waiting around for something.

I notice that they all seem to switch to running at the same time, though that might just be the htop refresh interval. The vast majority of the time they are in "S", occasionally popping into "R", and they almost all show the same state, either "S" or "R", at any given moment.

Stardust-minus (Member) commented:

You should try to let torch use all the CPU cores.
Please Google for the specific methods.
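(The advice above can be sketched as follows. This is an illustrative snippet, not the project's code; 32 is an assumed core count, and the environment variables must be set before torch is first imported:)

```python
import os

# Assumption: 32 physical cores. Setting these before importing torch lets
# OpenMP/MKL size their thread pools accordingly.
os.environ.setdefault("OMP_NUM_THREADS", "32")
os.environ.setdefault("MKL_NUM_THREADS", "32")

import torch

# Explicitly pin PyTorch's intra-op thread pool to the same count.
torch.set_num_threads(int(os.environ["OMP_NUM_THREADS"]))
print("intra-op threads:", torch.get_num_threads())
```

Equivalently, the environment variables can be exported in the shell before launching inference.py.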
