LoRA Training Taking Longer Than Expected Compared to VITS or Other Models #855
Comments
It's normal to have a longer training time if you fine-tune a larger model.
How long does it take? @Stardust-minus
It varies depending on which GPU you use.
I am using an A100 and its utilization is at 100%, but training takes much longer than my previous experience with VITS. The step count increases very slowly: after multiple days it is still at epoch 0 and only making minimal progress, which seems very inefficient. Additionally, it consumes a huge amount of RAM (~120 GB). Why is this happening? @Stardust-minus
After 22 hours, it doesn't progress to a better state. Accuracy fluctuates between 46 and 47 percent.
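For anyone hitting the same symptom, one way to check whether a run like this is genuinely GPU-bound or stalled on the input pipeline is to time data loading and compute separately. The sketch below is generic PyTorch, not this project's training code; the `train_step(batch)` callable is a placeholder assumed to run the real forward/backward/optimizer logic.

```python
import time
import torch

def profile_steps(dataloader, train_step, num_steps=20):
    """Split average wall-clock time per step into data wait vs. compute.

    Generic sketch: `train_step(batch)` is assumed to perform one full
    training step (forward, backward, optimizer update) and is not defined here.
    """
    data_time = compute_time = 0.0
    it = iter(dataloader)
    for _ in range(num_steps):
        t0 = time.perf_counter()
        batch = next(it)              # time spent waiting on the input pipeline
        t1 = time.perf_counter()

        train_step(batch)             # the project's actual training step goes here
        torch.cuda.synchronize()      # flush queued GPU work so the timing is honest
        t2 = time.perf_counter()

        data_time += t1 - t0
        compute_time += t2 - t1

    print(f"avg data wait: {data_time / num_steps:.3f}s | "
          f"avg compute: {compute_time / num_steps:.3f}s per step")
```

If the data-wait share dominates, a reported 100% GPU utilization can be misleading (nvidia-smi counts any kernel activity during its sampling window), and very high host-RAM use would be consistent with dataloader workers holding large preprocessed batches in memory; both are hypotheses to verify, not conclusions about this run.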
Self Checks
Cloud or Self Hosted
Self Hosted (Source)
Environment Details
Ubuntu
Steps to Reproduce
I ran training with LoRA.
I don't know why it takes longer compared to VITS or other models.
Does that make sense? How long should it typically take, for example, on LJSpeech with a T4 GPU?
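For context on the comparison (a generic illustration, not this project's training script): LoRA freezes the base model and trains only small low-rank adapter matrices, so it mainly saves optimizer and gradient memory; each step still runs the full forward pass and backpropagates through the frozen layers to reach the adapters, so per-step time is governed by the base model's size and the input pipeline rather than by LoRA itself. A minimal sketch with the Hugging Face `peft` library, where the model name and target modules are placeholders:

```python
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

# Illustration only; the model and target modules are assumptions,
# not taken from this repository's training code.
base = AutoModelForCausalLM.from_pretrained("gpt2")

lora_cfg = LoraConfig(
    r=8,                        # rank of the low-rank adapter matrices
    lora_alpha=16,              # scaling applied to the adapter output
    lora_dropout=0.05,
    target_modules=["c_attn"],  # which linear layers receive adapters (model-specific)
)
model = get_peft_model(base, lora_cfg)

# Only the adapter weights are trainable; the base model stays frozen,
# yet the forward/backward pass still traverses every frozen layer.
model.print_trainable_parameters()
```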
✔️ Expected Behavior
No response
❌ Actual Behavior
No response