Training time cost #12

Open
Liwen-Xiao opened this issue Aug 19, 2024 · 9 comments

Comments

@Liwen-Xiao

Hello! Great job! Could you tell me how long your training time is? On my machine, using a single 3090 GPU, the first epoch shows it will take around 8 hours to train. Is this similar to your experience?

@youngzhou1999
Collaborator

Hi. Thanks for your interest.
The first epoch takes relatively longer because it also preprocesses the data for training. You can reduce the map retrieval radius to speed up training.
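
To illustrate why a smaller retrieval radius helps, here is a minimal sketch, not the repository's actual preprocessing code. It assumes map elements are plain `(x, y)` points on a hypothetical uniform grid and that retrieval simply keeps the points within a given distance of the agent. The helper name `filter_by_radius` and the grid are made up for illustration.

```python
import math

def filter_by_radius(points, center, radius):
    """Keep only the (x, y) points within `radius` metres of `center`."""
    cx, cy = center
    return [(x, y) for x, y in points if math.hypot(x - cx, y - cy) <= radius]

# Hypothetical map: points on a uniform 5 m grid spanning +/-150 m.
grid = [(x, y) for x in range(-150, 151, 5) for y in range(-150, 151, 5)]

# Compare how many points survive for a few candidate radii.
for radius in (150, 100, 65, 50):
    kept = filter_by_radius(grid, (0.0, 0.0), radius)
    print(f"radius={radius:>3} m -> {len(kept)} points")
```

With roughly uniform map density, the number of retained points grows with the square of the radius, so even a moderate reduction can noticeably shrink the per-sample input.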

@Liwen-Xiao
Author

Thank you for your reply! I changed the radius and the training cost decreased a lot. Thanks again!

@lon0862

lon0862 commented Aug 26, 2024

Hi, but I find that training takes around 8 hours for every epoch, not only the first one. Is this similar to your experience?

@Liwen-Xiao
Author

> Hi, but I find that training takes around 8 hours for every epoch, not only the first one. Is this similar to your experience?

Yes, it is the same. I changed the local radius to a smaller one and solved the problem.

@lon0862

lon0862 commented Aug 26, 2024

> Yes, it is the same. I changed the local radius to a smaller one and solved the problem.

Thanks, I did the same thing and it solved the problem for me, too.

@lon0862

lon0862 commented Sep 8, 2024

@Liwen-Xiao Hi, I want to ask about your retrained result. I used 1 GPU with batch_size: 32 and gradient accumulation over 2 batches. In the end I only got val_minADE: 0.6525, val_minFDE: 0.9325, val_MR: 0.0880. For data preprocessing, I used local_radius: 65.
Did you manage to get the same result as the GitHub checkpoints?
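
For comparing runs like these, it can help to write down the effective batch size. The snippet below is only a sketch of the usual arithmetic (per-GPU batch size times number of GPUs times accumulation steps); the function name and its parameters are hypothetical and not taken from the repository's config.

```python
def effective_batch_size(per_gpu_batch, num_gpus, accumulate_grad_batches=1):
    """Number of samples contributing to each optimizer step."""
    return per_gpu_batch * num_gpus * accumulate_grad_batches

# Setup described above: 1 GPU, batch size 32, accumulation over 2 batches.
print(effective_batch_size(32, 1, 2))

# Same hardware without gradient accumulation.
print(effective_batch_size(32, 1))
```

Two runs with the same per-GPU batch size can therefore still differ in how many samples feed each optimizer step, which is one possible source of small metric differences.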

@Liwen-Xiao
Author

Hi, I also retrained the model. I got a val_minFDE of 0.919, which is close to the 0.913 the author reported. I used 1 GPU with a batch size of 32 and trained for 64 epochs. I got the best result at the 33rd epoch.

@lon0862

lon0862 commented Sep 9, 2024

Hi, I used the same parameters as you describe, and I got the best result at the 31st epoch, with val_minADE: 0.649, val_minFDE: 0.924, and val_MR: 0.086. That is a little worse than your result and what the author reported, but I think it is acceptable. Thanks!
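
For reference, here is a minimal sketch of how these three metrics are commonly computed for multi-modal trajectory prediction. It is not the repository's evaluation code. It assumes each prediction is a list of candidate trajectories of `(x, y)` points, that minADE takes the minimum average displacement over all candidates, and that a miss means a minFDE above 2.0 m (a common convention that should be checked against the benchmark in use). All function names are hypothetical.

```python
import math

def displacement(p, q):
    """Euclidean distance between two (x, y) points."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

def min_ade(modes, gt):
    """Smallest average displacement over all candidate trajectories."""
    return min(
        sum(displacement(p, g) for p, g in zip(mode, gt)) / len(gt)
        for mode in modes
    )

def min_fde(modes, gt):
    """Smallest endpoint displacement over all candidate trajectories."""
    return min(displacement(mode[-1], gt[-1]) for mode in modes)

def miss_rate(samples, threshold=2.0):
    """Fraction of samples whose minFDE exceeds `threshold` (assumed 2.0 m)."""
    misses = sum(1 for modes, gt in samples if min_fde(modes, gt) > threshold)
    return misses / len(samples)

# Toy example: one ground-truth trajectory and two candidate predictions.
gt = [(1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
mode_a = [(1.0, 0.0), (2.0, 0.0), (3.0, 1.0)]  # off by 1 m at the endpoint
mode_b = [(1.0, 3.0), (2.0, 3.0), (3.0, 3.0)]  # off by 3 m everywhere

print(min_ade([mode_a, mode_b], gt))
print(min_fde([mode_a, mode_b], gt))
print(miss_rate([([mode_a, mode_b], gt), ([mode_b], gt)]))
```

Some benchmarks instead report the average displacement of the candidate with the best endpoint, so small gaps between reported numbers can also come from the metric definition.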

@lon0862

lon0862 commented Sep 9, 2024

But I am still confused about how the quality score is used. Did you try it as described in the paper?
