
Implementation MM-CoT #55

Open
Billyroot opened this issue Jul 29, 2023 · 1 comment
Comments

@Billyroot

Great work from you and your team. Quick question: we are thinking about using this method with a larger Falcon model. Do you think we could then get an even bigger performance gap over GPT-3.5? The idea being, if a 1B model can do that, what could a 40B model do?

@cooelf
Contributor

cooelf commented Oct 15, 2023

Not sure about that. However, we did see that with a T5-style encoder-decoder model, a larger model achieves better performance. Due to resource limits, we did not scale to models larger than 1B.
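
For anyone who wants to experiment with scaling, the snippet below is a minimal, text-only sketch of swapping in a larger T5-style encoder-decoder checkpoint via Hugging Face transformers. It deliberately ignores the vision-feature fusion used in MM-CoT, and the checkpoint name `google/flan-t5-xl` is only an illustrative assumption, not the repository's default; treat it as a sketch of what "using a larger encoder-decoder model" would look like, not as the repo's actual training or inference code.

```python
# Minimal sketch: generate a chain-of-thought style rationale with a larger
# T5-style encoder-decoder checkpoint. Plain text-only generation; it does NOT
# include MM-CoT's vision-feature fusion. The checkpoint name is an assumption
# for illustration, not the repository's default model.
from transformers import AutoTokenizer, T5ForConditionalGeneration

checkpoint = "google/flan-t5-xl"  # ~3B parameters; swap for a smaller or larger variant
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = T5ForConditionalGeneration.from_pretrained(checkpoint)

prompt = (
    "Question: Which property do these objects have in common?\n"
    "Options: (A) hard (B) soft\n"
    "Let's think step by step."
)
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Note that Falcon is a decoder-only model, so plugging it into an encoder-decoder setup like the one above would require a different integration than simply changing the checkpoint name.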
