Increasing performance of 65B llama #369
SpeedyCraftah started this conversation in General
Replies: 2 comments 2 replies
-
64GB should make a big difference. With 32GB it is likely swapping on the hard disk all the time.
-
This doesn't answer your question, but I think the 65B parameter model isn't worth it; I'm getting much better responses from the 30B model. Could just be that more training/tuning is needed. Their paper says the 65B model is better, I just haven't seen it.
-
I am wondering if there is a way to speed up the 65B LLaMA model. I have 32GB of RAM at 3600MHz and a Ryzen 7 5800X, and it maxes out both (the CPU probably because of RAM swapping); it generates at a rate of 2-3 minutes per token.
I am running Windows 11, but I thought I might put Linux on a flash drive and try running the model from there to see if there is an improvement, since Linux probably handles swap a little better and uses much less RAM than Windows.