
Llama3 8B IQ1_M takes 22 hours for perplexity testing on wikitext-2 #9486

Closed · Answered by joseph777111
Abhranta asked this question in Q&A

https://github.com/ggerganov/llama.cpp/blob/master/docs/build.md#cuda

GGML_USE_CUDA is not a build flag; the correct option is GGML_CUDA.

Use Make

make GGML_CUDA=1

or

Use CMake

cmake -B build -DGGML_CUDA=ON
cmake --build build --config Release
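
Once the CUDA build is in place, a minimal sketch of running the perplexity test with GPU offload (the model path and dataset filename below are placeholders, not taken from this thread; in older builds the binary is named perplexity rather than llama-perplexity):

# offload all layers to the GPU with -ngl
./build/bin/llama-perplexity -m models/llama3-8b-iq1_m.gguf -f wiki.test.raw -ngl 99

If a run still takes many hours, the model is likely executing on the CPU, so it is worth watching nvidia-smi for GPU utilization while the test runs.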

Answer selected by Abhranta