-
Hello, I wanted to know how the UNK token is used in generation, or whether it is used at all. I assume the UNK token is emitted when no token has a high probability during decoding. Does anyone have an idea?
-
Did you ever see it used? Mistral, llama2, Falcon all use BPE tokenization, so they are not really short of expression. I don't know for certain, but my guess is that UNK is mostly a relic of older, smaller language models.
-
The only time I've seen UNK is when I messed up my code calling into llama.cpp and got the order of things wrong, or didn't fill out the batch struct properly. UNK has token ID 0.
UNK is supposed to be used for unknown words that cannot be tokenized. With BPE you can tokenize everything, and if something somehow cannot be tokenized, llama.cpp currently crashes :) So no UNK there.
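To illustrate why BPE-style tokenizers never need to emit UNK: when a piece of text is not in the vocabulary, the tokenizer can fall back to raw byte tokens, so every input is representable. This is a minimal toy sketch of that idea, not llama.cpp's actual code; the vocabulary, token IDs, and byte offset are made up for the example.

```python
# Toy sketch of byte-fallback tokenization (NOT llama.cpp's real implementation).
# Assumption: a tiny hypothetical vocabulary; IDs BYTE_OFFSET..BYTE_OFFSET+255
# are reserved for raw bytes, so any UTF-8 input can always be encoded.

VOCAB = {"hello": 1000, "world": 1001}  # hypothetical multi-character tokens
BYTE_OFFSET = 3                         # hypothetical start of the byte-token range

def tokenize(text: str) -> list[int]:
    tokens = []
    for word in text.split():
        if word in VOCAB:
            tokens.append(VOCAB[word])
        else:
            # Byte fallback: emit one token per UTF-8 byte.
            # Nothing is ever "unknown", so UNK is never produced.
            tokens.extend(BYTE_OFFSET + b for b in word.encode("utf-8"))
    return tokens

print(tokenize("hello world"))  # [1000, 1001]
print(tokenize("héllo"))        # six byte-fallback tokens, no UNK
```

A real byte-level or byte-fallback BPE vocabulary works on subword merges rather than whole words, but the consequence is the same: full byte coverage means the UNK slot never has to be used during encoding.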