(Stupid?) Q6/5 quantization idea #9539

marcingomulkiewicz · 2024-09-18T17:32:19Z


(Apologies if this has been discussed before, but I couldn't find it.)

Q5 is nice because it's relatively good perplexity-wise, but 5-bit integers fit poorly into bytes, which means a lot of extra bit-shuffling when unpacking. Q6 is even better, but it still doesn't line up with bytes too well. OTOH 3x5 = 15 is almost 16 (two bytes), so maybe we can cheat a bit (pun intended)...? My thinking is:

Say we have three 6-bit weights:

001100
011001
111001

We compute the majority of their LSBs (here: 0, 1, 1 -> 1), then store the upper 5 bits of each weight plus that single majority bit in 2 bytes:

00110 01100 11100 1

and then restore the weights by appending that majority bit to each 5-bit value:

001101
011001
111001
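The packing and unpacking steps above can be sketched in a few lines of Python. This is only an illustration of the idea, not anything that exists in the codebase: the names `pack3` and `unpack3` are made up, and the bit layout (first weight in the top bits, majority bit in bit 0) is an assumption chosen to match the example.

```python
def pack3(w0: int, w1: int, w2: int) -> int:
    """Pack three 6-bit weights (0..63) into one 16-bit word.

    Assumed layout, high to low: 5 bits of w0, 5 bits of w1,
    5 bits of w2, then 1 shared majority-of-LSBs bit.
    """
    for w in (w0, w1, w2):
        if not 0 <= w <= 63:
            raise ValueError("weights must fit in 6 bits")
    # Majority of the three LSBs: 1 if at least two of them are set.
    majority = 1 if (w0 & 1) + (w1 & 1) + (w2 & 1) >= 2 else 0
    return ((w0 >> 1) << 11) | ((w1 >> 1) << 6) | ((w2 >> 1) << 1) | majority


def unpack3(word: int) -> tuple[int, int, int]:
    """Restore three 6-bit weights; each one gets the shared bit as its LSB."""
    majority = word & 1
    return (
        (((word >> 11) & 0x1F) << 1) | majority,
        (((word >> 6) & 0x1F) << 1) | majority,
        (((word >> 1) & 0x1F) << 1) | majority,
    )


packed = pack3(0b001100, 0b011001, 0b111001)
print(f"{packed:016b}")                        # 0011001100111001
print([f"{w:06b}" for w in unpack3(packed)])   # ['001101', '011001', '111001']
```

A real kernel would of course do this on whole blocks with SIMD; the point here is only that unpacking needs three shift-and-mask steps plus one OR per weight.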

It's easy to see that at least 2 of those 3 weights will always be restored to their full 6-bit glory, and the third will either be correct (if we were lucky) or at most off by one step (1/64th of the 6-bit range). So we have stored 3 weights with almost 6 bits of precision, yet used only 1/15th more space than 'pure' 5-bit quantization (16 bits instead of 15 per triple, i.e. about 5.33 bits per weight). On top of that, unpacking should hopefully be simpler than for the 5-bit quant, so there's a chance of faster inference.
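The "at most one weight off by at most one" claim can be checked by brute force over all 64^3 triples. The sketch below is only a sanity check under the assumption that every 6-bit value is equally likely, which real weight distributions need not satisfy; `roundtrip` and `error_stats` are hypothetical helper names.

```python
from itertools import product


def roundtrip(weights):
    """Replace each weight's LSB with the majority LSB of the triple."""
    majority = 1 if sum(w & 1 for w in weights) >= 2 else 0
    return tuple((w & ~1) | majority for w in weights)


def error_stats():
    """Exhaustively measure the reconstruction error over all triples."""
    worst_error = 0      # largest absolute error seen on any single weight
    worst_changed = 0    # largest number of altered weights in one triple
    total_changed = 0    # number of altered weights over all triples
    triples = 0
    for weights in product(range(64), repeat=3):
        restored = roundtrip(weights)
        errors = [abs(a - b) for a, b in zip(weights, restored)]
        changed = sum(1 for e in errors if e != 0)
        worst_error = max(worst_error, max(errors))
        worst_changed = max(worst_changed, changed)
        total_changed += changed
        triples += 1
    return worst_error, worst_changed, total_changed / (3 * triples)


print(error_stats())  # (1, 1, 0.25)
```

So with uniformly distributed LSBs, one weight in four ends up off by a single step and the rest are exact. Whether that beats plain Q5 perplexity-wise at the same size would still have to be measured.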

Does this make any sense?
