(Stupid?) Q6/5 quantization idea #9539
marcingomulkiewicz
started this conversation in
Ideas
(Apologies if this has been discussed earlier, but I couldn't find it)
Q5 is nice because it gives relatively good perplexity, but 5-bit integers pack poorly into bytes, which costs extra bit-shuffling operations when unpacking. Q6 is even better, but it does not align with byte boundaries well either. OTOH 3x5 = 15 is almost 16 (two bytes), so maybe we can cheat a bit (pun intended)...? My thinking is:
Say we have three 6-bit weights:
001100
011001
111001
We calculate the majority function of the LSBs (here: 0, 1, 1 -> 1), drop the LSB from each weight, and store the three remaining 5-bit values plus the single majority bit in 2 bytes:
00110 01100 11100 1
so then we restore it as
001101
011001
111001
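To make the scheme concrete, here is a minimal Python sketch of the pack/restore step. The function names `pack3` and `unpack3` and the exact bit layout (three 5-bit fields followed by the majority bit, most significant first) are my assumptions for illustration, mirroring the example above; this is not an existing quantization format.

```python
def pack3(w0, w1, w2):
    """Pack three 6-bit weights (0..63) into one 16-bit word (hypothetical layout)."""
    for w in (w0, w1, w2):
        if not 0 <= w <= 63:
            raise ValueError("weights must be 6-bit values (0..63)")
    # Majority of the three least significant bits: 1 if at least two are set.
    majority = 1 if (w0 & 1) + (w1 & 1) + (w2 & 1) >= 2 else 0
    # Assumed layout: [w0 top 5 bits][w1 top 5 bits][w2 top 5 bits][majority bit]
    return ((w0 >> 1) << 11) | ((w1 >> 1) << 6) | ((w2 >> 1) << 1) | majority


def unpack3(word):
    """Restore three 6-bit weights, using the shared majority bit as every LSB."""
    majority = word & 1
    return tuple((((word >> shift) & 0x1F) << 1) | majority for shift in (11, 6, 1))


word = pack3(0b001100, 0b011001, 0b111001)
print(f"{word:016b}")                       # 0011001100111001
print([f"{w:06b}" for w in unpack3(word)])  # ['001101', '011001', '111001']
```

The restore step is three shift-and-mask operations plus an OR with one shared bit, with no field straddling the 16-bit word boundary.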
It's easy to see that out of those 3 weights at least 2 will always be restored to their full 6-bit glory, and the third will either be correct (if we were lucky and all three LSBs agreed) or off by exactly one quantization step (1/64th of the 6-bit range). So we have stored 3 weights with a precision of almost 6 bits at 16/3 ≈ 5.33 bits per weight, using only 1/15th more space than 'pure' 5-bit quantization (16 bits instead of 15 per group of three). On top of that, restoring those should hopefully be simpler than for the 5-bit quant, so it has a chance of giving faster inference.
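The error claim can be checked by brute force over the eight possible LSB patterns, since the top 5 bits are always stored exactly. This is just a sanity-check sketch; the variable names are mine, and the 75% figure only holds under the assumption that the LSBs are uniformly distributed and independent, which real weights may not satisfy.

```python
from itertools import product

exact = 0   # weights restored exactly, summed over all LSB patterns
total = 0   # weights considered overall
worst = 0   # largest number of wrong weights within a single group of three

for lsbs in product((0, 1), repeat=3):
    majority = 1 if sum(lsbs) >= 2 else 0
    # Each weight's error is 0 or 1 step: its true LSB vs. the shared majority bit.
    errors = [abs(b - majority) for b in lsbs]
    exact += errors.count(0)
    total += 3
    worst = max(worst, sum(errors))

print(exact, total, worst)  # 18 24 1
```

So no group ever has more than one weight off, and if the LSB patterns were equally likely, 18 of 24 weights (75%) would be restored exactly while the remaining 25% would be off by one step.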
Does this make any sense?