I read about your awesome work reducing the original DeepSeek-R1 weights and making it possible to run the model on fewer high-end GPUs at decent speed in this blog post: https://unsloth.ai/blog/deepseekr1-dynamic
I have read some good reviews of the 32B distill model, so I wanted to test it out personally :-) But like many others I only have 8 GB of VRAM (laptop user here :)), so it won't be possible for me to run that distill model even with a Q4_K_M quant. So I request, if it is possible, that you please create a 1.58-bit version of this variant (and maybe later of the 14B one too), so that many more people can test and see how much performance we can get out of these new thinking-based models.
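For context, here is a rough back-of-envelope estimate (my own arithmetic, not from the Unsloth blog; the bits-per-weight figures are approximations) of why Q4_K_M can't fit in 8 GB but a 1.58-bit quant might come close:

```python
# Rough GGUF size estimate: parameters * effective bits per weight / 8.
# The bpw values are approximate and this ignores the KV cache and
# runtime overhead, so real memory usage is noticeably higher.
PARAMS_32B = 32e9

quants = {
    "Q4_K_M": 4.85,            # typical effective bpw for Q4_K_M
    "IQ2_XXS": 2.06,           # typical effective bpw for IQ2_XXS
    "1.58-bit dynamic": 1.58,  # assumes nearly all layers at ~1.58 bpw
}

for name, bpw in quants.items():
    size_gb = PARAMS_32B * bpw / 8 / 1e9
    print(f"{name:>18}: ~{size_gb:.1f} GB")
```

Note that the Unsloth dynamic quants keep some critical layers at higher precision, so a real file would come out somewhat larger than the naive 1.58 bpw figure.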
Thank you very much for your work on quantization, Unsloth devs :-)
Greatz08 changed the title from "DeepSeek-R1-Distill-Qwen-32B-GGUF 1.58-bit version possible?" to "[REQUEST] DeepSeek-R1-Distill-Qwen-32B-GGUF 1.58-bit version possible?" on Jan 30, 2025
@danielhanchen I understand your concern, and I agree that extreme quantization "CAN" break the model. But I'd still ask you to at least give it a shot for us (poor laptop guys) who can't run that big model any time soon. The model performs very well on the benchmarks, which is why we're so desperate to test it, and as you know, without extreme quantization that isn't possible. I personally tested a 22B IQ2 quant (the lowest available quantization of Codestral, Mistral's coding-focused model, "at that time"), and it performed great for me even at that extreme level, so I believe that with thinking abilities we can still get decent performance at much heavier quantization. So again, I can only request on behalf of others that you give it a shot, or perhaps find a better way to quantize it so that we still get decent performance. :-)

I'll leave the rest to you, as we aren't capable of finding the solution or converting it to 1.58-bit quants ourselves. But we can certainly test, or provide any information you think we're capable of giving back :-)
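In case it helps others reproduce this kind of test, here is a minimal sketch of how I run low-bit GGUF quants with partial GPU offload via llama-cpp-python (the model path and layer count below are placeholders; tune n_gpu_layers to whatever fits your VRAM):

```python
from llama_cpp import Llama

# Load a low-bit GGUF quant, offloading only as many layers as fit
# in 8 GB of VRAM; the remaining layers stay in system RAM.
llm = Llama(
    model_path="./codestral-22b-iq2_xxs.gguf",  # placeholder path
    n_gpu_layers=20,  # lower this if you hit out-of-memory errors
    n_ctx=4096,       # context window; larger values cost more memory
)

out = llm("Write a Python function that reverses a string.", max_tokens=256)
print(out["choices"][0]["text"])
```

Lowering n_gpu_layers trades speed for VRAM, so the model can still run even when it doesn't fit entirely on the GPU.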