
DeepSeek-R1 support #174

Open
loretoparisi opened this issue Feb 7, 2025 · 7 comments
Assignees: YangWang92
Labels: new models

@loretoparisi

Add support or a recipe to quantize DeepSeek-R1 and related distilled versions DeepSeek-R1-Distill-Llama-70B, DeepSeek-R1-Distill-Qwen-32B, eventually including 7B, 14B Qwen, and Llama 8B

YangWang92 self-assigned this on Feb 8, 2025
YangWang92 added the new models label on Feb 8, 2025
@YangWang92
Contributor

We are almost there; I have prepared the Hessian collection code for DeepSeek R1/V3. Please give us about one week.
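
For readers wondering what "Hessian collection" means here, a minimal sketch of the kind of per-layer proxy Hessian (H ≈ 2·XᵀX accumulated over calibration activations) that second-order quantizers in the GPTQ/VPTQ family build; this is an illustration under those assumptions, not the repo's actual code, and all names are made up:

```python
# Illustrative sketch only (assumption, not the repo's actual code):
# second-order quantizers build, for each linear layer, a proxy Hessian
# H = 2 * sum_i x_i x_i^T from the layer's calibration-time inputs.
import torch

hessians = {}  # layer name -> running Hessian accumulator

def make_hook(name):
    def hook(module, inputs, output):
        # inputs[0] has shape (batch, seq, in_features); flatten to tokens.
        x = inputs[0].detach().reshape(-1, inputs[0].shape[-1]).float()
        h = hessians.setdefault(
            name, torch.zeros(x.shape[-1], x.shape[-1], device=x.device)
        )
        h += 2.0 * (x.T @ x)  # accumulate 2 * X^T X over calibration batches
    return hook

def collect_hessians(model, calibration_batches):
    # Hook every linear layer, run the calibration data through, detach hooks.
    handles = [
        m.register_forward_hook(make_hook(n))
        for n, m in model.named_modules()
        if isinstance(m, torch.nn.Linear)
    ]
    with torch.no_grad():
        for batch in calibration_batches:
            model(batch)
    for h in handles:
        h.remove()
    return hessians
```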

@wejoncy
Contributor

wejoncy commented Feb 8, 2025

Here is a way to quantize the related distilled versions:

1. Install `vptq-algorithm` by following https://github.com/microsoft/VPTQ/blob/algorithm/algorithm.md#environment-setting
2. Install `qllm`: `pip install qllm`
3. Generate an example quant config: `python -m qllm --model deepseek-ai/DeepSeek-R1-Distill-Qwen-32B --quant_method=vptq --quant_config=help`
4. Copy and save the printed config, then edit the values as you see fit.
5. Run the quantization (a loading sketch follows the list): `python -m qllm --model deepseek-ai/DeepSeek-R1-Distill-Qwen-32B --quant_method=vptq --quant_config=xx.json --save DeepSeek-R1-Distill-Qwen-32B-vptq`
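
Once step 5 finishes, a quick way to smoke-test the result, assuming the saved directory is a standard transformers checkpoint with custom (remote) code, which is how VPTQ-quantized checkpoints are typically shipped; the path is just the `--save` directory from above:

```python
# Sketch for loading the checkpoint produced in step 5 (assumptions noted above).
from transformers import AutoModelForCausalLM, AutoTokenizer

path = "DeepSeek-R1-Distill-Qwen-32B-vptq"  # --save output dir from step 5
tokenizer = AutoTokenizer.from_pretrained(path)
model = AutoModelForCausalLM.from_pretrained(
    path, device_map="auto", trust_remote_code=True
)

inputs = tokenizer("Explain VPTQ in one sentence.", return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```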

@YangWang92
Contributor

Some updates: we have collected all the Hessian matrices and are now adapting the quantization algorithm.

@YangWang92
Contributor

Some updates: I have adapted the algorithm to the DeepSeek models; stay tuned.

@YangWang92
Contributor

(screenshot) Now we have an early ~2-bit version of DeepSeek R1, and it works well on 4x A100 80G.
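
(Rough sanity check, not from the thread: R1 has about 671B parameters, so at ~2 bits per weight the weights alone are roughly 671e9 × 2 / 8 ≈ 168 GB, which fits in 4 × 80 GB = 320 GB with headroom for activations and KV cache.)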

@loretoparisi
Author

> Now we have an early ~2-bit version of DeepSeek R1, and it works well on 4x A100 80G.

💯 This is truly amazing.
Running a test on our 4 x H100 today!

@YangWang92
Contributor

> > Now we have an early ~2-bit version of DeepSeek R1, and it works well on 4x A100 80G.
>
> 💯 This is truly amazing. Running a test on our 4 x H100 today!

The current inference speed is still a bit slow (it is based on the DeepSeek repo's bare-torch design). Please wait a moment while I prepare it; you can give it a try soon.
