[Model Request] Molmo-72B-0924 #408
Comments
We will take a look and keep you updated. 7B or 72B?
72B is more interesting.
The 7B has been uploaded and is available on Hugging Face. Please allow additional time for the quantization of the 72B model.
Hello @wenhuach21, I have tested your quantized model against the default 7B model. The performance drop is massive: it's 2x slower than the original... PS: I quickly hacked the openedai-vision repo with the implementation you provided (https://huggingface.co/OPEA/Molmo-7B-D-0924-int4-sym-inc).
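For reference, the kind of end-to-end A/B timing behind a report like this can be done with the standard Molmo remote-code path from the allenai model card. Below is a minimal sketch, assuming the OPEA INT4 repo loads the same way once its quantization backend (e.g. auto-round) is installed; the image URL and prompt are just placeholders:

```python
# Hypothetical A/B latency check: original Molmo-7B-D vs. the OPEA INT4 repo.
# Assumes both checkpoints load through Molmo's trust_remote_code path and that
# the backend needed for the INT4 repo (e.g. auto-round) is installed.
import time
import requests
from PIL import Image
from transformers import AutoModelForCausalLM, AutoProcessor, GenerationConfig

IMAGE_URL = "https://picsum.photos/id/237/536/354"  # placeholder test image
PROMPT = "Describe this image."

def run_once(repo_id: str) -> float:
    processor = AutoProcessor.from_pretrained(
        repo_id, trust_remote_code=True, torch_dtype="auto", device_map="auto")
    model = AutoModelForCausalLM.from_pretrained(
        repo_id, trust_remote_code=True, torch_dtype="auto", device_map="auto")

    image = Image.open(requests.get(IMAGE_URL, stream=True).raw)
    inputs = processor.process(images=[image], text=PROMPT)
    inputs = {k: v.to(model.device).unsqueeze(0) for k, v in inputs.items()}

    start = time.perf_counter()
    output = model.generate_from_batch(
        inputs,
        GenerationConfig(max_new_tokens=200, stop_strings="<|endoftext|>"),
        tokenizer=processor.tokenizer,
    )
    elapsed = time.perf_counter() - start

    new_tokens = output[0, inputs["input_ids"].size(1):]
    text = processor.tokenizer.decode(new_tokens, skip_special_tokens=True)
    print(f"{repo_id}: {elapsed:.2f}s  {text[:80]!r}")
    return elapsed

# Loading both models back to back needs enough memory; run one at a time if not.
for repo in ("allenai/Molmo-7B-D-0924", "OPEA/Molmo-7B-D-0924-int4-sym-inc"):
    run_once(repo)
```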
While INT4 models typically generate faster thanks to reduced memory usage, the prefill stage (prompt processing) may be slower than with 16-bit models, as it is more compute-bound. Consequently, the performance difference between INT4 and 16-bit models largely depends on the prompt length and the number of generated tokens. For VLMs, images/videos introduce extra prefill tokens. Another option is to run the computation in the INT8 data type, which I believe is supported by Intel Extension for PyTorch on CPUs. It might be worth trying this approach.
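To see which phase is responsible, prefill and decode can be timed separately. One rough way, sketched below, is to treat a 1-token generation as "prefill only" and derive the decode rate from the extra time a longer run takes (model, processor and inputs prepared as in the sketch above; this splitting trick is just an approximation, not an API the library provides):

```python
# Rough prefill-vs-decode split for an already loaded Molmo-style (model,
# processor) pair and preprocessed `inputs`. A 1-token generation approximates
# the prefill cost; decode throughput is estimated from the extra time a longer
# generation takes.
import time
from transformers import GenerationConfig

def profile_phases(model, processor, inputs, long_tokens: int = 200):
    def timed_generate(max_new_tokens: int):
        start = time.perf_counter()
        out = model.generate_from_batch(
            inputs,
            GenerationConfig(max_new_tokens=max_new_tokens,
                             stop_strings="<|endoftext|>"),
            tokenizer=processor.tokenizer,
        )
        elapsed = time.perf_counter() - start
        n_new = out.shape[1] - inputs["input_ids"].shape[1]  # may stop early
        return elapsed, n_new

    prefill_s, _ = timed_generate(1)               # ~prompt processing only
    total_s, n_new = timed_generate(long_tokens)   # prefill + decode
    decode_tps = max(n_new - 1, 1) / max(total_s - prefill_s, 1e-9)
    print(f"prefill ~{prefill_s:.2f}s, decode ~{decode_tps:.1f} tok/s")
```

On long prompts, and especially on image-heavy VLM inputs, the prefill term dominates, which is where an INT4 model can end up slower than the 16-bit one.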
Hello,
Would you guys please take a look at this great model https://huggingface.co/allenai/Molmo-7B-D-0924 and quantize it?
Thanks in advance.
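For context, a request like this would presumably be handled with AutoRound, the tool behind the OPEA INT4 repos. A minimal sketch of that flow is below; the text-only calibration and the exact settings are assumptions (a VLM like Molmo may need extra handling for its vision tower), so treat it as an illustration rather than the recipe actually used:

```python
# Hypothetical AutoRound recipe for an INT4 symmetric quantization of Molmo-7B-D.
# Settings (bits=4, group_size=128, sym=True) are guessed from the
# "-int4-sym-" naming; text-only calibration of a VLM is an assumption.
from transformers import AutoModelForCausalLM, AutoTokenizer
from auto_round import AutoRound

model_name = "allenai/Molmo-7B-D-0924"
model = AutoModelForCausalLM.from_pretrained(
    model_name, trust_remote_code=True, torch_dtype="auto")
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)

autoround = AutoRound(model, tokenizer, bits=4, group_size=128, sym=True)
autoround.quantize()
autoround.save_quantized("./Molmo-7B-D-0924-int4-sym", format="auto_round")
```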