Enable quant model support #1074
Conversation
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
Hi @sywangyi, please review this PR, thanks.
Hi @IlyasMoutawwakil @echarlaix, do you have time to review this PR? Thanks!
- convert_functions(model, GPT2Block, "forward", _gpt2_block_forward)
- convert_class(model, GPT2Attention, _IPEXGPT2Attention, model.config)
- convert_class(model, GPT2MLP, _IPEXGPT2MLP, model.config)
+ convert_class(model, GPT2Block, _IPEXGPT2Block, model.device, model.config)
Why are the mlp and attention no longer patched here?
Because they are now handled inside _IPEXGPT2Block.
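For context, here is a minimal sketch of why patching GPT2Block alone suffices: the block wrapper instantiates the patched attention and MLP itself, so they no longer need separate convert_class calls. The classes below are simplified stand-ins, not the PR's actual code; the constructor signatures and forward flow are assumptions.

from torch import nn


class _IPEXGPT2Attention(nn.Module):
    # Stand-in for the PR's patched attention; the real class lives in the PR.
    def __init__(self, attn, config):
        super().__init__()
        self.attn = attn  # reuse the original module's weights

    def forward(self, hidden_states, **kwargs):
        # transformers' GPT2Attention returns a tuple; keep only hidden states.
        return self.attn(hidden_states, **kwargs)[0]


class _IPEXGPT2MLP(nn.Module):
    # Stand-in for the PR's patched MLP.
    def __init__(self, mlp, config):
        super().__init__()
        self.mlp = mlp

    def forward(self, hidden_states):
        return self.mlp(hidden_states)


class _IPEXGPT2Block(nn.Module):
    def __init__(self, block, device, config):
        super().__init__()
        self.device = device
        # The block builds the patched submodules itself, which is why the
        # separate GPT2Attention/GPT2MLP conversions were dropped in the diff.
        self.attn = _IPEXGPT2Attention(block.attn, config)
        self.mlp = _IPEXGPT2MLP(block.mlp, config)
        self.ln_1 = block.ln_1
        self.ln_2 = block.ln_2

    def forward(self, hidden_states, **kwargs):
        # Pre-LN residual flow, matching transformers' GPT2Block.
        hidden_states = hidden_states + self.attn(self.ln_1(hidden_states), **kwargs)
        hidden_states = hidden_states + self.mlp(self.ln_2(hidden_states))
        return hidden_states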
- name: Install bitsandbytes
  run: |
    git clone --branch multi-backend-refactor https://github.com/bitsandbytes-foundation/bitsandbytes.git
    cd bitsandbytes
    pip install .
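After these steps, a quick import check can confirm the branch build installed before the quantization tests run (a minimal sketch; __version__ is a standard bitsandbytes attribute):

import bitsandbytes as bnb

# If the multi-backend-refactor build installed correctly, this imports without
# error and reports the branch's version string.
print(bnb.__version__)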
No luck with autoawq?
Yes, the autoawq installation has some issues; I will figure out why this environment cannot install autoawq. The tests pass in my local environment, which has autoawq installed.
LGTM, let's make sure both tests (awq, bnb) are running in the CI or at least passing locally
That's just GitHub parsing it as a number (3.10 = 3.1); you need to quote it as "3.10".
Hi @IlyasMoutawwakil, I have removed autoawq; this PR only supports bitsandbytes. autoawq will be supported once we fix the installation issue.
This PR enables support for bitsandbytes-quantized models. Even though fused linear layers cannot be used in a quantized model, there is still a 5-10% speed-up on llama2-7b at batch_size=1, and the speed-up ratio grows with batch size.
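For reference, a minimal usage sketch of loading a 4-bit bitsandbytes model through optimum-intel. It assumes IPEXModelForCausalLM.from_pretrained forwards quantization_config the same way transformers does; exact argument names may differ from the final PR.

import torch
from transformers import AutoTokenizer, BitsAndBytesConfig
from optimum.intel import IPEXModelForCausalLM

model_id = "meta-llama/Llama-2-7b-hf"
# 4-bit bitsandbytes quantization config, as in the transformers API.
bnb_config = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_compute_dtype=torch.bfloat16)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = IPEXModelForCausalLM.from_pretrained(model_id, quantization_config=bnb_config)

inputs = tokenizer("The quick brown fox", return_tensors="pt")
# batch_size=1 generation, the setting where the PR reports a 5-10% speed-up.
output_ids = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))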