Enable quant model support #1074

Merged
merged 65 commits into from
Feb 19, 2025

jiqing-feng
Collaborator

@jiqing-feng jiqing-feng commented Dec 16, 2024

This PR enables support for bitsandbytes-quantized models. Even though we cannot use fused linear layers in a quantized model, there is still a 5-10% speed-up on llama2-7b at batch_size=1, and the speed-up ratio increases with batch size.

@jiqing-feng jiqing-feng marked this pull request as draft December 16, 2024 05:35
@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

Signed-off-by: jiqing-feng <[email protected]>
@jiqing-feng jiqing-feng marked this pull request as ready for review December 18, 2024 05:28
@jiqing-feng
Collaborator Author

Hi @sywangyi. Please review this PR, thanks.

@jiqing-feng jiqing-feng requested a review from sywangyi December 18, 2024 09:13
Signed-off-by: jiqing-feng <[email protected]>
@jiqing-feng
Collaborator Author

Hi @IlyasMoutawwakil @echarlaix. Do you have time to review this PR? Thanks!

Comment on lines -113 to +112
convert_functions(model, GPT2Block, "forward", _gpt2_block_forward)
convert_class(model, GPT2Attention, _IPEXGPT2Attention, model.config)
convert_class(model, GPT2MLP, _IPEXGPT2MLP, model.config)
convert_class(model, GPT2Block, _IPEXGPT2Block, model.device, model.config)
Member

Why are the MLP and attention no longer patched here?

Collaborator Author

Because they are now handled inside _IPEXGPT2Block.
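The idea in this reply can be illustrated with a minimal, dependency-free sketch. Everything below is hypothetical: `Attention`, `MLP`, `Block`, `PatchedBlock` and `convert_class_sketch` are stand-ins invented for illustration and are not the optimum-intel or transformers implementations. The sketch only shows why a single block-level conversion can be enough when the block wrapper patches its own children.

```python
# Hypothetical stand-ins, not the real GPT2 / IPEX classes.
class Attention:
    pass


class MLP:
    pass


class Block:
    def __init__(self):
        self.attn = Attention()
        self.mlp = MLP()


class Model:
    def __init__(self, n_layers=2):
        self.layers = [Block() for _ in range(n_layers)]


class PatchedAttention:
    def __init__(self, module, config):
        self.original = module
        self.config = config


class PatchedMLP:
    def __init__(self, module, config):
        self.original = module
        self.config = config


class PatchedBlock:
    """Wraps a Block and patches its attention and MLP children itself."""

    def __init__(self, module, config):
        self.config = config
        self.attn = PatchedAttention(module.attn, config)
        self.mlp = PatchedMLP(module.mlp, config)


def convert_class_sketch(model, orig_cls, new_cls, *args):
    """Replace each orig_cls instance in model.layers with new_cls(instance, *args)."""
    model.layers = [
        new_cls(layer, *args) if isinstance(layer, orig_cls) else layer
        for layer in model.layers
    ]


model = Model()
config = {"hidden_size": 8}  # placeholder config for the sketch

# One call at the block level; no separate calls for attention or MLP.
convert_class_sketch(model, Block, PatchedBlock, config)
```

With this layout, separate conversion calls for the attention and MLP classes would be redundant, because constructing the patched block already replaces both children.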

Comment on lines +41 to +45
- name: Install bitsandbytes
  run: |
    git clone --branch multi-backend-refactor https://github.com/bitsandbytes-foundation/bitsandbytes.git
    cd bitsandbytes
    pip install .
Member

No luck with autoawq?

Collaborator Author

Yes, the autoawq installation has some issues. I will figure out why this env cannot install autoawq. The tests do pass in my local env, which has autoawq installed.

Member

@IlyasMoutawwakil IlyasMoutawwakil left a comment

LGTM, let's make sure both tests (awq, bnb) are running in the CI or at least passing locally

@jiqing-feng
Collaborator Author

> LGTM, let's make sure both tests (awq, bnb) are running in the CI or at least passing locally

I guess it's because of the Python version. I updated Python from 3.9 to 3.10, but got this error in CI:
[image]

Do you know why Python 3.10 was detected as 3.1?

@IlyasMoutawwakil
Member

> Do you know why Python 3.10 was detected as 3.1?

That's just GitHub parsing it as a number (3.10 = 3.1); you need to quote it as "3.10".
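A small sketch of why this happens, assuming the unquoted version in the workflow file is read as a floating-point number. Python's built-in `float` is used here as a stand-in for the workflow parser, and `as_parsed_number` is a hypothetical helper name, so this is an illustration of the effect rather than GitHub's actual parsing code.

```python
def as_parsed_number(scalar: str) -> str:
    """Mimic an unquoted numeric scalar: parse it as a float, then print it back."""
    return str(float(scalar))


# Unquoted 3.10 becomes the number 3.1, so the trailing zero is lost.
unquoted = as_parsed_number("3.10")

# A quoted value stays a string and keeps its exact text.
quoted = "3.10"

print(unquoted)  # 3.1
print(quoted)    # 3.10
```

Versions such as 3.9 are unaffected because they have no trailing zero to drop, which is why the problem only showed up after moving to 3.10.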

jiqing-feng and others added 6 commits February 11, 2025 16:04
Signed-off-by: jiqing-feng <[email protected]>
@jiqing-feng
Collaborator Author

Hi @IlyasMoutawwakil. I have removed autoawq, so this PR only supports bitsandbytes. autoawq will be supported once we have fixed the installation issue.

Signed-off-by: jiqing-feng <[email protected]>
@IlyasMoutawwakil IlyasMoutawwakil merged commit bc1d034 into huggingface:main Feb 19, 2025
20 of 22 checks passed
4 participants