Refactor OV weight compression call inside from_pretrained #683
Conversation
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
@AlexKoff88 @echarlaix could you please take a look?
Looks great, thanks @nikita-savelyevv
```python
from optimum.gptq.data import get_dataset, prepare_dataset

tokenizer = AutoTokenizer.from_pretrained(quantization_config.tokenizer)
nsamples = quantization_config.num_samples if quantization_config.num_samples else 128
calibration_dataset = get_dataset(
    quantization_config.dataset, tokenizer, seqlen=32, nsamples=nsamples
)
calibration_dataset = prepare_dataset(calibration_dataset)
```
this should be done for every `OVModel`, no?
This particular part is for `OVModelForCausalLM` only. First of all, because GPTQ dataset creation logic is employed, which is applicable to LLMs only. Secondly, `self.model` is required to have a `prepare_inputs` method, which is specific to `OVModelForCausalLM`.

In theory we could extend this part to other model classes. There is some logic for the SD model class and I plan to migrate it to `OVQuantizer` in a future PR. There's also the `get_calibration_dataset` method; maybe it should actually go there / be extended to multiple model types. Will need to think about it.

For other model types there is no such logic in the codebase at the moment, if I'm not mistaken, so I'm not yet sure about those. Maybe we could add it in the future.
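The `prepare_inputs` dependency can be illustrated with a minimal sketch (all names below are stand-ins for illustration, not the actual optimum-intel implementation): each calibration sample is routed through the model's own input-preparation step before being recorded, which is why a model class without such a method cannot reuse this code path.

```python
class DummyCausalLM:
    """Stand-in for OVModelForCausalLM: exposes a prepare_inputs method."""

    def prepare_inputs(self, input_ids):
        # The real method would also build position ids, KV-cache inputs, etc.
        return {"input_ids": input_ids, "attention_mask": [1] * len(input_ids)}


def collect_calibration_data(model, dataset):
    """Hypothetical helper: turn raw samples into model-ready calibration inputs."""
    if not hasattr(model, "prepare_inputs"):
        raise TypeError("Calibration collection requires a model with prepare_inputs()")
    return [model.prepare_inputs(sample["input_ids"]) for sample in dataset]


data = collect_calibration_data(DummyCausalLM(), [{"input_ids": [1, 2, 3]}])
```

A model class lacking `prepare_inputs` would fail the `hasattr` check, mirroring why the current logic is restricted to `OVModelForCausalLM`.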
yes, I think it makes sense to make it available for other `OVModel` classes and to also extend `get_calibration_dataset`, but this can be done in a following PR!
also, we could add a warning that the `dataset` config argument will be ignored for models that are not instances of `OVModelForCausalLM`
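Such a warning could look roughly like the sketch below (the helper name and the stand-in classes are hypothetical, not the actual optimum-intel code):

```python
import logging

logger = logging.getLogger(__name__)


class CausalLMStandIn:
    """Stand-in for OVModelForCausalLM."""


class OtherOVModelStandIn:
    """Stand-in for any other OVModel subclass."""


def maybe_warn_dataset_ignored(model, dataset):
    """Warn (and report True) when the `dataset` config argument will be ignored."""
    if dataset is not None and not isinstance(model, CausalLMStandIn):
        logger.warning(
            "`quantization_config.dataset` is ignored for models that are not "
            "instances of OVModelForCausalLM."
        )
        return True
    return False
```

The check would run once during `from_pretrained`, so users passing a dataset for an unsupported model type get immediate feedback instead of silent behavior.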
```python
elif config.dataset is not None and isinstance(config.dataset, str):
    tokenizer = AutoTokenizer.from_pretrained(config.tokenizer)

    from optimum.gptq.data import get_dataset, prepare_dataset

    nsamples = config.num_samples if config.num_samples else 128
    dataset = get_dataset(config.dataset, tokenizer, seqlen=32, nsamples=nsamples)
    dataset = prepare_dataset(dataset)
```
Yep, I did want to do that, but the difference there is that only a raw `openvino.runtime.Model` is available, not an instance of `transformers.PreTrainedModel`, and the latter is required to initialize `OVQuantizer`. We could extend `OVQuantizer` to accept an instance of `openvino.runtime.Model`, but that's a rather serious API change.
What does this PR do?
Address #618 (comment)
Changes:
- Make the `save_directory` parameter optional for the `OVQuantizer.quantize()` method
- Move weight compression logic from `OVModelForCausalLM` to `OVQuantizer`
In a similar fashion, I plan to move the SD dataset collection logic to `OVQuantizer` in a future PR.
Before submitting