
Releases: huggingface/optimum-intel

v1.3.0: Knowledge distillation and one-shot optimization support

05 Aug 15:26

Knowledge distillation

Knowledge distillation was introduced in #8. To perform distillation, an IncDistiller must be instantiated with the appropriate configuration.
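As a rough sketch of what this might look like in practice; the configuration class name (IncDistillationConfig), the import path and the constructor arguments shown below are assumptions inferred from the other Inc* components, not confirmed API:

```python
# Sketch only: IncDistillationConfig, the import path and the constructor
# arguments are assumptions, not confirmed API.
from transformers import AutoModelForSequenceClassification
from optimum.intel.neural_compressor import IncDistillationConfig, IncDistiller

# The teacher is a regular, already fine-tuned transformers model.
teacher_model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased")

# Hypothetical: load the distillation configuration, e.g. from a local YAML file.
distillation_config = IncDistillationConfig.from_pretrained("distillation.yml")

# Instantiate the distiller with the appropriate configuration.
distiller = IncDistiller(teacher_model=teacher_model, config=distillation_config)
```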

One-shot optimization

The ability to combine compression techniques such as pruning, knowledge distillation and quantization-aware training in a single (one-shot) pass during training was introduced (#7). One-shot optimization is enabled by default, but can be disabled by setting the one_shot_optimization parameter to False when instantiating the IncOptimizer.
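A minimal sketch of opting out of one-shot optimization; only the one_shot_optimization parameter and its default come from the note above, while the import path and the other constructor arguments are assumptions:

```python
# Sketch only: apart from one_shot_optimization, the import path and the
# constructor arguments are assumptions based on the library's other Inc* components.
from transformers import AutoModelForSequenceClassification
from optimum.intel.neural_compressor import IncOptimizer

model = AutoModelForSequenceClassification.from_pretrained("distilbert-base-uncased")

# One-shot optimization is enabled by default; setting the parameter to False
# applies the configured compression techniques sequentially instead.
optimizer = IncOptimizer(
    model,
    quantizer=quantizer,          # assumed: an IncQuantizer instance built beforehand
    pruner=pruner,                # assumed: an IncPruner instance built beforehand
    distiller=distiller,          # assumed: an IncDistiller instance built beforehand
    one_shot_optimization=False,  # documented parameter; defaults to True
)
```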

Seq2Seq models support

Both quantization and pruning can now be applied to Seq2Seq models (#14).

v1.2.3: Patch release

15 Jun 13:13
  • Add the save_pretrained method to the ORTOptimizer to easily save the resulting quantized and/or pruned model, along with the configuration needed to reload it (see the sketch after this list) (#4)
  • Remove the outdated fit method as well as the model attribute of IncQuantizer and IncPruner (#4)
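A minimal usage sketch of the new save_pretrained method, assuming an already-instantiated and applied optimizer; the output directory name is hypothetical:

```python
# Sketch only: assumes `optimizer` was created and used beforehand to quantize
# and/or prune a model; the output directory name is hypothetical.
save_directory = "quantized_model"

# Save the optimized model together with the configuration needed to reload it.
optimizer.save_pretrained(save_directory)
```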

v1.2.2: Initial release of Optimum Intel featuring INC quantization and pruning support

07 Jun 13:22

With this release, we enable Intel Neural Compressor (INC) automatic accuracy-driven tuning strategies for model quantization, allowing users to easily generate quantized models with different quantization approaches (static, dynamic and quantization-aware training). This support covers the full workflow, from applying quantization to loading the resulting quantized model, the latter enabled by the introduction of the IncQuantizedModel class.
Magnitude pruning is also enabled for a variety of tasks with the introduction of the IncTrainer class, which handles the pruning process.
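A minimal loading sketch; the IncQuantizedModel class name comes from the note above, while the from_pretrained-style loader follows the standard Transformers pattern and is an assumption, as is the path:

```python
# Sketch only: the from_pretrained loader follows the usual Transformers
# pattern and is an assumption; the path is hypothetical.
from optimum.intel.neural_compressor import IncQuantizedModel

# Load a previously quantized model together with its quantization configuration.
model = IncQuantizedModel.from_pretrained("path/to/quantized_model")
```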