
Releases: huggingface/optimum-intel

v1.3.0: Knowledge distillation and one-shot optimization support

05 Aug 15:26

Knowledge distillation

Knowledge distillation was introduced in #8. To perform distillation, an IncDistiller must be instantiated with the appropriate configuration.
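As a rough sketch of what this might look like in practice; the configuration class name (IncDistillationConfig), the import path and the constructor arguments shown below are assumptions inferred from the other Inc* components, not confirmed API:

```python
# Sketch only: IncDistillationConfig, the import path and the constructor
# arguments are assumptions, not confirmed API.
from transformers import AutoModelForSequenceClassification
from optimum.intel.neural_compressor import IncDistillationConfig, IncDistiller

# The teacher is a regular, already fine-tuned transformers model.
teacher_model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased")

# Hypothetical: load the distillation configuration, e.g. from a local YAML file.
distillation_config = IncDistillationConfig.from_pretrained("distillation.yml")

# Instantiate the distiller with the appropriate configuration.
distiller = IncDistiller(teacher_model=teacher_model, config=distillation_config)
```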

One-shot optimization

The ability to combine compression techniques such as pruning, knowledge distillation and quantization-aware training in a single (one-shot) pass during training was introduced (#7). One-shot optimization is enabled by default, but can be disabled by setting the one_shot_optimization parameter to False when instantiating the IncOptimizer.
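A minimal sketch of opting out of one-shot optimization; only the one_shot_optimization parameter and its default come from the note above, while the import path and the other constructor arguments are assumptions:

```python
# Sketch only: apart from one_shot_optimization, the import path and the
# constructor arguments are assumptions based on the library's other Inc* components.
from transformers import AutoModelForSequenceClassification
from optimum.intel.neural_compressor import IncOptimizer

model = AutoModelForSequenceClassification.from_pretrained("distilbert-base-uncased")

# One-shot optimization is enabled by default; setting the parameter to False
# applies the configured compression techniques sequentially instead.
optimizer = IncOptimizer(
    model,
    quantizer=quantizer,          # assumed: an IncQuantizer instance built beforehand
    pruner=pruner,                # assumed: an IncPruner instance built beforehand
    distiller=distiller,          # assumed: an IncDistiller instance built beforehand
    one_shot_optimization=False,  # documented parameter; defaults to True
)
```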

Seq2Seq models support

Both quantization and pruning can now be applied to Seq2Seq models (#14).

v1.2.3: Patch release

15 Jun 13:13
  • Add the save_pretrained method to the ORTOptimizer to easily save the resulting quantized and/or pruned model, along with the configuration needed to reload it (see the sketch after this list) (#4)
  • Remove the outdated fit method as well as the model attribute of IncQuantizer and IncPruner (#4)
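A minimal usage sketch of the new save_pretrained method, assuming an already-instantiated and applied optimizer; the output directory name is hypothetical:

```python
# Sketch only: assumes `optimizer` was created and used beforehand to quantize
# and/or prune a model; the output directory name is hypothetical.
save_directory = "quantized_model"

# Save the optimized model together with the configuration needed to reload it.
optimizer.save_pretrained(save_directory)
```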

v1.2.2: Initial release of Optimum Intel featuring INC quantization and pruning support

07 Jun 13:22

With this release, we enable Intel Neural Compressor (INC) automatic accuracy-driven tuning strategies for model quantization, allowing users to easily generate quantized models with different quantization approaches (static, dynamic and quantization-aware training). This support covers the full workflow, from applying quantization to loading the resulting quantized model, the latter enabled by the introduction of the IncQuantizedModel class.
Magnitude pruning is also enabled for a variety of tasks with the introduction of the IncTrainer class, which handles the pruning process.
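A minimal loading sketch; the IncQuantizedModel class name comes from the note above, while the from_pretrained-style loader follows the standard Transformers pattern and is an assumption, as is the path:

```python
# Sketch only: the from_pretrained loader follows the usual Transformers
# pattern and is an assumption; the path is hypothetical.
from optimum.intel.neural_compressor import IncQuantizedModel

# Load a previously quantized model together with its quantization configuration.
model = IncQuantizedModel.from_pretrained("path/to/quantized_model")
```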