Releases: huggingface/optimum-intel
v1.3.0: Knowledge distillation and one-shot optimization support
Knowledge distillation
Knowledge distillation was introduced in #8. To perform distillation, an IncDistiller must be instantiated with the appropriate distillation configuration.
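A minimal sketch of what this setup could look like; the IncDistillationConfig name, its from_pretrained loader and the IncDistiller keyword arguments are assumptions modeled on the other Inc* components, not a confirmed API:

```python
from transformers import AutoModelForSequenceClassification

# Hypothetical import path, modeled on the other Inc* components.
from optimum.intel.neural_compressor import IncDistillationConfig, IncDistiller

teacher_model = AutoModelForSequenceClassification.from_pretrained(
    "textattack/bert-base-uncased-SST-2"
)
student_model = AutoModelForSequenceClassification.from_pretrained(
    "distilbert-base-uncased"
)

# Load the distillation configuration (assumed loader, mirroring the
# from_pretrained pattern of the quantization and pruning configurations).
distillation_config = IncDistillationConfig.from_pretrained("path/to/config_dir")

# Instantiate the distiller with its configuration and the teacher model
# (keyword names are assumptions).
distiller = IncDistiller(config=distillation_config, teacher_model=teacher_model)
```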
One-shot optimization
The possibility to combine compression techniques such as pruning, knowledge distillation and quantization-aware training in one shot during training was introduced (#7). One-shot optimization is enabled by default, but can be disabled by setting the one_shot_optimization parameter to False when instantiating the IncOptimizer.
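A sketch of how this flag might be passed. Apart from one_shot_optimization, which the notes name explicitly, the keyword arguments and the fit() entry point below are assumptions:

```python
from transformers import AutoModelForSequenceClassification
from optimum.intel.neural_compressor import IncOptimizer

model = AutoModelForSequenceClassification.from_pretrained("distilbert-base-uncased")

# quantizer, pruner and distiller are assumed to have been built beforehand,
# e.g. as in the distillation snippet above.
optimizer = IncOptimizer(
    model,
    quantizer=quantizer,
    pruner=pruner,
    distiller=distiller,
    one_shot_optimization=False,  # opt out of the default one-shot behavior
)
optimized_model = optimizer.fit()  # assumed entry point launching the optimization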
Seq2Seq models support
Both quantization and pruning can now be applied to Seq2Seq models (#14).
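As an illustration, a minimal sketch of quantization applied to a Seq2Seq model; the IncQuantizationConfig and IncQuantizer names follow the library's Inc* naming convention, but their exact signatures here are assumptions:

```python
from transformers import AutoModelForSeq2SeqLM
from optimum.intel.neural_compressor import (
    IncOptimizer,
    IncQuantizationConfig,
    IncQuantizer,
)

model = AutoModelForSeq2SeqLM.from_pretrained("t5-small")

def eval_func(model):
    # User-defined evaluation returning the metric INC tunes against
    # (e.g. ROUGE or BLEU for a Seq2Seq task).
    ...

# Assumed loader and constructor, mirroring the other Inc* components.
quantization_config = IncQuantizationConfig.from_pretrained("path/to/config_dir")
quantizer = IncQuantizer(quantization_config, eval_func=eval_func)
optimizer = IncOptimizer(model, quantizer=quantizer)
quantized_model = optimizer.fit()
```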
v1.2.3: Patch release
v1.2.2: Initial release of Optimum Intel featuring INC quantization and pruning support
With this release, we enable the Intel Neural Compressor (INC) automatic accuracy-driven tuning strategies for model quantization, allowing users to easily generate quantized models with different quantization approaches (including static quantization, dynamic quantization and quantization-aware training). This support covers the overall process, from applying quantization to loading the resulting quantized model, the latter enabled by the introduction of the IncQuantizedModel class.
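The notes name the IncQuantizedModel class itself; the import path and the shape of the reload call below are assumptions about how that loading step looks:

```python
from optimum.intel.neural_compressor import IncQuantizedModel

# Reload a model previously quantized and saved through the INC tuning
# process (import path and call shape are assumptions).
model = IncQuantizedModel.from_pretrained("path/to/quantized_model_dir")
```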
Magnitude pruning is also enabled for a variety of tasks with the introduction of an IncTrainer handling the pruning process.
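A sketch of the IncTrainer used as a drop-in replacement for transformers.Trainer; how the pruning schedule is actually wired into the training loop is an assumption here, as the notes only state that IncTrainer handles it:

```python
from transformers import TrainingArguments
from optimum.intel.neural_compressor import IncTrainer

# Used like transformers.Trainer; model, train_dataset and eval_dataset are
# assumed to be defined as in a standard transformers fine-tuning script.
trainer = IncTrainer(
    model=model,
    args=TrainingArguments(output_dir="./outputs", num_train_epochs=3),
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
)

# The magnitude-pruning schedule is assumed to be driven by a pruning
# configuration supplied elsewhere; IncTrainer applies the corresponding
# pruning steps during training.
trainer.train()
```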