v1.5.0: OpenVINO quantization

@echarlaix echarlaix released this 21 Oct 10:07

Quantization

  • Add OVQuantizer enabling OpenVINO NNCF post-training static quantization (#50)
  • Add OVTrainer enabling OpenVINO NNCF quantization aware training (#67)
  • Add OVConfig, the configuration that holds the quantization process information (#65)

The quantized models produced by OVQuantizer and OVTrainer are exported to the OpenVINO IR format and can be loaded with the corresponding OVModelForXxx class to run inference with OpenVINO Runtime.
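A minimal sketch of the post-training static quantization flow with OVQuantizer, followed by reloading the exported IR for inference. The checkpoint, the GLUE SST-2 calibration set, the sample count, the `nncf_results` directory and the helper names `preprocess_fn` and `quantize_and_reload` are illustrative assumptions, not part of this release note, and argument names may differ in other versions of the library.

```python
from functools import partial


def preprocess_fn(examples, tokenizer):
    # Tokenize the "sentence" column of the (assumed) SST-2 calibration data.
    return tokenizer(
        examples["sentence"], padding=True, truncation=True, max_length=128
    )


def quantize_and_reload(
    model_id="distilbert-base-uncased-finetuned-sst-2-english",  # assumed checkpoint
    save_dir="nncf_results",  # assumed output directory
):
    # Heavy imports are kept local so the helper above stays importable
    # without optimum-intel installed.
    from optimum.intel.openvino import (
        OVModelForSequenceClassification,
        OVQuantizer,
    )
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    model = AutoModelForSequenceClassification.from_pretrained(model_id)
    tokenizer = AutoTokenizer.from_pretrained(model_id)

    quantizer = OVQuantizer.from_pretrained(model)
    # Build a small calibration set used to collect activation statistics.
    calibration_dataset = quantizer.get_calibration_dataset(
        "glue",
        dataset_config_name="sst2",
        preprocess_function=partial(preprocess_fn, tokenizer=tokenizer),
        num_samples=100,
        dataset_split="train",
        preprocess_batch=True,
    )
    # Apply static quantization and save the result as OpenVINO IR.
    quantizer.quantize(
        calibration_dataset=calibration_dataset, save_directory=save_dir
    )
    # Reload the quantized IR with the matching OVModelForXxx class.
    return OVModelForSequenceClassification.from_pretrained(save_dir)
```

Calling `quantize_and_reload()` downloads the checkpoint and dataset, so it needs network access and an environment with optimum-intel, NNCF and OpenVINO installed.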

OVModel

  • Add OVModelForCausalLM enabling OpenVINO Runtime for models with a causal language modeling head (#76)
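A hedged sketch of text generation with OVModelForCausalLM through a transformers pipeline. The `gpt2` checkpoint, the prompt handling and the helper name `generate_with_openvino` are assumptions for illustration. The keyword that triggers conversion from a PyTorch checkpoint is assumed to be `from_transformers=True` for this release; other versions of the library may expect `export=True` instead, so it is left configurable.

```python
def generate_with_openvino(
    prompt,
    model_id="gpt2",  # assumed checkpoint, any causal LM should work
    max_new_tokens=20,
    export_flag="from_transformers",  # assumed flag name; may be "export" elsewhere
):
    # Local imports keep this module importable without optimum-intel installed.
    from optimum.intel.openvino import OVModelForCausalLM
    from transformers import AutoTokenizer, pipeline

    # Convert the checkpoint to OpenVINO IR on the fly and load it.
    model = OVModelForCausalLM.from_pretrained(model_id, **{export_flag: True})
    tokenizer = AutoTokenizer.from_pretrained(model_id)

    # The OpenVINO model is passed to the pipeline like a regular transformers model.
    pipe = pipeline("text-generation", model=model, tokenizer=tokenizer)
    outputs = pipe(prompt, max_new_tokens=max_new_tokens)
    return outputs[0]["generated_text"]
```

For example, `generate_with_openvino("OpenVINO Runtime is")` returns the prompt followed by the generated continuation as a single string.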