Releases: huggingface/optimum-intel
v1.17.0: ITREX WOQ, IPEX pipeline, extended OpenVINO export
OpenVINO
- Enable BioGPT, Cohere, Persimmon, XGLM export by @eaidova in #709
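The newly enabled architectures can be exported directly from the CLI; a minimal sketch, assuming the stock microsoft/biogpt checkpoint and an illustrative output directory:
optimum-cli export openvino --model microsoft/biogpt biogpt_ov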
- Add OVModelForVision2Seq class by @eaidova in #634
from optimum.intel import OVModelForVision2Seq
model = OVModelForVision2Seq.from_pretrained("nlpconnect/vit-gpt2-image-captioning", export=True)
# `inputs` holds the preprocessed image features, e.g. pixel values from an image processor
gen_tokens = model.generate(**inputs)
- Introduce OVQuantizationConfig for NNCF quantization by @nikita-savelyevv in #638
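A minimal sketch of how the new configuration plugs into OVQuantizer for static quantization; the model id, dataset, and parameter values below are illustrative assumptions:
from transformers import AutoTokenizer
from optimum.intel import OVConfig, OVModelForSequenceClassification, OVQuantizationConfig, OVQuantizer

model_id = "distilbert-base-uncased-finetuned-sst-2-english"  # illustrative model
model = OVModelForSequenceClassification.from_pretrained(model_id, export=True)
tokenizer = AutoTokenizer.from_pretrained(model_id)
quantizer = OVQuantizer.from_pretrained(model)
# Small calibration set used to compute activation ranges
calibration_dataset = quantizer.get_calibration_dataset(
    "glue",
    dataset_config_name="sst2",
    preprocess_function=lambda ex: tokenizer(ex["sentence"], padding="max_length", truncation=True),
    num_samples=300,
    dataset_split="train",
)
quantizer.quantize(
    ov_config=OVConfig(quantization_config=OVQuantizationConfig()),
    calibration_dataset=calibration_dataset,
    save_directory="ov_quantized_model",
)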
- Enable hybrid StableDiffusion models export via optimum-cli by @l-bat in #618
optimum-cli export openvino --model SimianLuo/LCM_Dreamshaper_v7 --task latent-consistency --dataset conceptual_captions --weight-format int8 <output_dir>
- Convert Tokenizers by default by @apaniukov in #580
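With tokenizer conversion on by default, a plain export now also writes the converted OpenVINO tokenizer next to the model; a sketch, assuming gpt2 and an illustrative output directory:
optimum-cli export openvino --model gpt2 gpt2_ov
# gpt2_ov should now also contain openvino_tokenizer.xml and openvino_detokenizer.xml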
- Custom tasks modeling by @IlyasMoutawwakil in #669
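A minimal sketch of custom-tasks modeling, assuming the class exposed is OVModelForCustomTasks and a checkpoint exported with an extra pooler output (the model id is illustrative):
from transformers import AutoTokenizer
from optimum.intel import OVModelForCustomTasks

model_id = "IlyasMoutawwakil/sbert-all-MiniLM-L6-v2-with-pooler"  # illustrative checkpoint
model = OVModelForCustomTasks.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)
inputs = tokenizer("I love burritos!", return_tensors="pt")
outputs = model(**inputs)  # every model output is exposed, e.g. outputs.pooler_output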
- Add dynamic quantization config by @echarlaix in #661
from optimum.intel import OVModelForCausalLM, OVDynamicQuantizationConfig
model_id = "meta-llama/Meta-Llama-3-8B"
q_config = OVDynamicQuantizationConfig(bits=8, activations_group_size=32)
model = OVModelForCausalLM.from_pretrained(model_id, export=True, quantization_config=q_config)
- Transition to a newer NNCF API for PyTorch model quantization by @nikita-savelyevv in #630
ITREX
- Add ITREX weight-only quantization support by @PenghuiCheng in #455
IPEX
- Add IPEX pipeline by @jiqing-feng in #501
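A minimal sketch of the new pipeline entry point; the model id is illustrative:
from optimum.intel.pipelines import pipeline

# accelerator="ipex" routes loading through the IPEX-optimized model classes
pipe = pipeline("text-generation", "gpt2", accelerator="ipex")
results = pipe("He's a dreadful magician and")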
v1.16.1: Patch release
- Bump transformers version by @echarlaix in #682
v1.16.0: OpenVINO config, SD hybrid quantization
- Add hybrid quantization for Stable Diffusion pipelines by @l-bat in #584
from optimum.intel import OVStableDiffusionPipeline, OVWeightQuantizationConfig
model_id = "echarlaix/stable-diffusion-v1-5-openvino"
quantization_config = OVWeightQuantizationConfig(bits=8, dataset="conceptual_captions")
model = OVStableDiffusionPipeline.from_pretrained(model_id, quantization_config=quantization_config)
- Add openvino export configs by @eaidova in #568
OpenVINO export is now enabled for the following architectures: Mixtral, ChatGLM, Baichuan, MiniCPM, Qwen, Qwen2, StableLM
- Add support for export and inference for StarCoder2 models by @eaidova in #619
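Each of these architectures can also be converted on the fly when loading; a sketch, assuming a Qwen2 checkpoint:
from optimum.intel import OVModelForCausalLM

model = OVModelForCausalLM.from_pretrained("Qwen/Qwen1.5-0.5B", export=True)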
v1.15.2: Patch release
- Fix compatibility with transformers>=4.38.0 by @echarlaix in #570
v1.15.1: Patch release
- Relax dependency on accelerate and datasets in OVQuantizer by @eaidova in #547
- Disable compilation before applying 4-bit weight compression by @AlexKoff88 in #569
- Update Transformers dependency requirements by @echarlaix in #571
v1.15.0: OpenVINO Tokenizers, quantization configuration
- Add OpenVINO Tokenizers by @apaniukov in #513
- Introduce the OpenVINO quantization configuration by @AlexKoff88 in #538
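A minimal sketch of the configuration-driven weight compression, assuming a causal LM; the model id and parameter values are illustrative:
from optimum.intel import OVModelForCausalLM, OVWeightQuantizationConfig

q_config = OVWeightQuantizationConfig(bits=4, sym=True, group_size=128, ratio=0.8)
model = OVModelForCausalLM.from_pretrained("gpt2", export=True, quantization_config=q_config)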
- Enable model OpenVINO export by @echarlaix in #557
from diffusers import StableDiffusionPipeline
from optimum.exporters.openvino import export_from_model
model_id = "runwayml/stable-diffusion-v1-5"
model = StableDiffusionPipeline.from_pretrained(model_id)
export_from_model(model, output="ov_model", task="stable-diffusion")
v1.14.0: IPEX models
IPEX models
from optimum.intel import IPEXModelForCausalLM
from transformers import AutoTokenizer, pipeline
model_id = "Intel/q8_starcoder"
model = IPEXModelForCausalLM.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)
pipe = pipeline("text-generation", model=model, tokenizer=tokenizer)
results = pipe("He's a dreadful magician and")
- Add IPEX models by @echarlaix in #516 / #534 / #536
Fixes
v1.13.0: 4-bit quantization, stateful models, Whisper
OpenVINO
Weight-only 4-bit quantization
- Add weight-only 4-bit quantization support by @AlexKoff88 in #469
optimum-cli export openvino --model gpt2 --weight-format int4_sym_g128 ov_model
Stateful
New architectures
Whisper
v1.12.4: Patch release
- Fix compatibility with transformers v4.37.0 by @echarlaix in #515
- Fix compatibility with transformers v4.37.0 by @echarlaix in #527
v1.12.3: Patch release
- Fix compatibility with diffusers v0.25.0 by @eaidova in #497
- Modify minimum required transformers version by @echarlaix in #498