Releases: huggingface/optimum-intel
v1.12.2: Patch release
- Fix compatibility with timm latest release by @echarlaix in #482
- Fix causallm weights compression via quantizer by @eaidova in #484
- Fix pkv dtype by @jiqing-feng in #481
- Fix compatibility causallm models export with optimum 1.15 by @eaidova in #487
- Fix trainer compatibility with transformers>=4.36.0 by @echarlaix in #490
- Fix openvino export by @eaidova in #470
- Fix INC quantized model loading by @echarlaix in #492
v1.12.1: Patch release
v1.12.0: Weight only quantization, LCM, Pix2Struct, GPTBigCode
OpenVINO
Export CLI
- Add OpenVINO export CLI by @echarlaix in #437
optimum-cli export openvino --model gpt2 ov_model
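The command above can also be driven from Python. The following is a minimal sketch, not part of the release itself: the helper names build_export_command, export_model and load_exported_model are hypothetical, and the loader assumes the exported model is a causal language model (as gpt2 is) and that optimum-intel is installed with its OpenVINO dependencies.

```python
import shutil
import subprocess
from typing import List


def build_export_command(model_id: str, output_dir: str) -> List[str]:
    """Return the argument list for the export command quoted in the release notes."""
    return ["optimum-cli", "export", "openvino", "--model", model_id, output_dir]


def export_model(model_id: str, output_dir: str) -> bool:
    """Run the export if optimum-cli is on PATH; return False when it is missing."""
    if shutil.which("optimum-cli") is None:
        return False
    subprocess.run(build_export_command(model_id, output_dir), check=True)
    return True


def load_exported_model(output_dir: str):
    """Load the exported directory (hypothetical helper; assumes a causal LM)."""
    # Imported lazily so the rest of this sketch runs without optimum-intel installed.
    from optimum.intel import OVModelForCausalLM

    return OVModelForCausalLM.from_pretrained(output_dir)


command = build_export_command("gpt2", "ov_model")
print(" ".join(command))
```

Calling export_model("gpt2", "ov_model") followed by load_exported_model("ov_model") would reproduce the export and then load the result for inference, provided the CLI and library are available.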
New architectures
LCMs
- Enable Latent Consistency models OpenVINO export and inference by @echarlaix in #463
from optimum.intel import OVLatentConsistencyModelPipeline
pipe = OVLatentConsistencyModelPipeline.from_pretrained("SimianLuo/LCM_Dreamshaper_v7", export=True)
prompt = "sailing ship in storm by Leonardo da Vinci"
images = pipe(prompt=prompt, num_inference_steps=4, guidance_scale=8.0).images
Pix2Struct
GPTBigCode
- Add support for export and inference for GPTBigCode models by @echarlaix in #459
Changes and bugfixes
- Move VAE execution to fp32 precision on GPU by @eaidova in #432
- Enable OpenVINO export without ONNX export step by @eaidova in #397
- Enable 8-bit weight compression for OpenVINO model by @l-bat in #415
- Add image reshaping for statically reshaped OpenVINO SD models by @echarlaix in #428
- OpenVINO device updates by @helena-intel in #434
- Fix decoder model without cache by @echarlaix in #438
- Fix export by @echarlaix in #439
- Added 8 bit weights compression by default for decoders larger than 1B by @AlexKoff88 in #444
- Add fp16 and int8 conversion to OVModels and export CLI by @echarlaix in #443
from optimum.intel import OVModelForCausalLM
model = OVModelForCausalLM.from_pretrained(model_id, load_in_8bit=True)
- Create default attention mask when needed but not provided by @eaidova in #457
- Do not automatically cache models when exporting a model in a temporary directory by @helena-intel in #462
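Several entries in the list above concern 8-bit weight compression (the load_in_8bit option and the default for decoders larger than 1B parameters). As a rough illustration of the general idea only, the sketch below quantizes a few weights to 8-bit integers and back. It is not the NNCF or OpenVINO implementation: the symmetric per-tensor scheme, the function names, and the assumption that the uncompressed weights are stored in fp32 are all chosen here for simplicity.

```python
from typing import List, Tuple


def quantize_int8(weights: List[float]) -> Tuple[List[int], float]:
    """Symmetric per-tensor quantization: map floats to integers in [-127, 127]."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized: List[int], scale: float) -> List[float]:
    """Recover approximate float weights from the integers and the shared scale."""
    return [q * scale for q in quantized]


weights = [0.5, -1.27, 0.0, 1.0]
quantized, scale = quantize_int8(weights)
restored = dequantize(quantized, scale)

# Each restored weight differs from the original by at most half a quantization step.
max_error = max(abs(w - r) for w, r in zip(weights, restored))

# Storage estimate for a model at the 1B-parameter threshold, assuming fp32 originals.
params = 1_000_000_000
fp32_gb = params * 4 / 1e9  # 4 bytes per weight
int8_gb = params * 1 / 1e9  # 1 byte per weight

print(quantized, max_error, fp32_gb, int8_gb)
```

Under these assumptions the weights shrink to roughly a quarter of their fp32 size, at the cost of a small rounding error bounded by half the scale.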
Neural Compressor
- Integrate INC weight-only quantization by @mengniwang95 in #417
- Support num_key_value_heads by @jiqing-feng in #447
- Enable ORT model support to INC quantizer by @echarlaix in #436
- Fix INC model loading by @echarlaix in #452
- Fix INC modeling by @echarlaix in #453
- Add starcode past-kv shape for TSModelForCausal class by @changwangss in #371
- Fix transformers v4.35.0 compatibility by @echarlaix in #471
- Fix compatibility for optimum next release by @echarlaix in #460
Full Changelog: https://github.com/huggingface/optimum-intel/commits/v1.12.0
v1.11.1: Patch release
- Fix compatibility with optimum by @echarlaix in b4663b4
Full Changelog: v1.11.0...v1.11.1
v1.11.0: MPT, TIMM models, VAE image processor
OpenVINO
- Fix SDXL model U-NET component static reshaping by @eaidova in #390
- Allow changing pkv precision by @AlexKoff88 in #393
- Removed pkv history from quantization statistics of decoders by @AlexKoff88 in #394
- Add audio tasks for OpenVINO inference by @helena-intel in #396
- Do not download ONNX model in SD pipeline if not needed by @eaidova in #402
- Enable loading of Text Inversion at runtime for OpenVINO SD pipelines by @sammysun0711 in #400
- Enable Timm models OpenVINO export and inference by @sawradip in #404
- Fix OpenVINO Timm models loading by @echarlaix in #413
- Add VAE image processor by @echarlaix in #421
- Enable MPT OpenVINO export and inference by @echarlaix in #425
Neural Compressor
- Fixed ONNX export for neural-compressor>=2.2.2 by @PenghuiCheng in #409
- Enable ONNX export for INC PTQ model by @echarlaix in #373
- Fix INC CLI by @echarlaix in #426
Full Changelog: https://github.com/huggingface/optimum-intel/commits/v1.11.0
v1.10.1: Patch release
- Set minimum optimum version by @echarlaix in #382
- Fix compilation step so that it can be performed before inference by @echarlaix in #384
v1.10.0: Stable Diffusion XL pipelines
Stable Diffusion XL
Enable SD XL OpenVINO export and inference for text-to-image and image-to-image tasks by @echarlaix in #377
from optimum.intel import OVStableDiffusionXLPipeline
model_id = "stabilityai/stable-diffusion-xl-base-0.9"
pipeline = OVStableDiffusionXLPipeline.from_pretrained(model_id, export=True)
prompt = "sailing ship in storm by Leonardo da Vinci"
image = pipeline(prompt).images[0]
pipeline.save_pretrained("openvino-sd-xl-base-0.9")
More examples in documentation
Full Changelog: v1.9.0...v1.10.0
v1.9.4: Patch release
- Fix OVDataLoader for NNCF quantization aware training for transformers > v4.31.0 by @echarlaix in #376
Full Changelog: v1.9.3...v1.9.4
v1.9.3: Patch release
- Improved performance of decoders by @AlexKoff88 in #354
- Fix openvino model integration compatibility for optimum > v1.9.0 by @echarlaix in #365
Full Changelog: v1.9.2...v1.9.3
v1.9.2: Patch release
- Fix INC distillation to be compatible with neural-compressor v2.2.0 breaking changes by @echarlaix in #338