
MiniCPM unable to run inference #1165

Open

plischwe opened this issue Feb 17, 2025 · 1 comment

@plischwe
Hi, I followed the tutorial here to convert and run multimodal inference using phi-3-vision: https://github.com/openvinotoolkit/openvino_notebooks/blob/latest/notebooks/phi-3-vision/phi-3-vision.ipynb

I put the code into a Python script, which looks like this:

from optimum.intel.openvino import OVModelForVisualCausalLM, OVWeightQuantizationConfig
from PIL import Image
from transformers import AutoProcessor, TextStreamer

model_dir = "benchmarking/multimodal/INT4/phi3-128k-vision"
model_dir = "benchmarking/multimodal/INT4/MiniCPM"
image_path = "cat.png"

model = OVModelForVisualCausalLM.from_pretrained(model_dir, device="CPU", trust_remote_code=True)

image = Image.open(image_path)
print("image size: ", image.size)

messages = [
    {"role": "user", "content": "<|image_1|>\nWhat is unusual on this picture?"},
]

processor = AutoProcessor.from_pretrained(model_dir, trust_remote_code=True)

prompt = processor.tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

inputs = processor(prompt, [image], return_tensors="pt")

generation_args = {"max_new_tokens": 100, "do_sample": False, "streamer": TextStreamer(processor.tokenizer, skip_prompt=True, skip_special_tokens=True)}

print("Answer:")
generate_ids = model.generate(**inputs, eos_token_id=processor.tokenizer.eos_token_id, **generation_args)

But it seems that when I replace the local INT4 path for phi-3-vision with the local INT4 path of 'openbmb/MiniCPM-V-2_6' (which I exported to INT4 with the same command used to compress phi-3-vision), inference no longer works. I see that MiniCPM-V-2_6 is listed as a supported model in modeling_visual_language.py, but the error I get looks like this:

Traceback (most recent call last):
  File "/home/plischwe/OVModelForVisualCausalLM_test.py", line 27, in <module>
    inputs = processor(prompt, [image], return_tensors="pt")
  File "/root/.cache/huggingface/modules/transformers_modules/openbmb/MiniCPM-V-2_6/4719557d673e9e2b4b3f083801626098f51441a8/processing_minicpmv.py", line 67, in __call__
    return self._convert_images_texts_to_inputs(image_inputs, text, max_slice_nums=max_slice_nums, use_image_id=use_image_id, max_length=max_length, **kwargs)
  File "/root/.cache/huggingface/modules/transformers_modules/openbmb/MiniCPM-V-2_6/4719557d673e9e2b4b3f083801626098f51441a8/processing_minicpmv.py", line 153, in _convert_images_texts_to_inputs
    assert len(image_tags) == len(image_sizes[index])

Shouldn't models be plug-and-play so that the code is easier to reuse, or is there something I am missing?

@eaidova
Collaborator

eaidova commented Feb 21, 2025

The error does not come from the model class you reference; it happens at an earlier stage, where the inputs are preprocessed.

This is an issue with the preprocessing code, not with the model class. MiniCPM-V-2_6 uses a different image format and chat-template application than phi-3-vision, so the assert fires: the number of image tags found in the tokenized prompt differs from the number of images passed. Unfortunately we can do nothing about that, as it is part of the original model code provided by its authors (optimum-intel just reuses the preprocessing defined for the original model), and it is acceptable that independently developed models differ in their code.
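For illustration only, a minimal sketch of what the manual preprocessing would have to look like for MiniCPM; the (<image>./</image>) placeholder is an assumption about the MiniCPM-V-2_6 chat template taken from its processing_minicpmv.py, so verify it against the model repository:

# Hypothetical sketch, not verified on this setup: MiniCPM-V-2_6's processor looks for its own
# image placeholder rather than phi-3's "<|image_1|>", so with the phi-3 prompt it finds zero
# tags while one image is passed and the assert fires. The placeholder below is assumed.
messages = [
    {"role": "user", "content": "(<image>./</image>)\nWhat is unusual on this picture?"},
]
prompt = processor.tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = processor(prompt, [image], return_tensors="pt")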

I recommend using the preprocess_input helper in OVModelForVisualCausalLM to prepare the model inputs instead of doing the preprocessing manually (it will work for both the phi-3 and MiniCPM models):

model = OVModelForVisualCausalLM.from_pretrained(model_dir, device="CPU", trust_remote_code=True)
inputs = model.preprocess_input(image=image, text="What is unusual on this picture?", processor=processor, tokenizer=processor.tokenizer, config=model.config)

model.generate(**inputs, **generation_args)
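Putting this together with the generation settings from the original script, a hedged end-to-end sketch follows; the helper's exact name and signature may vary between optimum-intel releases, so check the installed version:

from optimum.intel.openvino import OVModelForVisualCausalLM
from transformers import AutoProcessor, TextStreamer
from PIL import Image

model_dir = "benchmarking/multimodal/INT4/MiniCPM"

model = OVModelForVisualCausalLM.from_pretrained(model_dir, device="CPU", trust_remote_code=True)
processor = AutoProcessor.from_pretrained(model_dir, trust_remote_code=True)
image = Image.open("cat.png")

# Model-aware input preparation instead of calling the processor by hand
inputs = model.preprocess_input(
    image=image,
    text="What is unusual on this picture?",
    processor=processor,
    tokenizer=processor.tokenizer,
    config=model.config,
)

streamer = TextStreamer(processor.tokenizer, skip_prompt=True, skip_special_tokens=True)
model.generate(**inputs, max_new_tokens=100, do_sample=False, streamer=streamer)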
