I am trying to quantize a Wav2Lip PyTorch model. When I run the code using the fbgemm backend, I run into the following error:
AssertionError: Per channel weight observer is not supported yet for ConvTranspose{n}d.
The model uses ConvTranspose2d layers; as far as I understand, quantization should work for the other layers. When I change the backend engine to "qnnpack", I ran into the same problem, and according to the qnnpack GitHub repo, ConvTranspose2d is not supported yet. How can I use the fbgemm backend to quantize my target model? Any helpful material would be highly appreciated. I am currently using the following code.
import torch
from models import Wav2Lip
from quantized import load_model

device = "cuda" if torch.cuda.is_available() else "cpu"

# Use 'fbgemm' for server inference and 'qnnpack' for mobile inference.
backend = "fbgemm"
# backend = "qnnpack"  # gave much worse inference speed for the quantized model on this notebook

checkpoint_path = "model_data/wav2lip_gan.pth"
model = load_model(checkpoint_path)
model.eval()

model.qconfig = torch.quantization.get_default_qconfig(backend)
torch.backends.quantized.engine = backend

# Dynamic quantization only replaces the layer types listed in qconfig_spec.
quantized_model = torch.quantization.quantize_dynamic(
    model, qconfig_spec={torch.nn.Linear}, dtype=torch.qint8
)

# scripted_quantized_model = torch.jit.script(quantized_model)

# nn.Module has no .save() method; save the whole module with torch.save instead.
torch.save(quantized_model, "wav2lip_gan_quantized_int8.pth")
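One possible workaround, as a sketch rather than a confirmed fix: the assertion is raised because the default fbgemm qconfig uses a per-channel weight observer, which eager-mode quantization rejects for ConvTranspose{n}d. Overriding the qconfig of just the ConvTranspose modules with a per-tensor weight observer avoids that check. Note that quantize_dynamic with qconfig_spec={torch.nn.Linear} only swaps Linear layers, so the assertion presumably comes from a static prepare/convert flow like the one below; whether a quantized ConvTranspose kernel is actually available at runtime still depends on the PyTorch version and backend. The sketch reuses the model loaded above.

import torch
import torch.nn as nn
from torch.quantization import QConfig, HistogramObserver, default_weight_observer

# The default fbgemm qconfig pairs a HistogramObserver for activations with a
# per-channel weight observer; the per-channel part is what trips the
# assertion for ConvTranspose{n}d. This variant keeps the same activation
# observer but uses the per-tensor default_weight_observer instead.
per_tensor_qconfig = QConfig(
    activation=HistogramObserver.with_args(reduce_range=True),
    weight=default_weight_observer,
)

model.qconfig = torch.quantization.get_default_qconfig("fbgemm")
for m in model.modules():
    # Override only the ConvTranspose modules; everything else keeps the default.
    if isinstance(m, (nn.ConvTranspose1d, nn.ConvTranspose2d, nn.ConvTranspose3d)):
        m.qconfig = per_tensor_qconfig

# Eager-mode static quantization (the model needs QuantStub/DeQuantStub around
# the quantized region for this flow to work end to end).
prepared = torch.quantization.prepare(model)
# ... feed a few batches of calibration data through `prepared` here ...
quantized = torch.quantization.convert(prepared)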