-
Notifications
You must be signed in to change notification settings - Fork 197
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ViTPose is now available in Hugging Face Transformers #157
Comments
Thank you, @NielsRogge! |
Hello, I would like to ask regarding a problem I faced with using VITPose, from hugging face example. I tried to run the example code: import torch
import requests
import numpy as np
from PIL import Image
from transformers import (
AutoProcessor,
RTDetrForObjectDetection,
VitPoseForPoseEstimation,
)
device = "cuda" if torch.cuda.is_available() else "cpu"
url = "http://images.cocodataset.org/val2017/000000000139.jpg"
image = Image.open(requests.get(url, stream=True).raw)
person_image_processor = AutoProcessor.from_pretrained("PekingU/rtdetr_r50vd_coco_o365")
person_model = RTDetrForObjectDetection.from_pretrained("PekingU/rtdetr_r50vd_coco_o365", device_map=device)
inputs = person_image_processor(images=image, return_tensors="pt").to(device)
with torch.no_grad():
outputs = person_model(**inputs)
results = person_image_processor.post_process_object_detection(
outputs, target_sizes=torch.tensor([(image.height, image.width)]), threshold=0.3
)
result = results[0] # take first image results
person_boxes = result["boxes"][result["labels"] == 0]
person_boxes = person_boxes.cpu().numpy()
person_boxes[:, 2] = person_boxes[:, 2] - person_boxes[:, 0]
person_boxes[:, 3] = person_boxes[:, 3] - person_boxes[:, 1]
image_processor = AutoProcessor.from_pretrained("usyd-community/vitpose-base-simple")
model = VitPoseForPoseEstimation.from_pretrained("usyd-community/vitpose-base-simple", device_map=device)
inputs = image_processor(image, boxes=[person_boxes], return_tensors="pt").to(device)
with torch.no_grad():
outputs = model(**inputs)
pose_results = image_processor.post_process_pose_estimation(outputs, boxes=[person_boxes])
image_pose_result = pose_results[0] # results for first image And I face the following problem:
|
Pinging @qubvel here |
Hey @Ashayan97, that's most probably scipy version issue, please try updating it pip install -U scipy |
Dear @qubvel and @NielsRogge, |
Hi folks!
ViTPose (and ViTPose++) are now available in the Transformers library, enabling easy inference in a few lines of code.
Docs: https://huggingface.co/docs/transformers/v4.48.0/en/model_doc/vitpose
Checkpoints can be found here.
Demo (on both images and video): https://huggingface.co/spaces/hysts/ViTPose-transformers.
Can be relevant for #133 #26 #139 #135 #111
The text was updated successfully, but these errors were encountered: