
Update chat_with_video with new models #3

Closed
movchan74 opened this issue Oct 18, 2024 · 3 comments · Fixed by #4
movchan74 commented Oct 18, 2024

- New visual captioning model
- Whisper Turbo model (a loading sketch follows this list)
- Other updates, if any
- Speed test
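
For the Whisper Turbo item, a minimal loading sketch, assuming the Hugging Face transformers ASR pipeline and the public `openai/whisper-large-v3-turbo` checkpoint; the actual integration in `chat_with_video` may differ:

```python
# Minimal sketch: loading the Whisper Turbo model via the transformers
# ASR pipeline. Assumes transformers >= 4.45 and torch are installed.
import torch
from transformers import pipeline

asr = pipeline(
    "automatic-speech-recognition",
    model="openai/whisper-large-v3-turbo",
    torch_dtype=torch.float16,
    device_map="auto",
)

# Transcribe a local audio file (the path is a placeholder).
result = asr("sample_video_audio.wav", return_timestamps=True)
print(result["text"])
```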

HRashidi linked a pull request Oct 31, 2024 that will close this issue
HRashidi commented Nov 1, 2024

Speed for processing 1,000 images:
- BLIP: 68 s total (0.068 s per image)
- qwen2-2b: 501 s total (0.501 s per image)
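
For reference, a minimal sketch of how such per-image numbers can be measured; `caption` is a hypothetical stand-in for whichever captioning endpoint is being benchmarked:

```python
# Minimal throughput harness: times N captioning calls and reports total
# seconds and seconds per image. `caption` is a placeholder callable.
import time

def benchmark(caption, images):
    start = time.perf_counter()
    for image in images:
        caption(image)
    total = time.perf_counter() - start
    print(f"{total:.1f} s total ({total / len(images):.3f} s per image)")
```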

appoose commented Nov 1, 2024

We need to investigate the speed: the current Qwen throughput seems too low.

HRashidi commented

Benchmark data for BLIP:

| Configuration | Tokens/sec | Sec/image |
| --- | --- | --- |
| max_ongoing_requests=500, batch_size=500 | 1076.33 | 0.0334 |
| max_ongoing_requests=500, batch_size=1 | 139.18 | 0.2587 |
| max_ongoing_requests=1000, batch_size=2 | 286.46 | 0.1257 |
| Built-in vLLM request batching | 514.78 | 0.2198 |
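
For context, a sketch of where these two knobs live, assuming the captioner is served with Ray Serve (`max_ongoing_requests` is a Ray Serve deployment option, and server-side batching is configured with `@serve.batch`); the `BlipCaptioner` class and `load_blip_model` are illustrative placeholders, not the project's actual code:

```python
# Sketch of the two benchmark knobs in Ray Serve terms: the deployment caps
# concurrent requests with max_ongoing_requests, while @serve.batch groups
# queued requests into batches of up to max_batch_size before the forward
# pass. BlipCaptoner's model loader and generate call are hypothetical.
from ray import serve

@serve.deployment(max_ongoing_requests=500)
class BlipCaptioner:
    def __init__(self):
        self.model = load_blip_model()  # hypothetical model loader

    @serve.batch(max_batch_size=500, batch_wait_timeout_s=0.1)
    async def caption(self, images: list) -> list:
        # One batched forward pass over all queued images; Ray Serve
        # maps each element of the returned list back to its caller.
        return self.model.generate_captions(images)

    async def __call__(self, image):
        return await self.caption(image)
```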
