
Update chat_with_video with new models #3

Closed
movchan74 opened this issue Oct 18, 2024 · 3 comments · Fixed by #4
movchan74 commented Oct 18, 2024

- New visual captioning model
- Whisper Turbo model (a loading sketch follows this list)
- Other updates, if any
- Speed test
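
For the Whisper Turbo item, a minimal loading sketch, assuming the Hugging Face transformers ASR pipeline and the public `openai/whisper-large-v3-turbo` checkpoint; the actual integration in `chat_with_video` may differ:

```python
# Minimal sketch: loading the Whisper Turbo model via the transformers
# ASR pipeline. Assumes transformers >= 4.45 and torch are installed.
import torch
from transformers import pipeline

asr = pipeline(
    "automatic-speech-recognition",
    model="openai/whisper-large-v3-turbo",
    torch_dtype=torch.float16,
    device_map="auto",
)

# Transcribe a local audio file (the path is a placeholder).
result = asr("sample_video_audio.wav", return_timestamps=True)
print(result["text"])
```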

HRashidi linked a pull request Oct 31, 2024 that will close this issue
HRashidi commented Nov 1, 2024

Speed for processing 1,000 images:
- BLIP: 68 s total (0.068 s per image)
- qwen2-2b: 501 s total (0.501 s per image)
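
For reference, a minimal sketch of how such per-image numbers can be measured; `caption` is a hypothetical stand-in for whichever captioning endpoint is being benchmarked:

```python
# Minimal throughput harness: times N captioning calls and reports total
# seconds and seconds per image. `caption` is a placeholder callable.
import time

def benchmark(caption, images):
    start = time.perf_counter()
    for image in images:
        caption(image)
    total = time.perf_counter() - start
    print(f"{total:.1f} s total ({total / len(images):.3f} s per image)")
```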

appoose commented Nov 1, 2024

We need to investigate the speed: the current Qwen throughput seems too low.

HRashidi commented

Benchmark data for BLIP:

| Configuration | Tokens/sec | Sec/image |
| --- | --- | --- |
| max_ongoing_requests=500, batch_size=500 | 1076.33 | 0.0334 |
| max_ongoing_requests=500, batch_size=1 | 139.18 | 0.2587 |
| max_ongoing_requests=1000, batch_size=2 | 286.46 | 0.1257 |
| Built-in vLLM request batching | 514.78 | 0.2198 |
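
For context, a sketch of where these two knobs live, assuming the captioner is served with Ray Serve (`max_ongoing_requests` is a Ray Serve deployment option, and server-side batching is configured with `@serve.batch`); the `BlipCaptioner` class and `load_blip_model` are illustrative placeholders, not the project's actual code:

```python
# Sketch of the two benchmark knobs in Ray Serve terms: the deployment caps
# concurrent requests with max_ongoing_requests, while @serve.batch groups
# queued requests into batches of up to max_batch_size before the forward
# pass. BlipCaptoner's model loader and generate call are hypothetical.
from ray import serve

@serve.deployment(max_ongoing_requests=500)
class BlipCaptioner:
    def __init__(self):
        self.model = load_blip_model()  # hypothetical model loader

    @serve.batch(max_batch_size=500, batch_wait_timeout_s=0.1)
    async def caption(self, images: list) -> list:
        # One batched forward pass over all queued images; Ray Serve
        # maps each element of the returned list back to its caller.
        return self.model.generate_captions(images)

    async def __call__(self, image):
        return await self.caption(image)
```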
