
The speed of inference #18

Open · Baiyajing opened this issue Jan 18, 2025 · 2 comments

@Baiyajing
I wonder what the approximate inference speed should be. I'm using an H100 GPU, and running inference on a single video takes about half an hour. Is there any way to speed it up?

@kirikorneev

Hello, we tried to address this issue.

This is what we did:

To improve inference speed on an H100 GPU, we'll make the following changes (illustrative sketches follow the list):

  1. Use torch.compile() to optimize the model
  2. Use mixed precision (autocast) for the inference forward pass
  3. Increase batch size for parallel processing
  4. Optimize data loading and preprocessing
  5. Add an option for lower precision inference
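
For points 1, 2, and 5, here is a minimal sketch of the idea in PyTorch. The model class, checkpoint path, and frame loader below are placeholders, not this repository's actual API:

```python
import torch

device = torch.device("cuda")

# Placeholder model and checkpoint; substitute the project's own.
model = VideoModel()
model.load_state_dict(torch.load("checkpoint.pt", map_location=device))
model.to(device).eval()

# 1. Compile once; the first call pays a warm-up cost, and later
#    calls run the optimized graph.
model = torch.compile(model)

frames = load_frames("input.mp4").to(device)  # placeholder loader

with torch.inference_mode():
    # 2./5. Run the forward pass in fp16 via autocast; on an H100,
    #       bfloat16 (dtype=torch.bfloat16) is also worth trying.
    with torch.autocast(device_type="cuda", dtype=torch.float16):
        output = model(frames)
```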

You can review changes in this commit: clarin-ebtio800090@5cc759c.
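
For points 3 and 4, batching frames and overlapping CPU preprocessing with GPU compute might look like the sketch below. `FrameDataset` is a hypothetical Dataset that yields preprocessed frame tensors, and `model` and `device` are reused from the sketch above:

```python
from torch.utils.data import DataLoader

loader = DataLoader(
    FrameDataset("input.mp4"),  # hypothetical dataset
    batch_size=32,              # 3. process many frames per forward pass
    num_workers=4,              # 4. decode/preprocess on CPU worker processes
    pin_memory=True,            # enables faster, async host-to-device copies
)

with torch.inference_mode():
    for batch in loader:
        batch = batch.to(device, non_blocking=True)
        with torch.autocast(device_type="cuda", dtype=torch.float16):
            output = model(batch)
```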

Caution

Disclaimer: This proposed solution was generated by AI. Never copy-paste this code without first checking its correctness; the solution may be incomplete, and you should treat it as inspiration only.

Latta AI seeks to solve problems in open source projects as part of its mission to support developers around the world. Learn more about our mission at https://latta.ai/ourmission. If you no longer want Latta AI to attempt to solve issues on your repository, you can block this account.

Is this code going to be merged into the main repository?
