
The speed of inference #18

Open · Baiyajing opened this issue Jan 18, 2025 · 2 comments

@Baiyajing
I wonder what the approximate inference speed should be. I'm using an H100 GPU, and running inference on a single video takes about half an hour. Is there any way to speed it up?

@kirikorneev

Hello, we tried to address this issue.

This is what we did:

To improve inference speed on an H100 GPU, we'll make the following changes (illustrative sketches follow the list):

  1. Use torch.compile() to optimize the model
  2. Use mixed precision (autocast) for the inference forward pass
  3. Increase batch size for parallel processing
  4. Optimize data loading and preprocessing
  5. Add an option for lower precision inference
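
For points 1, 2, and 5, here is a minimal sketch of the idea in PyTorch. The model class, checkpoint path, and frame loader below are placeholders, not this repository's actual API:

```python
import torch

device = torch.device("cuda")

# Placeholder model and checkpoint; substitute the project's own.
model = VideoModel()
model.load_state_dict(torch.load("checkpoint.pt", map_location=device))
model.to(device).eval()

# 1. Compile once; the first call pays a warm-up cost, and later
#    calls run the optimized graph.
model = torch.compile(model)

frames = load_frames("input.mp4").to(device)  # placeholder loader

with torch.inference_mode():
    # 2./5. Run the forward pass in fp16 via autocast; on an H100,
    #       bfloat16 (dtype=torch.bfloat16) is also worth trying.
    with torch.autocast(device_type="cuda", dtype=torch.float16):
        output = model(frames)
```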

You can review changes in this commit: clarin-ebtio800090@5cc759c.
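
For points 3 and 4, batching frames and overlapping CPU preprocessing with GPU compute might look like the sketch below. `FrameDataset` is a hypothetical Dataset that yields preprocessed frame tensors, and `model` and `device` are reused from the sketch above:

```python
from torch.utils.data import DataLoader

loader = DataLoader(
    FrameDataset("input.mp4"),  # hypothetical dataset
    batch_size=32,              # 3. process many frames per forward pass
    num_workers=4,              # 4. decode/preprocess on CPU worker processes
    pin_memory=True,            # enables faster, async host-to-device copies
)

with torch.inference_mode():
    for batch in loader:
        batch = batch.to(device, non_blocking=True)
        with torch.autocast(device_type="cuda", dtype=torch.float16):
            output = model(batch)
```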

Caution

Disclaimer: This proposed solution was generated by AI. Never copy-paste this code without first checking its correctness; the solution may be incomplete, and you should treat it as inspiration only.

Latta AI seeks to solve problems in open source projects as part of its mission to support developers around the world. Learn more about our mission at https://latta.ai/ourmission. If you no longer want Latta AI to attempt to solve issues on your repository, you can block this account.

Is this code going to be merged into the main repository?
