Deployment options #215

Open
cristobal-larach opened this issue Sep 3, 2024 · 5 comments

Comments

@cristobal-larach

Hi @MahmoudAshraf97 ! Awesome job!! I am really impressed by the accuracy of the Spanish transcriptions, so I am pretty excited that you shared this with the community.

I have been reading, and I haven't really found what the most cost-efficient way would be to deploy this kind of service at high volume. Any ideas on this matter? Thank you VERY much!

@MahmoudAshraf97
Owner

Hi, the cheapest way to deploy a product like this is a serverless deployment backed by spot instances if you can handle the latency, or by on-demand instances otherwise. You create a serverless function with an HTTP trigger that spins up an instance to process the file, returns the results, and then spins the instance down, so you pay only for your usage.
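
A minimal sketch of that pattern, assuming the standard faster-whisper API. The handler name and request shape are hypothetical, and the HTTP-trigger / spin-up wiring is provider-specific and not shown:

```python
# Minimal sketch of the serverless pattern above. The handler name, request
# shape, and model choice are illustrative assumptions; the HTTP trigger and
# instance spin-up/spin-down are handled by your provider and not shown here.
import tempfile
import urllib.request

from faster_whisper import WhisperModel

_model = None  # loaded lazily so warm invocations reuse the same model


def get_model():
    global _model
    if _model is None:
        _model = WhisperModel("large-v3", device="cuda", compute_type="float16")
    return _model


def handler(event):
    """Download the audio the request points at, transcribe it, return segments."""
    audio_url = event["audio_url"]  # hypothetical request field
    with tempfile.NamedTemporaryFile(suffix=".wav") as tmp:
        urllib.request.urlretrieve(audio_url, tmp.name)
        segments, info = get_model().transcribe(tmp.name, language="es")
        return {
            "language": info.language,
            "segments": [
                {"start": round(s.start, 2), "end": round(s.end, 2), "text": s.text}
                for s in segments
            ],
        }
```

Whatever the provider, the idea is the same: the model load happens at most once per container, and you are billed only while a request is actually being processed.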

@cristobal-larach
Author

Great! Does it have to be serverless with GPU support? Could I use CPUs and maintain quality to the detriment of time?
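
(For reference, the faster-whisper backend can be pointed at a CPU; a minimal sketch, assuming the standard faster-whisper API, with the model size, quantization, and thread count as illustrative choices rather than recommendations:)

```python
# Minimal sketch: same model, CPU execution with int8 quantization.
# Accuracy comes from the same model; the main cost of CPU + int8 is throughput.
from faster_whisper import WhisperModel

model = WhisperModel("large-v2", device="cpu", compute_type="int8", cpu_threads=8)

segments, info = model.transcribe("call.wav", language="es")
for segment in segments:
    print(f"[{segment.start:.2f} -> {segment.end:.2f}] {segment.text}")
```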

@transcriptionstream
Contributor

As an alternative to a cloud/hosted instance, you can achieve incredible savings by running it locally for high-volume processing. GPUs are a must in my opinion.

@cristobal-larach
Author

@transcriptionstream I do not think I have the option/equipment to run it locally :(. My biggest concern is whether to run an API on a "cheap" GPU like an NVIDIA T4 on AWS or another provider, or to go for a serverless solution (maybe installing libraries and imports takes a little too much time, so the accumulated cost rises a lot). My estimate is that I will be processing around 6000-7000 hours a month. The only affordable API at the moment is Groq Cloud, but it does not support diarization (which is a must for me).
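
To make the comparison concrete, a rough back-of-envelope; every number below is a placeholder assumption, not a quote, so substitute measured throughput and real prices for your setup:

```python
# Back-of-envelope sizing; every figure here is a placeholder assumption.
audio_hours_per_month = 6500   # midpoint of the 6000-7000 h estimate above
realtime_factor = 5.0          # assumed: one T4 transcribes ~5x faster than real time
price_per_gpu_hour = 0.53      # assumed: ballpark T4 on-demand price, USD
hours_in_a_month = 730

gpu_hours = audio_hours_per_month / realtime_factor
concurrent_gpus = gpu_hours / hours_in_a_month

print(f"GPU-hours of compute per month: {gpu_hours:.0f}")
print(f"GPUs that must run around the clock: {concurrent_gpus:.1f}")
print(f"Compute cost if billed only while running: ${gpu_hours * price_per_gpu_hour:,.0f}")
```

Under these assumptions that is roughly two GPUs busy around the clock, which is why pay-per-use billing matters so much at this volume.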

@transcriptionstream
Contributor

Bummer! If you end up self-hosting, based on my experience you're going to want an RTX A6000 to start with for that level of volume, and to add more from there depending on how close to "real time" you want to get. That is not an insignificant amount of audio.
