Deployment options #215
Comments
Hi, the cheapest way to deploy a product like this is a serverless deployment backed by spot instances, if you can tolerate the latency, or by on-demand instances otherwise. You create a serverless function with an HTTP trigger that spins up an instance, processes the file, returns the results, and then spins the instance down. That way you pay only for what you use.
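The spin-up, process, spin-down flow described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not a real cloud integration: `start_instance`, `stop_instance`, and `transcribe` are hypothetical placeholders for whatever provider SDK and transcription call you end up using, and the `log` list exists only to make the order of operations visible.

```python
from contextlib import contextmanager

# Hypothetical stand-ins for a cloud provider SDK; replace with real calls.
def start_instance(log):
    log.append("start")
    return "instance-1"  # hypothetical instance id

def stop_instance(instance_id, log):
    log.append("stop")

def transcribe(instance_id, audio_path):
    # Placeholder for running the transcription/diarization job remotely.
    if not audio_path:
        raise ValueError("no audio file given")
    return {"instance": instance_id, "file": audio_path, "segments": []}

@contextmanager
def ephemeral_instance(log):
    """Start an instance and guarantee it is stopped, even on failure."""
    instance_id = start_instance(log)
    try:
        yield instance_id
    finally:
        stop_instance(instance_id, log)

def handle_request(audio_path, log):
    """Body of the HTTP-triggered function: spin up, process, spin down."""
    with ephemeral_instance(log) as instance_id:
        return transcribe(instance_id, audio_path)
```

The `finally` clause is the important part: if a job crashes, the instance is still shut down, so a failed request does not leave a GPU running and billing.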
Great! Does it have to be a serverless platform that supports GPUs? Could I use CPUs and maintain quality at the expense of processing time?
As an alternative to a cloud/hosted instance, you can achieve substantial savings by running the product locally when you have a high volume of processing. GPUs are a must in my opinion.
@transcriptionstream I do not think I have the option/equipment to run it locally :(. My biggest concern is whether to run an API on a "cheap" GPU like an NVIDIA T4 on AWS or another provider, or to go for a serverless solution (where installing libraries and running imports may take too long on each start, so the accumulated cost rises a lot). My estimate is that I will be processing around 6000-7000 hours of audio a month. The only affordable API at the moment is Groq Cloud, but it does not support diarization (which is a must for me).
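For anyone weighing the same trade-off, a back-of-the-envelope comparison can be sketched like this. Every number in the usage example is a made-up placeholder, not a measurement or a quoted price: the speed-up over real time, the hourly GPU price, the job count, and the per-job startup time all need to be replaced with your own benchmarks and your provider's pricing. The function names are hypothetical as well.

```python
def monthly_gpu_hours(audio_hours, speedup):
    """GPU hours needed for `audio_hours` of audio when the pipeline
    runs `speedup` times faster than real time."""
    return audio_hours / speedup

def always_on_cost(hourly_price, gpus=1, hours_in_month=730):
    """Cost of keeping `gpus` instances running for the whole month."""
    return hourly_price * gpus * hours_in_month

def pay_per_use_cost(audio_hours, speedup, hourly_price,
                     n_jobs=0, startup_seconds=0.0):
    """Cost when billed only while working, including the per-job
    startup overhead (installing libraries, imports, model loading)."""
    compute_hours = monthly_gpu_hours(audio_hours, speedup)
    overhead_hours = n_jobs * startup_seconds / 3600
    return (compute_hours + overhead_hours) * hourly_price

# Placeholder inputs: 6500 audio hours, an assumed 10x real-time speed-up,
# an assumed 0.50 per GPU hour, 13000 jobs with 60 s of startup each.
needed = monthly_gpu_hours(6500, 10)
fixed = always_on_cost(0.50)
variable = pay_per_use_cost(6500, 10, 0.50, n_jobs=13000, startup_seconds=60)
print(f"GPU hours needed: {needed:.0f}")
print(f"always-on: {fixed:.2f}  pay-per-use: {variable:.2f}")
```

With these placeholder inputs the startup overhead alone adds a few hundred billed hours, which is exactly the concern raised above: the more small jobs there are and the longer each cold start takes, the less attractive pay-per-use becomes relative to one instance that stays warm.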
Bummer! If you end up hosting, based on my experience, you're going to want an RTX A6000 to start with for that level of volume, adding more from there depending on how close to "real time" you want to get. That is not an insignificant amount of audio.
Hi @MahmoudAshraf97! Awesome job!! I am really impressed by the accuracy of the Spanish transcriptions, so I am pretty excited that you shared this with the community.
I have been reading, and I haven't really found what the most cost-efficient way to deploy this kind of service at high volume would be. Any ideas on this matter? Thank you VERY much!