Hi!
I have approximately 1.5 hours of voice audio at 44.1 kHz and would like to train a usable model from it. I don't want to retrain from the pre-trained checkpoints, as they are all 22 kHz and sound muddy.
I tried training from scratch with sampling_rate set to 44100. After 2000 epochs, the inferred audio was far too fast and skipped words.
What should I modify or patch in to make this work?
thanks!
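Two things that might be worth checking here are whether the WAV files really are stored at the rate given in the config, and whether the STFT frame sizes were scaled along with sampling_rate. The sketch below is only an illustration under stated assumptions: the parameter names filter_length, hop_length and win_length and the 22050 Hz baseline values (1024 / 256 / 1024) are hypothetical stand-ins, not taken from this repository's config, and the helper functions are made up for this example.

```python
import wave


def scale_stft_params(params, base_sr, target_sr):
    """Scale frame sizes (in samples) so each frame spans the same time at target_sr."""
    ratio = target_sr / base_sr
    return {name: int(round(value * ratio)) for name, value in params.items()}


def frame_shift_ms(hop_length, sampling_rate):
    """Duration of one hop in milliseconds."""
    return 1000.0 * hop_length / sampling_rate


def wav_sampling_rate(path):
    """Read the sampling rate stored in a WAV file header (standard library only)."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getframerate()


# Assumed 22050 Hz baseline values; replace them with the ones from the real config.
base = {"filter_length": 1024, "hop_length": 256, "win_length": 1024}
scaled = scale_stft_params(base, base_sr=22050, target_sr=44100)

print(scaled)
print(frame_shift_ms(base["hop_length"], 22050))
print(frame_shift_ms(scaled["hop_length"], 44100))
```

With the frame sizes doubled, one hop covers the same amount of time at 44100 Hz as it did at 22050 Hz. Calling wav_sampling_rate on a few training files and on the inference output would show whether any file's rate disagrees with the config, which is one possible cause of speech that plays back too fast. Whether either point explains the skipped words in this case is not something the sketch can confirm.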