Feel free to join my Discord Server to discuss this model!
![Screenshot 2024-07-15 at 14 05 42](https://private-user-images.githubusercontent.com/400659/348855373-0b2aacfd-8b53-4370-9d42-d6733b46957c.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3MzkzNDAwOTMsIm5iZiI6MTczOTMzOTc5MywicGF0aCI6Ii80MDA2NTkvMzQ4ODU1MzczLTBiMmFhY2ZkLThiNTMtNDM3MC05ZDQyLWQ2NzMzYjQ2OTU3Yy5wbmc_WC1BbXotQWxnb3JpdGhtPUFXUzQtSE1BQy1TSEEyNTYmWC1BbXotQ3JlZGVudGlhbD1BS0lBVkNPRFlMU0E1M1BRSzRaQSUyRjIwMjUwMjEyJTJGdXMtZWFzdC0xJTJGczMlMkZhd3M0X3JlcXVlc3QmWC1BbXotRGF0ZT0yMDI1MDIxMlQwNTU2MzNaJlgtQW16LUV4cGlyZXM9MzAwJlgtQW16LVNpZ25hdHVyZT0xZWMzMTVkMjAxZTM1Yjg2YTRlOGU5ODQ0Mjk0MTAwYTU3M2ZkY2E2NjczN2UyOThhNDAwZWJlY2IwOWM0NzkyJlgtQW16LVNpZ25lZEhlYWRlcnM9aG9zdCJ9.wbObV3iBKwCTXCJqxbg1uJoec9GqQ80OXhW8RgkFBWw)
A foundational model for voice audio that can be effectively fine-tuned on a single GPU to implement text-to-speech, speech enhancement, and diarization. Based on the original SpeechFlow paper: Generative Pre-training for Speech with Flow Matching.
- Supervoice Enhance - cleanup of background noise
MIT