[Feature request] Return timestamps for TTS output #257

AlexLavSi · 2025-01-14T20:39:30Z

Hey, would you mind adding a new feature? It would be amazing if we could get timestamps for each sentence in the input text, maybe as a JSON file or something similar.

eginhard · 2025-01-14T21:21:44Z

It's not something we're planning to do ourselves in the near term, but I'd merge PRs adding this. If someone wants to work on it, best to submit a rough plan for feedback here first to agree on a common structure that would work with all Coqui models.

AlexLavSi · 2025-01-15T10:24:47Z

Thank you for your response.
I hope others will find this feature useful and that someone picks it up soon. Based on my needs, I would like to propose this JSON structure:

{
  "duration": 500,
  "timestamps": [
    { "text": "bla bla bla 1.", "start_time": 0.0, "end_time": 3.5 },
    ...
    { "text": "bla bla bla end.", "start_time": 480.7, "end_time": 500 }
  ]
}

The main text can be split into sentences, with one timestamp entry per sentence.
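
To make the proposal more concrete, here is a rough sketch of how such a structure could be assembled. It is only an illustration under assumptions: it does not use any Coqui API, the helper names `split_sentences` and `build_timestamps` are hypothetical, the sentence splitter is deliberately naive, and it assumes each sentence is synthesized separately so that its audio length in samples is known.

```python
import json
import re


def split_sentences(text):
    """Naive splitter (assumption): break after '.', '!' or '?' plus whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def build_timestamps(sentences, sample_counts, sample_rate):
    """Build the proposed structure from per-sentence audio lengths in samples.

    sample_counts[i] is the number of audio samples generated for sentences[i]
    (hypothetical input; a real model would have to expose this).
    """
    if len(sentences) != len(sample_counts):
        raise ValueError("need exactly one sample count per sentence")
    timestamps = []
    cursor = 0  # running position in samples, kept as an int to avoid float drift
    for sentence, n_samples in zip(sentences, sample_counts):
        start = cursor / sample_rate
        cursor += n_samples
        timestamps.append(
            {
                "text": sentence,
                "start_time": round(start, 3),
                "end_time": round(cursor / sample_rate, 3),
            }
        )
    return {"duration": round(cursor / sample_rate, 3), "timestamps": timestamps}


if __name__ == "__main__":
    sentences = split_sentences("Hello world. How are you? Fine!")
    # Made-up sample counts at a 22050 Hz sample rate, for illustration only.
    result = build_timestamps(sentences, [22050, 11025, 44100], 22050)
    print(json.dumps(result, indent=2))
```

All times here are in seconds. Any pauses inserted between sentences would need to be added to the running position as well.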

eginhard added the enhancement (New feature or request) label Jan 14, 2025

eginhard changed the title ~~[Feature request] add timeline~~ [Feature request] Return timestamps for TTS output Jan 14, 2025
