When infer() is called with multiple texts, all audio files are the same size #813

ericfarng · 2024-11-02T12:42:28Z

When calling infer() with multiple texts, all of the resulting audio files have the same length, even if the texts differ greatly in length.
I am on Windows, using the provided demo code.

fumiama · 2024-11-03T12:12:15Z

Vocos decodes a whole batch of audio at once and requires a matrix input, which means every audio in the batch must have the same length as the one for the longest sentence in the infer array. Users need to remove the trailing zero values from the output themselves. The program cannot remove them because stream mode also requires outputs of equal length.
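As a rough illustration of the trimming step described above, here is a minimal sketch using NumPy. It assumes each returned waveform is a NumPy array of float samples and that the padding consists of exact zeros at the end; `trim_trailing_zeros` and its `threshold` parameter are hypothetical names for this example, not part of ChatTTS.

```python
import numpy as np


def trim_trailing_zeros(wav, threshold=0.0):
    """Return `wav` without the run of (near-)zero samples at its end.

    Hypothetical helper, not part of ChatTTS. Samples whose absolute
    value is <= `threshold` count as padding.
    """
    wav = np.asarray(wav).reshape(-1)  # flatten (1, N) or (N,) to 1-D
    nonzero = np.flatnonzero(np.abs(wav) > threshold)
    if nonzero.size == 0:
        return wav[:0]  # all padding: return an empty array
    return wav[: nonzero[-1] + 1]  # keep everything up to the last real sample


# Stand-in for a padded batch: two clips stored in rows of equal length.
# With real output, this would be the list returned by infer() (assumption).
batch = np.zeros((2, 8), dtype=np.float32)
batch[0, :3] = [0.1, -0.2, 0.3]
batch[1, :6] = 0.5

trimmed = [trim_trailing_zeros(w) for w in batch]
print([len(w) for w in trimmed])
```

Note that a clip that genuinely ends in silence would also be shortened. Raising `threshold` slightly may help if the padding is not exactly zero after decoding, at the cost of cutting very quiet endings.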

ericfarng changed the title ~~When infer() with multiple text, all audio are the same file size~~ When infer() with multiple text, all audio are the same file size (当使用多个文本进行 infer() 时，所有音频的文件大小相同_ Nov 2, 2024

fumiama added the documentation, help wanted, and algorithm labels Nov 3, 2024
