TTS Hallucinations in shorter phrases #1695
Comments
Could you describe in detail how you tried it? Do you first generate
and then you invoke a second call to generate
or
? |
in the english example:
then tried 3 times the text:
I noticed that 2 out of 3 times it adds "sir" or "si" after the "hello" ("hello sir" or "hello si"), whereas in the French example it adds extra audio the very first time! |
Can you reproduce it with our APK? I think there is a bug in your apk if what you described can be reproduced with your APK. |
You are right, it doesn't happen with your APK. For me the problem isn't limited to the APK: it also happens inside Unity, using the code I shared earlier in the other thread. I need French models in particular, and the audio they add at the end is not normal. There are also models that don't work at all on short sentences (they generate only distorted audio), such as fr-FR_mls_medium.onnx. |
I just tried with your APK and I think there is a bug in your code. Please make sure you have overwritten the buffer from the previous call. Don't overwrite the buffer partially. |
Please don't use models containing I think I have deleted all models containing |
Or make sure you have cleared the buffer containing samples of the previous call before you play the samples of the current text. |
the buffer is cleared already:
Also, that applies to the English APK only. The French version behaves differently. Would you like me to provide an APK for French as well? |
Could you describe the differences? Does the APK for French use a different set of code from the APK for English? |
No, the code is the same, just with a different model and a different tokens file. What differs is the issue itself:
in French, the very first time I generate audio it hallucinates extra content at the end of the text, so it's not a buffer issue for French. I only mentioned the English APK because I thought the two were related. |
I don't see any issues from your posted code. |
Is each sentence processed sequentially, not in parallel? |
yes, sequentially, since the tts functions don't support streaming right now, it was the only option to make the generation faster |
No, we support passing a callback to C++. Inside C++, it processes the text sentence by sentence. After processing a sentence, the callback is invoked with the generated samples for this sentence. Please try our Android APK first. You will find it plays almost immediately no matter how long the given text is. Remember to use the TTS APK, not the TTS Engine APK. |
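To make the sentence-by-sentence callback flow described above concrete, here is a minimal sketch in C. It is not the sherpa-onnx implementation: `chunk_callback`, `fake_synthesize`, `generate_by_sentence`, and `count_samples` are hypothetical names invented for illustration, the "synthesizer" just emits one silent sample per input byte, and the rule that a callback returning 0 stops generation is an assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical callback type: receives the samples of one finished sentence.
 * In this sketch, returning 0 asks the generator to stop early. */
typedef int32_t (*chunk_callback)(const float *samples, int32_t n, void *user);

/* Stand-in for a real synthesizer: emits one silent sample per input byte. */
static int32_t fake_synthesize(const char *s, size_t len, float *out,
                               int32_t cap) {
  (void)s; /* the text content is ignored in this stand-in */
  int32_t n = 0;
  for (size_t i = 0; i < len && n < cap; ++i) out[n++] = 0.0f;
  return n;
}

/* Splits `text` at '.', '!' or '?' and invokes `cb` once per sentence, so the
 * caller can start playback before the whole text has been processed.
 * Returns the number of sentences delivered to the callback. */
int32_t generate_by_sentence(const char *text, chunk_callback cb, void *user) {
  assert(text != NULL && cb != NULL);
  float samples[256];
  int32_t count = 0;
  const char *start = text;
  for (const char *p = text;; ++p) {
    int at_end = (*p == '\0');
    if (at_end || *p == '.' || *p == '!' || *p == '?') {
      /* Include the punctuation mark, but never the terminating 0. */
      size_t len = (size_t)(p - start) + (at_end ? 0 : 1);
      if (len > 0) {
        int32_t n = fake_synthesize(start, len, samples, 256);
        ++count;
        if (cb(samples, n, user) == 0) break; /* caller asked to stop */
      }
      start = p + 1;
    }
    if (at_end) break;
  }
  return count;
}

/* Example callback: adds up the samples received so far and keeps going.
 * A real callback would hand the samples to the audio player instead. */
int32_t count_samples(const float *samples, int32_t n, void *user) {
  (void)samples;
  *(int32_t *)user += n;
  return 1;
}
```

Calling `generate_by_sentence("Hello. World!", count_samples, &total)` invokes the callback twice, once per sentence, which is why playback can begin after the first sentence rather than after the full text.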
Could you enable the debug in tts model config and post the logs when you generate samples? sherpa-onnx/sherpa-onnx/c-api/c-api.h Line 916 in 0cb2db3
|
i don't get any logs, that's the weird part, unity is not showing me any logs except the ones i made! am i doing something wrong?
|
IIRC, you posted some error logs in your first issue in the other session. How did you get them? |
From logcat, in an APK build. For some reason Unity doesn't show the errors directly. Hold tight, I will use logcat again. |
it's sherpa that's logging that yellow raw text warning, but i am unable to get its stack trace |
Please show the code for |
good morning, thank you for your reply!
|
hello @csukuangfj, any solution yet? |
The code looks correct. Can you reproduce it with our example code in the dotnet-examples folder? |
sorry but i wasn't able to do that, i kinda lack experience of coding outside of unity |
hello @csukuangfj , i managed to fix the error by modifying the generate function as follows,
I wasn't able to modify the NuGet package in Unity, so I had to replicate the offlineTTS class in Unity and then modify it. I hope you apply the same change to the generate function in the NuGet package (C# .NET example) so that other devs don't hit this error. Thank you for all your help and responsiveness! |
See also k2-fsa#1695 (comment) We need to place a 0 at the end of the buffer.
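The missing terminating 0 can be illustrated with a small C sketch. This is an assumption-laden illustration, not the code from the fix: `copy_without_terminator` and `copy_with_terminator` are hypothetical helpers, and the reused buffer stands in for whatever memory happens to follow an unterminated byte array. Native code that reads a C string keeps going until it finds a 0, so leftover bytes from an earlier, longer text can be read as part of the new one, which would be consistent with "hello" turning into "hello sir" or "hello si".

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies the bytes of `text` into `buf` WITHOUT a
 * terminating 0, the way an exact-length byte array looks to native code. */
void copy_without_terminator(char *buf, size_t cap, const char *text) {
  size_t len = strlen(text);
  assert(len <= cap);
  memcpy(buf, text, len);
}

/* Same copy, but places a 0 right after the text so that a native reader
 * stops at the end of the current text. */
void copy_with_terminator(char *buf, size_t cap, const char *text) {
  size_t len = strlen(text);
  assert(len + 1 <= cap);
  memcpy(buf, text, len);
  buf[len] = '\0';
}
```

With a buffer that previously held "hello sir", writing "hello" through `copy_without_terminator` leaves " sir" in place, and `strlen` on the buffer still reports 9 bytes. Writing it through `copy_with_terminator` cuts the string at 5 bytes.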
Great to hear you fixed it! Please see #1701 |
Hello @adem-rguez @csukuangfj ! First of all, sorry for being a newbie, I'm just getting started. I've seen this thread and I can't resist asking if you can help me because, even with the documentation and how detailed this thread is, I still can't get Unity to load the library. Although it's able to find it, I always get this error. I have downloaded the necessary I have declared the functions of the library. `using SherpaOnnx; public static class Sherpita
} I have a script to test the functionality in which I call a function. `using System; namespace SherpaOnnx
#if UNITY_STANDALONE_WIN || UNITY_EDITOR
#endif
} `using System; namespace SherpaOnnx
} In summary, I only have the .so files and those three scripts. Is there anything else I might be missing? Is there something wrong with the scripts that is preventing the library from loading correctly when I call a function? Thanks! |
I am running sherpa-onnx TTS in Unity (C#), and I have a problem where, for shorter sentences, the generated audio tends to contain extra gibberish at the end.
example long sentence (works fine) : "Bonjour monsieur, comment allez-vous aujourd’hui ? J’espère que vous passez une excellente journée !"
audio file: long sentence example
example short sentence (adds gibberish at the end): bonjour monsieur
audio file: short sentence example
In these examples I used the upmc voice for French, but the same issue exists with other models.
For example, with the libritts_r model, generating "hello sir" works, but when you generate "hello" immediately after it, the output sometimes includes the previous text or part of it ("hello sir" or "hello si").