However, it seems some Japanese words are incorrectly recognized.
If we say "こんにちは" (Konnichiwa), the Vosk SDK recognizes it as "今日は", which means "today is…" (Kyou wa).
"今日は" doesn't mean "こんにちは".
("今日は" can be pronounced "Konnichiwa", but it is usually pronounced "Kyou wa".)
The expected output is "こんにちは", not "今日は".
The problem looks like an inappropriate hiragana-to-kanji conversion. However, the Vosk model does not output hiragana and then convert it to kanji; it outputs kanji from the beginning, which is why this issue occurs.
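Since the kanji comes straight out of the model, one stopgap is to correct known cases on the application side. Below is a minimal sketch in Python, not a fix for the model itself. It assumes the recognizer result is a JSON string with a `text` field whose tokens are separated by spaces; `CORRECTIONS` and `normalize_result` are hypothetical names, and the table holds only the case from this report.

```python
import json

# Hypothetical correction table: recognized text (spaces removed) -> desired text.
# Only this one entry comes from the report; an app would extend it as needed.
CORRECTIONS = {
    "今日は": "こんにちは",
}


def normalize_result(result_json):
    """Return the recognized text with known whole-utterance misconversions fixed."""
    # Assumption: the result looks like {"text": "今日 は"}.
    text = json.loads(result_json).get("text", "")
    # Join the tokens so the utterance can be compared against the table.
    compact = "".join(text.split())
    # Replace only when the whole utterance matches, because "今日は" inside a
    # longer sentence can legitimately mean "today is".
    return CORRECTIONS.get(compact, compact)


if __name__ == "__main__":
    print(normalize_result('{"text": "今日 は"}'))
    print(normalize_result('{"text": "今日 は 暑い"}'))
```

Matching only the whole utterance is a deliberate choice here: it handles the greeting said on its own without rewriting sentences where "今日は" really is "Kyou wa". Joining tokens would also squash any Latin-script words together, so this is only suitable for Japanese-only output.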
There's a previous issue that we think is related: #1047
We'd like to attach a video demonstrating the issue here:
screen-20240611-151249.1.mp4
Could you take a look into this issue when you have some free time?
We really appreciate your help.
Hi, I'm having a similar issue in a Unity app I'm working on. I agree that it would be useful to have a Japanese model that outputs the phonetics instead of the kanji, especially when the words (kanji, or borrowed words in katakana) are not present in the word graph or deviate only slightly from words that are.
One example of these deviations I've encountered is a word like "固まった" (かたまった). It doesn't appear in the word graph for either the big or small model, but a similar variation, "固まっ", does. I then have to worry about what the model will do with the leftover "た" if another word comes after it.
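For anyone who wants to check their own word list the same way, here is a rough Python sketch. It assumes the model's vocabulary is available as a Kaldi-style symbol table (one `word id` pair per line, often a file named `words.txt`); whether a given model ships that file, and where, is an assumption to verify. `parse_symbol_table` and `missing_words` are hypothetical helper names.

```python
def parse_symbol_table(lines):
    """Collect the words from Kaldi-style symbol table lines ('<word> <id>')."""
    vocab = set()
    for line in lines:
        parts = line.split()
        if parts:  # skip blank lines
            vocab.add(parts[0])
    return vocab


def missing_words(candidates, vocab):
    """Return the candidate words that are not in the vocabulary, in order."""
    return [word for word in candidates if word not in vocab]


if __name__ == "__main__":
    # Tiny inline table for illustration; a real one would be read from a file,
    # e.g. with open(path_to_words_txt, encoding="utf-8") as f: parse_symbol_table(f)
    sample = ["<eps> 0", "固まっ 1", "た 2"]
    vocab = parse_symbol_table(sample)
    print(missing_words(["固まった", "固まっ"], vocab))
```

This only tells you which surface forms are absent. It does not say how the recognizer will segment a missing word, which is the part I'm worried about.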
I would really appreciate it as well if it were possible to have an option to output the phonetics of what the user said instead of the kanji conversion.
@nshmyrev We tested the Japanese small model, which can be downloaded from https://alphacephei.com/vosk/models/vosk-model-small-ja-0.22.zip, in an Android app.