dietmar added a commit to dietmar/german-ipa-dict that referenced this issue on Oct 25, 2023
Hi and thanks a lot for this repository!
I'd like to suggest keeping the pronunciation variants for the same word/entry in the same order as they appear on Wiktionary, because that order typically reflects how common each pronunciation is (most common first).
I came across this with the word "Storys": https://de.wiktionary.org/wiki/Storys
On that Wiktionary page we have this:
IPA: [ˈstɔːʁɪz], [ˈstɔʁis], [ˈstoːʁis], [ˈstɔʁiːsː], [ˈstoːʁiːsː], auch: [ˈʃtɔːʁɪz], [ˈʃtɔʁis], [ˈʃtoːʁis], [ˈʃtɔʁiːsː], [ˈʃtoːʁiːsː]
This repository's de_dewikt.csv has:
To my ear (Austrian German), the variants with oː feel very strange. But they come first, because of the final sorting that happens in extract_de_ipa.sh:
I think it would be a good idea to keep the original order. For my use of this list, I need exactly one pronunciation per word. Always taking the first one in the list would work better after this change (knowing, of course, that the information on Wiktionary can still be questionable).
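To illustrate the idea, here is a minimal Python sketch of order-preserving de-duplication. It is not the code in extract_de_ipa.sh. The function name `group_variants` and the `(word, ipa)` row shape are assumptions made for this example, and the variants are copied from the Wiktionary line quoted above. The plain `sorted()` call only stands in for "some final sort"; the script's actual sort may order things differently.

```python
def group_variants(rows):
    """Group (word, ipa) rows by word, keeping first-seen order and dropping repeats.

    `rows` is assumed to arrive in the order the variants appear on Wiktionary.
    """
    grouped = {}  # dicts keep insertion order in Python 3.7+
    for word, ipa in rows:
        variants = grouped.setdefault(word, [])
        if ipa not in variants:  # skip duplicates without reordering
            variants.append(ipa)
    return grouped


# Variants for "Storys", in the order quoted from Wiktionary above.
STORYS = [
    "ˈstɔːʁɪz", "ˈstɔʁis", "ˈstoːʁis", "ˈstɔʁiːsː", "ˈstoːʁiːsː",
    "ˈʃtɔːʁɪz", "ˈʃtɔʁis", "ˈʃtoːʁis", "ˈʃtɔʁiːsː", "ˈʃtoːʁiːsː",
]

rows = [("Storys", ipa) for ipa in STORYS]
rows.append(("Storys", "ˈstɔʁis"))  # a repeated row, to show de-duplication

grouped = group_variants(rows)

# Taking the first entry now yields the variant Wiktionary lists first.
first_in_source_order = grouped["Storys"][0]

# For contrast: a plain code-point sort moves an "oː" variant to the front.
first_after_sorting = sorted(grouped["Storys"])[0]

print(first_in_source_order)
print(first_after_sorting)
```

With the source order kept, "take the first pronunciation" picks whatever Wiktionary lists first, while duplicates are still removed.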