癿 missing #19

Open
katpatuka opened this issue Jul 6, 2016 · 7 comments

Comments

@katpatuka

The character 癿 (qié) is missing. It appears in the zh.wikipedia articles 癿扎乡 and 癿藏镇.

@fifieldt
Contributor

fifieldt commented Jul 6, 2016

The character is in the Unicode (Unihan) data:
http://www.unicode.org/cgi-bin/GetUnihanData.pl?codepoint=U%2B767F

Unihan_Readings.txt:U+767F kHanyuPinyin 42643.030:qié
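
For anyone checking other characters the same way, here is a minimal sketch of pulling the reading out of a line like the one above. The function name `parse_hanyu_pinyin` is made up for illustration and is not part of this project; the sketch assumes the field layout shown above (code point, field name, then `location:reading` groups) and accepts either spaces or tabs between fields.

```python
import re

# Matches one data line of Unihan_Readings.txt: code point, field name, value.
LINE_RE = re.compile(r"^U\+([0-9A-F]{4,6})\s+(\w+)\s+(.+)$")


def parse_hanyu_pinyin(line):
    """Return (character, [readings]) for a kHanyuPinyin line, else None.

    Hypothetical helper, not taken from the project's import scripts.
    """
    match = LINE_RE.match(line.strip())
    if match is None or match.group(2) != "kHanyuPinyin":
        return None
    char = chr(int(match.group(1), 16))
    readings = []
    # The value is one or more "locations:readings" groups separated by spaces;
    # readings within a group are separated by commas.
    for group in match.group(3).split():
        _, _, pinyin = group.partition(":")
        for reading in pinyin.split(","):
            if reading and reading not in readings:
                readings.append(reading)
    return char, readings


print(parse_hanyu_pinyin("U+767F kHanyuPinyin 42643.030:qié"))
```

Comment lines and other fields return `None`, so the function can be mapped over the whole file and the `None` results dropped.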

However, it is not in the CEDICT data, so there are no translations for it.

As a start, adding 癿扎乡 and 癿藏镇 to CEDICT:

https://cc-cedict.org/editor/editor.php

would be helpful!

@katpatuka
Author

If only I could speak and write Chinese! ;)

@fifieldt
Contributor

fifieldt commented Jul 6, 2016

:)

Indeed! I think that even though there are no translations, the pinyin should still show. Looking into why it isn't showing up.

@fifieldt
Contributor

fifieldt commented Jul 6, 2016

Found another qie that isn't showing up: 㚗

@fifieldt
Contributor

fifieldt commented Jul 6, 2016

Confirmed: /convert/?c=癿 returns nothing in the pinyin array, so the problem isn't in the JavaScript/display layer but in the service.
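
A rough way to repeat that check from a script, as a sketch only. The helper names, the `http://localhost:8000` base URL, and the assumption that the response is a JSON object with a top-level `pinyin` key are all guesses for illustration; only the `/convert/?c=` path and the existence of a pinyin array come from the observation above.

```python
import json
from urllib.parse import quote


def convert_url(base_url, text):
    """Build the /convert/ URL for some text, percent-encoding it as UTF-8."""
    return f"{base_url.rstrip('/')}/convert/?c={quote(text)}"


def has_pinyin(response_body):
    """True if the JSON body holds a non-empty "pinyin" array.

    Assumes the body is a JSON object and that the key is named "pinyin".
    """
    payload = json.loads(response_body)
    return bool(payload.get("pinyin"))


# Hypothetical base URL; substitute wherever the service actually runs.
print(convert_url("http://localhost:8000", "癿"))
# A body with an empty array is what a missing character would look like
# under the assumed response shape.
print(has_pinyin('{"pinyin": []}'))
```

Keeping the URL building and the response check as separate pure functions means the check can be exercised against saved responses without hitting the service.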

@fifieldt
Contributor

fifieldt commented Jul 6, 2016

Yup, so there's nothing in the database for it:

mysql> SELECT definitions.* FROM definitions WHERE (characters_simplified like '癿%' or characters_traditional like '癿%') ORDER BY length(characters_simplified) desc, "primary" desc
-> ;
Empty set (0.00 sec)

@fifieldt
Contributor

fifieldt commented Jul 6, 2016

OK, just re-read the data import scripts.

Basically, unless CEDICT has a definition for a character, no data at all is added to the database for it.

To always display at least the pinyin for every character, we should change the import script to add a blank definition for each character that doesn't have one, using the data in Unihan_Readings.txt.
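
A minimal sketch of what that import step might look like, not the actual import script. It uses an in-memory SQLite table as a stand-in for the MySQL database. The `definitions` table and the `characters_simplified` / `characters_traditional` columns come from the query earlier in this thread; the `pinyin` and `definition` columns, the function name, and storing the same character in both the simplified and traditional columns are assumptions.

```python
import sqlite3


def add_blank_definitions(conn, readings):
    """Insert a definition-less row for each character missing from the table.

    `readings` maps a character to its pinyin, e.g. parsed from
    Unihan_Readings.txt. Returns the list of characters that were added.
    """
    added = []
    for char, pinyin in readings.items():
        exists = conn.execute(
            "SELECT 1 FROM definitions"
            " WHERE characters_simplified = ? OR characters_traditional = ? LIMIT 1",
            (char, char),
        ).fetchone()
        if exists is None:
            # Assumption: with no variant data, use the character for both columns.
            conn.execute(
                "INSERT INTO definitions"
                " (characters_simplified, characters_traditional, pinyin, definition)"
                " VALUES (?, ?, ?, '')",
                (char, char, pinyin),
            )
            added.append(char)
    conn.commit()
    return added


# In-memory stand-in for the real database; pinyin/definition columns are assumed.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE definitions (characters_simplified TEXT,"
    " characters_traditional TEXT, pinyin TEXT, definition TEXT)"
)
# Sample row standing in for a character that CEDICT already covers.
conn.execute("INSERT INTO definitions VALUES ('好', '好', 'hǎo', 'good')")
print(add_blank_definitions(conn, {"癿": "qié", "好": "hǎo"}))
```

Checking for an existing row first keeps the step safe to re-run and leaves characters that already have CEDICT definitions untouched.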
