dictionary db question #17

Open
rowanlend opened this issue Aug 4, 2022 · 1 comment
@rowanlend

Hello - saw this initially on HN and was curious about the data.

Apple's Mac comes with a native dictionary that has about 80,000 words. I opened the SQLite .db file in this project to do some quick comparisons and noticed that it contains about 52,000 words.
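
For anyone who wants to repeat the comparison, here is a rough sketch of how the count could be taken with Python's built-in `sqlite3` module. The table name `words` and column name `word` are assumptions made for illustration; the real schema of the .db file in the repo may differ, so check it first (for example with `.schema` in the `sqlite3` shell).

```python
import sqlite3


def count_words(conn, table, column, skip_hyphenated=False):
    """Count distinct entries (ASCII case-insensitive) in a word column.

    `table` and `column` are placeholders supplied by the caller; they are
    not taken from the project's actual schema.
    """
    # Identifiers cannot be bound as SQL parameters, so quote them by hand.
    quoted_table = '"' + table.replace('"', '""') + '"'
    quoted_column = '"' + column.replace('"', '""') + '"'
    query = f"SELECT COUNT(DISTINCT lower({quoted_column})) FROM {quoted_table}"
    if skip_hyphenated:
        # Optionally leave out hyphenated entries such as "well-being".
        query += f" WHERE {quoted_column} NOT LIKE '%-%'"
    return conn.execute(query).fetchone()[0]


if __name__ == "__main__":
    # Tiny in-memory stand-in for the real dictionary file.
    demo = sqlite3.connect(":memory:")
    demo.execute("CREATE TABLE words (word TEXT)")
    demo.executemany(
        "INSERT INTO words (word) VALUES (?)",
        [("apple",), ("Apple",), ("well-being",), ("pear",)],
    )
    print(count_words(demo, "words", "word"))
    print(count_words(demo, "words", "word", skip_hyphenated=True))
```

To run it against the real file, replace the in-memory connection with `sqlite3.connect("path/to/file.db")` and pass whatever table and column names the schema actually uses.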

I'm curious about the discrepancy, since the source data you cite, freeDictionaryAPI, has about 220,000 words. From a quick spot check, I noticed your list doesn't include hyphenated words, which I think is great, but a fairly comprehensive dictionary source would eventually be a great asset in general.

It'd be great to understand how or why you pared down the list of words to what you currently have now.

Either way, thanks for putting this together!

@zehfernandes
Owner

Hi @grepsci, good question.

For a reason I haven't figured out yet (I've already opened an issue with Expo to investigate it: expo/expo#18479), copying the offline DB to the proper folder on iOS takes a long time. Because of that, I reduced the file size by keeping only the most common words.

So right now, I have two datasets:

  • Android is using a Wikidictionary version with 136,338 words
  • iOS is using a reduced version (the one that is in the repo) with 52,000 words
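
As an illustration only, one way to pare a word list down to its most common entries is to filter it against a frequency-ranked list. The thread does not say how the reduced iOS dataset was actually produced, so the function name, the ranked list, and the cutoff below are all hypothetical.

```python
def keep_most_common(definitions, ranked_words, limit):
    """Return at most `limit` entries, taken in frequency order.

    definitions:  dict mapping word -> definition (the full dataset).
    ranked_words: iterable of words, most common first (hypothetical
                  frequency list; not something the thread provides).
    limit:        maximum number of entries to keep.
    """
    kept = {}
    for word in ranked_words:
        if len(kept) >= limit:
            break
        # Skip ranked words with no entry, and ignore repeats in the ranking.
        if word in definitions and word not in kept:
            kept[word] = definitions[word]
    return kept


if __name__ == "__main__":
    full = {"the": "article", "cat": "animal", "zymurgy": "fermentation study"}
    ranking = ["the", "of", "cat", "zymurgy"]
    print(keep_most_common(full, ranking, 2))
```

With a real dataset, `limit` would be tuned until the resulting file copies quickly enough on iOS.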