Convert: DAB index #682

Open
chrisvire opened this issue Sep 13, 2024 · 0 comments

chrisvire commented Sep 13, 2024

DAB can have multiple reversal indexes. The dictionary has a default (vernacular) language, which is the language being documented. There is also an analysis language, which is the language used to enter the notes, glosses, and definitions.

A reversal index maps glosses back to the head words in the dictionary. The data.sqlite database has an entries table with id and name fields. The reversal index, stored in a file named lexicon-[langid].idx, lists glosses in a related language (often the analysis language), each paired with the id of the matching row in the entries table. It is a tab-separated values (TSV) file.

Here is the top of an example file:

abon ikan	1471
abu (sisa pembakaran)	1050
ada	2345
adat-istiadat	768
adu	604
aduh	0
advokat	16

Multiple words in the vernacular language can map to the same word in the reversal language. In that case, the second column is a comma-separated list of indexes (ids from the entries table).
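As a rough illustration of the file format described above, here is a minimal TypeScript sketch that parses the text of a reversal index into a mapping from gloss to a list of entry ids. The function name `parseReversalIndex` is hypothetical, and the sketch assumes one gloss per line with the ids after the last tab. The sample line with two ids is made up to show the comma-separated case.

```typescript
// Hypothetical helper: parse the text of a lexicon-[langid].idx file.
// Assumes each line is "gloss<TAB>ids", where ids is a comma-separated
// list of integer ids from the entries table.
function parseReversalIndex(text: string): Record<string, number[]> {
    // A prototype-free object avoids clashes with glosses like "constructor".
    const result: Record<string, number[]> = Object.create(null);
    for (const line of text.split(/\r?\n/)) {
        if (line.trim() === '') continue; // skip blank lines
        const tab = line.lastIndexOf('\t');
        if (tab < 0) continue; // skip malformed lines that have no tab
        const gloss = line.slice(0, tab);
        const ids = line
            .slice(tab + 1)
            .split(',')
            .map((s) => parseInt(s.trim(), 10))
            .filter((n) => !isNaN(n));
        // If a gloss appears on more than one line, merge the id lists.
        result[gloss] = (result[gloss] || []).concat(ids);
    }
    return result;
}

// Illustrative input; the "ada" line with two ids is invented for the example.
const sampleIndexText = 'abon ikan\t1471\nabu (sisa pembakaran)\t1050\nada\t2345,604\n';
const parsedIndex = parseReversalIndex(sampleIndexText);
console.log(JSON.stringify(parsedIndex));
```

Splitting on the last tab rather than the first is a defensive choice in case a gloss ever contains a tab; whether that can happen in real files is not known from this issue.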

When the index is displayed, each reversal word is shown with its list of vernacular words below it. If a vernacular word has a non-zero homonym_index, that number is displayed as a superscript after the word.
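The display rule for homonyms could be sketched as follows. This is only an illustration: the `VernacularEntry` shape and the function names are hypothetical, and rendering the superscript with an HTML `<sup>` element is an assumption, since the issue does not say how the PWA renders it.

```typescript
// Hypothetical shape of one vernacular word under a reversal word.
interface VernacularEntry {
    name: string;
    homonym_index: number; // 0 means "no homonym number to show"
}

// Escape the characters that matter when placing text inside HTML.
function escapeHtml(text: string): string {
    return text.replace(/&/g, '&amp;').replace(/</g, '&lt;').replace(/>/g, '&gt;');
}

// Return the vernacular word, followed by a superscript homonym number
// only when homonym_index is non-zero.
function formatVernacularWord(entry: VernacularEntry): string {
    const word = escapeHtml(entry.name);
    return entry.homonym_index !== 0 ? `${word}<sup>${entry.homonym_index}</sup>` : word;
}
```

In a component-based UI the same condition would more likely live in the template than in a string builder; the function form is used here only to keep the example self-contained.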

There can be multiple reversal indexes.

The task is to convert each reversal index to a format that is useful to the PWA.

The naive approach would be to convert each index to a JSON file that maps each index word to an array of objects with the vernacular index, name, and homonym_index. Performance might not be good with 100K entries.
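A minimal sketch of that naive conversion is shown below. It assumes the index has already been parsed into a mapping from gloss to ids and that the rows of the entries table have been read into memory; reading data.sqlite is left out so the example has no database dependency. The names `EntryRow`, `ReversalTarget`, and `buildReversalJson` are hypothetical, and the assumption that homonym_index is available alongside id and name is taken from the field list above rather than from a confirmed schema.

```typescript
// Hypothetical row shape from the entries table in data.sqlite.
interface EntryRow {
    id: number;
    name: string;
    homonym_index: number;
}

// One object per vernacular word listed under an index word.
interface ReversalTarget {
    index: number;
    name: string;
    homonym_index: number;
}

// Build the naive JSON structure: index word -> array of target objects.
function buildReversalJson(
    index: Record<string, number[]>,
    entries: EntryRow[]
): Record<string, ReversalTarget[]> {
    // Look up entries by id once, instead of scanning the array per gloss.
    const byId: Record<number, EntryRow> = Object.create(null);
    for (const row of entries) byId[row.id] = row;

    const out: Record<string, ReversalTarget[]> = Object.create(null);
    for (const gloss of Object.keys(index)) {
        const targets: ReversalTarget[] = [];
        for (const id of index[gloss] || []) {
            const row = byId[id];
            if (row === undefined) continue; // id with no matching entry
            targets.push({ index: row.id, name: row.name, homonym_index: row.homonym_index });
        }
        out[gloss] = targets;
    }
    return out;
}

// Usage with made-up entry names (the real vernacular words are not in the issue).
const exampleJson = JSON.stringify(
    buildReversalJson(
        { 'abon ikan': [1471], ada: [2345] },
        [
            { id: 1471, name: 'placeholder-1471', homonym_index: 0 },
            { id: 2345, name: 'placeholder-2345', homonym_index: 1 }
        ]
    )
);
console.log(exampleJson);
```

Because the whole mapping ends up in a single JSON document, the client would need to load and parse all of it before showing any index word, which is the performance concern raised above for large dictionaries.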

@chrisvire chrisvire converted this from a draft issue Sep 13, 2024
@chrisvire chrisvire changed the title Convert: index DAB: Convert: index Sep 13, 2024
@chrisvire chrisvire changed the title DAB: Convert: index Convert: DAB index Sep 24, 2024
Projects
Status: In progress