diff --git a/.github/ISSUE_TEMPLATE/feature-request.md b/.github/ISSUE_TEMPLATE/feature-request.md index 47a68f5f5..6620429da 100644 --- a/.github/ISSUE_TEMPLATE/feature-request.md +++ b/.github/ISSUE_TEMPLATE/feature-request.md @@ -1,18 +1,11 @@ -______________________________________________________________________ +--- -name: Feature request -about: Suggest/request a feature (new format, option, parameter etc) -title: '' -labels: +name: Feature request about: Suggest/request a feature (new format, option, parameter etc) title: '' labels: -______________________________________________________________________ +--- -**Is your feature request related to a problem? Please describe.** -A clear and concise description of what the problem is. +**Is your feature request related to a problem? Please describe.** A clear and concise description of what the problem is. -**Describe the solution you'd like** -A clear and concise description of what you want to happen. +**Describe the solution you'd like** A clear and concise description of what you want to happen. -**Provide links and sample file(s)** -Provide links to the official website and/or download page of the related software or format. -Provide sample file(s) for the format/feature you want to be supported. Attach the file(s) if you can. If no sample file is publicly downloadable due to copyright, please mention and explain. +**Provide links and sample file(s)** Provide links to the official website and/or download page of the related software or format. Provide sample file(s) for the format/feature you want to be supported. Attach the file(s) if you can. If no sample file is publicly downloadable due to copyright, please mention and explain. diff --git a/CODE_OF_CONDUCT.md b/CODE_OF_CONDUCT.md index 472b6157c..0c7df53f0 100644 --- a/CODE_OF_CONDUCT.md +++ b/CODE_OF_CONDUCT.md @@ -1,6 +1,6 @@ The PyGlossary code of conduct is derived from [The Ruby Community Conduct Guideline](https://www.ruby-lang.org/en/conduct/). 
-- Participants will be tolerant of opposing views. -- Participants must ensure that their language and actions are free of personal attacks and disparaging personal remarks. -- When interpreting the words and actions of others, participants should always assume good intentions. -- Behavior that can be reasonably considered harassment will not be tolerated. +- Participants will be tolerant of opposing views. +- Participants must ensure that their language and actions are free of personal attacks and disparaging personal remarks. +- When interpreting the words and actions of others, participants should always assume good intentions. +- Behavior that can be reasonably considered harassment will not be tolerated. diff --git a/README.md b/README.md index 136de8063..23b7abb24 100644 --- a/README.md +++ b/README.md @@ -1,269 +1,251 @@ -# PyGlossary +PyGlossary +========== A tool for converting dictionary files aka glossaries. -The primary purpose is to be able to use our offline glossaries in any Open -Source dictionary we like on any OS/device. +The primary purpose is to be able to use our offline glossaries in any Open Source dictionary we like on any OS/device. -There are countless formats, and my time is limited, so I implement formats that -seem more useful for myself, or for Open Source community. Also diversity of -languages is taken into account. Pull requests are welcome. +There are countless formats, and my time is limited, so I implement formats that seem more useful for myself, or for Open Source community. Also diversity of languages is taken into account. Pull requests are welcome. 
-## Screenshots +Screenshots +----------- Linux - Gtk3-based interface -______________________________________________________________________ +--- Windows - Tkinter-based interface -______________________________________________________________________ +--- Linux - command-line interface -______________________________________________________________________ +--- Android Termux - interactive command-line interface -## Supported formats -
-| Format | | Extension | Read | Write |
-| ------------------------------------------------------- | :-: | :-------------: | :--: | :---: |
-| [Aard 2 (slob)](./doc/p/aard2_slob.md) | 🔢 | .slob | ✔ | ✔ |
-| [ABBYY Lingvo DSL](./doc/p/dsl.md) | 📝 | .dsl | ✔ | |
-| [Almaany.com](./doc/p/almaany.md) (SQLite3, Arabic) | 🔢 | .db | ✔ | |
-| [AppleDict Binary](./doc/p/appledict_bin.md) | 📁 | .dictionary | ✔ | ❌ |
-| [AppleDict Source](./doc/p/appledict.md) | 📁 | | | ✔ |
-| [Babylon BGL](./doc/p/babylon_bgl.md) | 🔢 | .bgl | ✔ | ❌ |
-| [cc-kedict](./doc/p/cc_kedict.md) (Korean) | 📝 | | ✔ | ❌ |
-| [CSV](./doc/p/csv.md) | 📝 | .csv | ✔ | ✔ |
-| [Dict.cc](./doc/p/dict_cc.md) (SQLite3, German) | 🔢 | .db | ✔ | |
-| [DICT.org / Dictd server](./doc/p/dict_org.md) | 📁 | (📝.index) | ✔ | ✔ |
-| [DICT.org / dictfmt source](./doc/p/dict_org_source.md) | 📝 | (.dtxt) | | ✔ |
-| [dictunformat output file](./doc/p/dictunformat.md) | 📝 | (.dictunformat) | ✔ | |
-| [DictionaryForMIDs](./doc/p/dicformids.md) | 📁 | (📁.mids) | ✔ | ✔ |
-| [DigitalNK](./doc/p/digitalnk.md) (SQLite3, N-Korean) | 🔢 | .db | ✔ | |
-| [DIKT JSON](./doc/p/dikt_json.md) | 📝 | (.json) | | ✔ |
-| [EDICT2 (CEDICT)](./doc/p/edict2.md) (Chinese) | 📝 | (.u8) | ✔ | ❌ |
-| [EDLIN](./doc/p/edlin.md) | 📁 | .edlin | ✔ | ✔ |
-| [EPUB-2 E-Book](./doc/p/epub2.md) | 📦 | .epub | ❌ | ✔ |
-| [FreeDict](./doc/p/freedict.md) | 📝 | .tei | ✔ | ❌ |
-| [Gettext Source](./doc/p/gettext_po.md) | 📝 | .po | ✔ | ✔ |
-| [HTML Directory (by file size)](./doc/p/html_dir.md) | 📁 | | ❌ | ✔ |
-| [JMDict](./doc/p/jmdict.md) (Japanese) | 📝 | | ✔ | ❌ |
-| [JSON](./doc/p/json.md) | 📝 | .json | | ✔ |
-| [Kobo E-Reader Dictionary](./doc/p/kobo.md) | 📦 | .kobo.zip | ❌ | ✔ |
-| [Kobo E-Reader Dictfile](./doc/p/kobo_dictfile.md) | 📝 | .df | ✔ | ✔ |
-| [Lingoes Source](./doc/p/lingoes_ldf.md) | 📝 | .ldf | ✔ | ✔ |
-| [Mobipocket E-Book](./doc/p/mobi.md) | 🔢 | .mobi | ❌ | ✔ |
-| [Octopus MDict](./doc/p/octopus_mdict.md) | 🔢 | .mdx | ✔ | ❌ |
-| [QuickDic version 6](./doc/p/quickdic6.md) | 📝 | .quickdic | ✔ | ✔ |
-| [SQL](./doc/p/sql.md) | 📝 | .sql | ❌ | ✔ |
-| [StarDict](./doc/p/stardict.md) | 📁 | (📝.ifo) | ✔ | ✔ |
-| [StarDict Textual File](./doc/p/stardict_textual.md) | 📝 | (.xml) | ✔ | ✔ |
-| [Tabfile](./doc/p/tabfile.md) | 📝 | .txt, .tab | ✔ | ✔ |
-| [Wiktextract](./doc/p/wiktextract.md) | 📝 | .jsonl | ✔ | |
-| [Wordset.org](./doc/p/wordset.md) | 📁 | | ✔ | |
-| [XDXF](./doc/p/xdxf.md) | 📝 | .xdxf | ✔ | ❌ |
-| [Yomichan](./doc/p/yomichan.md) | 📦 | (.zip) | | ✔ |
-| [Zim (Kiwix)](./doc/p/zim.md) | 🔢 | .zim | ✔ | |
+Supported formats +----------------- +
+| Format | | Extension | Read | Write |
+|---------------------------------------------------------|:--:|:---------------:|:----:|:-----:|
+| [Aard 2 (slob)](./doc/p/aard2_slob.md) | 🔢 | .slob | ✔ | ✔ |
+| [ABBYY Lingvo DSL](./doc/p/dsl.md) | 📝 | .dsl | ✔ | |
+| [Almaany.com](./doc/p/almaany.md) (SQLite3, Arabic) | 🔢 | .db | ✔ | |
+| [AppleDict Binary](./doc/p/appledict_bin.md) | 📁 | .dictionary | ✔ | ❌ |
+| [AppleDict Source](./doc/p/appledict.md) | 📁 | | | ✔ |
+| [Babylon BGL](./doc/p/babylon_bgl.md) | 🔢 | .bgl | ✔ | ❌ |
+| [cc-kedict](./doc/p/cc_kedict.md) (Korean) | 📝 | | ✔ | ❌ |
+| [CSV](./doc/p/csv.md) | 📝 | .csv | ✔ | ✔ |
+| [Dict.cc](./doc/p/dict_cc.md) (SQLite3, German) | 🔢 | .db | ✔ | |
+| [DICT.org / Dictd server](./doc/p/dict_org.md) | 📁 | (📝.index) | ✔ | ✔ |
+| [DICT.org / dictfmt source](./doc/p/dict_org_source.md) | 📝 | (.dtxt) | | ✔ |
+| [dictunformat output file](./doc/p/dictunformat.md) | 📝 | (.dictunformat) | ✔ | |
+| [DictionaryForMIDs](./doc/p/dicformids.md) | 📁 | (📁.mids) | ✔ | ✔ |
+| [DigitalNK](./doc/p/digitalnk.md) (SQLite3, N-Korean) | 🔢 | .db | ✔ | |
+| [DIKT JSON](./doc/p/dikt_json.md) | 📝 | (.json) | | ✔ |
+| [EDICT2 (CEDICT)](./doc/p/edict2.md) (Chinese) | 📝 | (.u8) | ✔ | ❌ |
+| [EDLIN](./doc/p/edlin.md) | 📁 | .edlin | ✔ | ✔ |
+| [EPUB-2 E-Book](./doc/p/epub2.md) | 📦 | .epub | ❌ | ✔ |
+| [FreeDict](./doc/p/freedict.md) | 📝 | .tei | ✔ | ❌ |
+| [Gettext Source](./doc/p/gettext_po.md) | 📝 | .po | ✔ | ✔ |
+| [HTML Directory (by file size)](./doc/p/html_dir.md) | 📁 | | ❌ | ✔ |
+| [JMDict](./doc/p/jmdict.md) (Japanese) | 📝 | | ✔ | ❌ |
+| [JSON](./doc/p/json.md) | 📝 | .json | | ✔ |
+| [Kobo E-Reader Dictionary](./doc/p/kobo.md) | 📦 | .kobo.zip | ❌ | ✔ |
+| [Kobo E-Reader Dictfile](./doc/p/kobo_dictfile.md) | 📝 | .df | ✔ | ✔ |
+| [Lingoes Source](./doc/p/lingoes_ldf.md) | 📝 | .ldf | ✔ | ✔ |
+| [Mobipocket E-Book](./doc/p/mobi.md) | 🔢 | .mobi | ❌ | ✔ |
+| [Octopus MDict](./doc/p/octopus_mdict.md) | 🔢 | .mdx | ✔ | ❌ |
+| [QuickDic version 6](./doc/p/quickdic6.md) | 📝 | .quickdic | ✔ | ✔ |
+| [SQL](./doc/p/sql.md) | 📝 | .sql | ❌ | ✔ |
+| [StarDict](./doc/p/stardict.md) | 📁 | (📝.ifo) | ✔ | ✔ |
+| [StarDict Textual File](./doc/p/stardict_textual.md) | 📝 | (.xml) | ✔ | ✔ |
+| [Tabfile](./doc/p/tabfile.md) | 📝 | .txt, .tab | ✔ | ✔ |
+| [Wiktextract](./doc/p/wiktextract.md) | 📝 | .jsonl | ✔ | |
+| [Wordset.org](./doc/p/wordset.md) | 📁 | | ✔ | |
+| [XDXF](./doc/p/xdxf.md) | 📝 | .xdxf | ✔ | ❌ |
+| [Yomichan](./doc/p/yomichan.md) | 📦 | (.zip) | | ✔ |
+| [Zim (Kiwix)](./doc/p/zim.md) | 🔢 | .zim | ✔ | |
 Legend:
-- 📁 Directory
-- 📝 Text file
-- 📦 Package/archive file
-- 🔢 Binary file
-- ✔ Supported
-- ❌ Will not be supported
- -**Note**: SQLite-based formats are not detected by extension (`.db`); -So you need to select the format (with UI or `--read-format` flag). -**Also don't confuse SQLite-based formats with [SQLite mode](#sqlite-mode).** - -## Requirements - -PyGlossary requires **Python 3.10 or higher**, and works in practically all -modern operating systems. While primarily designed for *GNU/Linux*, it works -on *Windows*, *Mac OS X* and other Unix-based operating systems as well. - -As shown in the screenshots, there are multiple User Interface types (multiple -ways to use the program). - -- **Gtk3-based interface**, uses [PyGI (Python Gobject Introspection)](http://pygobject.readthedocs.io/en/latest/getting_started.html) - You can install it on: - - - Debian/Ubuntu: `apt install python3-gi python3-gi-cairo gir1.2-gtk-3.0` - - openSUSE: `zypper install python3-gobject gtk3` - - Fedora: `dnf install pygobject3 python3-gobject gtk3` - - ArchLinux: - - `pacman -S python-gobject gtk3` - - https://aur.archlinux.org/packages/pyglossary/ - - Mac OS X: `brew install pygobject3 gtk+3` - - Nix / NixOS: `nix-shell -p pkgs.gobject-introspection python38Packages.pygobject3 python38Packages.pycairo` - -- **Tkinter-based interface**, works in the lack of Gtk. Specially on - Windows where Tkinter library is installed with the Python itself. 
- You can also install it on: - - - Debian/Ubuntu: `apt-get install python3-tk tix` - - openSUSE: `zypper install python3-tk tix` - - Fedora: `yum install python3-tkinter tix` - - Mac OS X: read - - Nix / NixOS: `nix-shell -p python38Packages.tkinter tix` - -- **Command-line interface**, works in all operating systems without - any specific requirements, just type: - - `python3 main.py --help` - - - **Interactive command-line interface** - - Requires: `pip install prompt_toolkit` - - Perfect for mobile devices (like Termux on Android) where no GUI is available - - Automatically selected if output file argument is not passed **and** one of these: - - On Linux and `$DISPLAY` environment variable is empty or not set - - For example when you are using a remote Linux machine over SSH - - On Mac and no `tkinter` module is found - - Manually select with `--cmd` or `--ui=cmd` - - Minimally: `python3 main.py --cmd` - - You can still pass input file, or any flag/option - - If both input and output files are passed, non-interactive cmd ui will be default - - If you are writing a script, you can pass `--no-interactive` to force disable interactive ui - - Then you have to pass both input and output file arguments - - Don't forget to use *Up/Down* or *Tab* keys in prompts! - - Up/Down key shows you recent values you have used - - Tab key shows available values/options - - You can press Control+C (on Linux/Windows) at any prompt to exit - -## UI (User Interface) selection - -When you run PyGlossary without any command-line arguments or options/flags, -PyGlossary tries to find PyGI and open the Gtk3-based interface. If it fails, -it tries to find Tkinter and open the Tkinter-based interface. If that fails, -it tries to find `prompt_toolkit` and run interactive command-line interface. -And if none of these libraries are found, it exits with an error. 
+- 📁 Directory
+- 📝 Text file
+- 📦 Package/archive file
+- 🔢 Binary file
+- ✔ Supported
+- ❌ Will not be supported
+ +**Note**: SQLite-based formats are not detected by extension (`.db`); So you need to select the format (with UI or `--read-format` flag). **Also don't confuse SQLite-based formats with [SQLite mode](#sqlite-mode).** + +Requirements +------------ + +PyGlossary requires **Python 3.10 or higher**, and works in practically all modern operating systems. While primarily designed for *GNU/Linux*, it works on *Windows*, *Mac OS X* and other Unix-based operating systems as well. + +As shown in the screenshots, there are multiple User Interface types (multiple ways to use the program). + +- **Gtk3-based interface**, uses [PyGI (Python Gobject Introspection)](http://pygobject.readthedocs.io/en/latest/getting_started.html) You can install it on: + + - Debian/Ubuntu: `apt install python3-gi python3-gi-cairo gir1.2-gtk-3.0` + - openSUSE: `zypper install python3-gobject gtk3` + - Fedora: `dnf install pygobject3 python3-gobject gtk3` + - ArchLinux: + - `pacman -S python-gobject gtk3` + - https://aur.archlinux.org/packages/pyglossary/ + - Mac OS X: `brew install pygobject3 gtk+3` + - Nix / NixOS: `nix-shell -p pkgs.gobject-introspection python38Packages.pygobject3 python38Packages.pycairo` + +- **Tkinter-based interface**, works in the lack of Gtk. Specially on Windows where Tkinter library is installed with the Python itself. 
You can also install it on: + + - Debian/Ubuntu: `apt-get install python3-tk tix` + - openSUSE: `zypper install python3-tk tix` + - Fedora: `yum install python3-tkinter tix` + - Mac OS X: read https://www.python.org/download/mac/tcltk/ + - Nix / NixOS: `nix-shell -p python38Packages.tkinter tix` + +- **Command-line interface**, works in all operating systems without any specific requirements, just type: + +`python3 main.py --help` + +- **Interactive command-line interface** + - Requires: `pip install prompt_toolkit` + - Perfect for mobile devices (like Termux on Android) where no GUI is available + - Automatically selected if output file argument is not passed **and** one of these: + - On Linux and `$DISPLAY` environment variable is empty or not set + - For example when you are using a remote Linux machine over SSH + - On Mac and no `tkinter` module is found + - Manually select with `--cmd` or `--ui=cmd` + - Minimally: `python3 main.py --cmd` + - You can still pass input file, or any flag/option + - If both input and output files are passed, non-interactive cmd ui will be default + - If you are writing a script, you can pass `--no-interactive` to force disable interactive ui + - Then you have to pass both input and output file arguments + - Don't forget to use *Up/Down* or *Tab* keys in prompts! + - Up/Down key shows you recent values you have used + - Tab key shows available values/options + - You can press Control+C (on Linux/Windows) at any prompt to exit + +UI (User Interface) selection +----------------------------- + +When you run PyGlossary without any command-line arguments or options/flags, PyGlossary tries to find PyGI and open the Gtk3-based interface. If it fails, it tries to find Tkinter and open the Tkinter-based interface. If that fails, it tries to find `prompt_toolkit` and run interactive command-line interface. And if none of these libraries are found, it exits with an error. 
But you can explicitly determine the user interface type using `--ui` -- `python3 main.py --ui=gtk` -- `python3 main.py --ui=tk` -- `python3 main.py --ui=cmd` +- `python3 main.py --ui=gtk` +- `python3 main.py --ui=tk` +- `python3 main.py --ui=cmd` -## Installation on Windows +Installation on Windows +----------------------- -- [Download and install Python](https://www.python.org/downloads/windows/) (3.10 or above) -- Open Start -> type Command -> right-click on Command Prompt -> Run as administrator -- To ensure you have `pip`, run: `python -m ensurepip --upgrade` -- To install, run: `pip install --upgrade pyglossary` -- Now you should be able to run `pyglossary` command -- If command was not found, make sure Python environment variables are set up: - +- [Download and install Python](https://www.python.org/downloads/windows/) (3.10 or above) +- Open Start -> type Command -> right-click on Command Prompt -> Run as administrator +- To ensure you have `pip`, run: `python -m ensurepip --upgrade` +- To install, run: `pip install --upgrade pyglossary` +- Now you should be able to run `pyglossary` command +- If command was not found, make sure Python environment variables are set up: -## Feature-specific requirements +Feature-specific requirements +----------------------------- -- Using [Sort by Locale](#sorting) feature requires [PyICU](./doc/pyicu.md) +- Using [Sort by Locale](#sorting) feature requires [PyICU](./doc/pyicu.md) -- Using `--remove-html-all` flag requires: +- Using `--remove-html-all` flag requires: - `pip install lxml beautifulsoup4` +`pip install lxml beautifulsoup4` -Some formats have additional requirements. -If you have trouble with any format, please check the [link given for that format](#supported-formats) to see its documentations. +Some formats have additional requirements. If you have trouble with any format, please check the [link given for that format](#supported-formats) to see its documentations. 
**Using Termux on Android?** See [doc/termux.md](./doc/termux.md) -## Configuration +Configuration +------------- See [doc/config.rst](./doc/config.rst). -## Direct and indirect modes +Direct and indirect modes +------------------------- -Indirect mode means the input glossary is completely read and loaded into RAM, then converted -into the output format. This was the only method available in old versions (before [3.0.0](https://github.com/ilius/pyglossary/releases/tag/3.0.0)). +Indirect mode means the input glossary is completely read and loaded into RAM, then converted into the output format. This was the only method available in old versions (before [3.0.0](https://github.com/ilius/pyglossary/releases/tag/3.0.0)). Direct mode means entries are one-at-a-time read, processed and written into output glossary. -Direct mode was added to limit the memory usage for large glossaries; But it may reduce the -conversion time for most cases as well. +Direct mode was added to limit the memory usage for large glossaries; But it may reduce the conversion time for most cases as well. Converting glossaries into these formats requires [sorting](#sorting) entries: -- [StarDict](./doc/p/stardict.md) -- [EPUB-2](./doc/p/epub2.md) -- [Mobipocket E-Book](./doc/p/mobi.md) +- [StarDict](./doc/p/stardict.md) +- [EPUB-2](./doc/p/epub2.md) +- [Mobipocket E-Book](./doc/p/mobi.md) -That's why direct mode will not work for these formats, and PyGlossary has to -switch to indirect mode (or it previously had to, see [SQLite mode](#sqlite-mode)). +That's why direct mode will not work for these formats, and PyGlossary has to switch to indirect mode (or it previously had to, see [SQLite mode](#sqlite-mode)). For other formats, direct mode will be the default. You may override this by `--indirect` flag. -## SQLite mode +SQLite mode +----------- -As mentioned above, converting glossaries to some specific formats will -need them to loaded into RAM. 
+As mentioned above, converting glossaries to some specific formats will need them to loaded into RAM. -This can be problematic if the glossary is too big to fit into RAM. That's when -you should try adding `--sqlite` flag to your command. Then it uses SQLite3 as intermediate -storage for storing, sorting and then fetching entries. This fixes the memory issue, and may -even reduce running time of conversion (depending on your home directory storage). +This can be problematic if the glossary is too big to fit into RAM. That's when you should try adding `--sqlite` flag to your command. Then it uses SQLite3 as intermediate storage for storing, sorting and then fetching entries. This fixes the memory issue, and may even reduce running time of conversion (depending on your home directory storage). -The temporary SQLite file is stored in [cache directory](#cache-directory) then -deleted after conversion (unless you pass `--no-cleanup` flag). +The temporary SQLite file is stored in [cache directory](#cache-directory) then deleted after conversion (unless you pass `--no-cleanup` flag). -SQLite mode is automatically enabled for writing these formats if `auto_sqlite` -[config parameter](./doc/config.rst) is `true` (which is the default). -This also applies to when you pass `--sort` flag for any format. -You may use `--no-sqlite` to override this and switch to indirect mode. +SQLite mode is automatically enabled for writing these formats if `auto_sqlite` [config parameter](./doc/config.rst) is `true` (which is the default). This also applies to when you pass `--sort` flag for any format. You may use `--no-sqlite` to override this and switch to indirect mode. Currently you can not disable alternates in SQLite mode (`--no-alts` is ignored). -## Sorting +Sorting +------- There are two things than can activate sorting entries: -- Output format requires sorting (as explained [above](#direct-and-indirect-modes)) -- You pass `--sort` flag in command line. 
+- Output format requires sorting (as explained [above](#direct-and-indirect-modes)\) +- You pass `--sort` flag in command line. In the case of passing `--sort`, you can also pass: -- `--sort-key` to select sort key aka sorting order (including locale), see [doc/sort-key.md](./doc/sort-key.md) +- `--sort-key` to select sort key aka sorting order (including locale), see [doc/sort-key.md](./doc/sort-key.md) -- `--sort-encoding` to change the encoding used for sort +- `--sort-encoding` to change the encoding used for sort - - UTF-8 is the default encoding for all sort keys and all output formats (unless mentioned otherwise) - - This will only effect the order of entries, and will not corrupt words / definition - - Non-encodable characters are replaced with `?` byte (*only for sorting*) + - UTF-8 is the default encoding for all sort keys and all output formats (unless mentioned otherwise) + - This will only effect the order of entries, and will not corrupt words / definition + - Non-encodable characters are replaced with `?` byte (*only for sorting*\) -## Cache directory +Cache directory +--------------- -Cache directory is used for storing temporary files which are either moved or deleted -after conversion. You can pass `--no-cleanup` flag in order to keep them. +Cache directory is used for storing temporary files which are either moved or deleted after conversion. You can pass `--no-cleanup` flag in order to keep them. 
The path for cache directory: -- Linux or BSD: `~/.cache/pyglossary/` -- Mac: `~/Library/Caches/PyGlossary/` -- Windows: `C:\Users\USERNAME\AppData\Local\PyGlossary\Cache\` +- Linux or BSD: `~/.cache/pyglossary/` +- Mac: `~/Library/Caches/PyGlossary/` +- Windows: `C:\Users\USERNAME\AppData\Local\PyGlossary\Cache\` -## User plugins +User plugins +------------ -If you want to add your own plugin without adding it to source code directory, -or you want to use a plugin that has been removed from repository, -you can place it in this directory: +If you want to add your own plugin without adding it to source code directory, or you want to use a plugin that has been removed from repository, you can place it in this directory: -- Linux or BSD: `~/.pyglossary/plugins/` -- Mac: `~/Library/Preferences/PyGlossary/plugins/` -- Windows: `C:\Users\USERNAME\AppData\Roaming\PyGlossary\plugins\` +- Linux or BSD: `~/.pyglossary/plugins/` +- Mac: `~/Library/Preferences/PyGlossary/plugins/` +- Windows: `C:\Users\USERNAME\AppData\Roaming\PyGlossary\plugins\` -## Using PyGlossary as a Python library +Using PyGlossary as a Python library +------------------------------------ There are a few examples in [doc/lib-examples](./doc/lib-examples) directory. @@ -351,15 +333,16 @@ with open(os.path.join(imageDir, "a.jpeg")) as fp: The first argument to `newDataEntry` must be the relative path (that generally html codes of your definitions points to). -## Internal glossary structure +Internal glossary structure +--------------------------- A glossary contains a number of entries. 
Each entry contains: -- Headword (title or main phrase for lookup) -- Alternates (some alternative phrases for lookup) -- Definition +- Headword (title or main phrase for lookup) +- Alternates (some alternative phrases for lookup) +- Definition In PyGlossary, headword and alternates together are accessible as a single Python list `entry.l_word` @@ -369,19 +352,17 @@ In PyGlossary, headword and alternates together are accessible as a single Pytho There is another type of entry which is called **Data Entry**, and generally contains an image, audio, css, or any other file that was included in input glossary. For data entries: -- `entry.s_word` is file name (and `l_word` is still a list containing this string), -- `entry.defiFormat` is `b` -- `entry.data` gives the content of file in `bytes`. +- `entry.s_word` is file name (and `l_word` is still a list containing this string), +- `entry.defiFormat` is `b` +- `entry.data` gives the content of file in `bytes`. -## Entry filters +Entry filters +------------- -Entry filters are internal objects that modify words/definition of entries, -or remove entries (in some special cases). +Entry filters are internal objects that modify words/definition of entries, or remove entries (in some special cases). -Like several filters in a pipe which connects a `reader` object to a `writer` object -(with both of their classes defined in plugins and instantiated in `Glossary` class). +Like several filters in a pipe which connects a `reader` object to a `writer` object (with both of their classes defined in plugins and instantiated in `Glossary` class). -You can enable/disable some of these filters using config parameters / command like flags, which -are documented in [doc/config.rst](./doc/config.rst). +You can enable/disable some of these filters using config parameters / command like flags, which are documented in [doc/config.rst](./doc/config.rst). 
The full list of entry filters is also documented in [doc/entry-filters.md](./doc/entry-filters.md). diff --git a/doc/apple.md b/doc/apple.md index 8c85ca345..49fb163ad 100644 --- a/doc/apple.md +++ b/doc/apple.md @@ -1,28 +1,23 @@ ### Required Python libraries for AppleDict -- **Reading from AppleDict Binary (.dictionary)** +- **Reading from AppleDict Binary (.dictionary)** - `sudo pip3 install lxml` +`sudo pip3 install lxml` -- **Writing to AppleDict** +- **Writing to AppleDict** - `sudo pip3 install lxml beautifulsoup4 html5lib` +`sudo pip3 install lxml beautifulsoup4 html5lib` ### Requirements for AppleDict on Mac OS X -If you want to convert glossaries into AppleDict format on Mac OS X, -you also need: +If you want to convert glossaries into AppleDict format on Mac OS X, you also need: -- GNU make as part of [Command Line Tools for - Xcode](http://developer.apple.com/downloads). -- Dictionary Development Kit as part of [Additional Tools for - Xcode](http://developer.apple.com/downloads). Extract to - `/Applications/Utilities/Dictionary Development Kit` +- GNU make as part of [Command Line Tools for Xcode](http://developer.apple.com/downloads). +- Dictionary Development Kit as part of [Additional Tools for Xcode](http://developer.apple.com/downloads). Extract to`/Applications/Utilities/Dictionary Development Kit` ### Convert Babylon (bgl) to Mac OS X dictionary -Let's assume the Babylon dict is at -`~/Documents/Duden_Synonym/Duden_Synonym.BGL`: +Let's assume the Babylon dict is at `~/Documents/Duden_Synonym/Duden_Synonym.BGL`: ```sh cd ~/Documents/Duden_Synonym/ @@ -36,8 +31,7 @@ Launch Dictionary.app and test. ### Convert Octopus Mdict to Mac OS X dictionary -Let's assume the MDict dict is at -`~/Documents/Duden-Oxford/Duden-Oxford DEED ver.20110408.mdx`. +Let's assume the MDict dict is at `~/Documents/Duden-Oxford/Duden-Oxford DEED ver.20110408.mdx`. Run the following command: @@ -51,8 +45,7 @@ make install Launch Dictionary.app and test. 
-Let's assume the MDict dict is at `~/Downloads/oald8/oald8.mdx`, along -with the image/audio resources file `oald8.mdd`. +Let's assume the MDict dict is at `~/Downloads/oald8/oald8.mdx`, along with the image/audio resources file `oald8.mdd`. Run the following commands: : @@ -62,23 +55,20 @@ python3 ~/Software/pyglossary/main.py --write-format=AppleDict oald8.mdx oald8-a cd oald8-apple ``` -This extracts dictionary into `oald8.xml` and data resources into folder -`OtherResources`. Hyperlinks use relative path. : +This extracts dictionary into `oald8.xml` and data resources into folder `OtherResources`. Hyperlinks use relative path. : ```sh sed -i "" 's:src="/:src=":g' oald8.xml ``` -Convert audio file from SPX format to WAV format. You need package -`speex` from [MacPorts](https://www.macports.org) : +Convert audio file from SPX format to WAV format. You need package `speex` from [MacPorts](https://www.macports.org) : ```sh find OtherResources -name "*.spx" -execdir sh -c 'spx={};speexdec $spx ${spx%.*}.wav' \; sed -i "" 's|sound://\([/_a-zA-Z0-9]*\).spx|\1.wav|g' oald8.xml ``` -But be warned that the decoded WAVE audio can consume ~5 times more disk -space! +But be warned that the decoded WAVE audio can consume ~5 times more disk space! Compile and install. : diff --git a/doc/babylon/bgl_info.md b/doc/babylon/bgl_info.md index bdefc9cff..4b3209a2c 100644 --- a/doc/babylon/bgl_info.md +++ b/doc/babylon/bgl_info.md @@ -1,16 +1,15 @@ -## bgl_numEntries (0x0c) +bgl_numEntries (0x0c) +--------------------- -`bgl_numEntries` does not always matches the number of entries in the dictionary, but it's close to it. -The difference is usually +-1 or 2, in rare cases may be 9, 29 and more. +`bgl_numEntries` does not always matches the number of entries in the dictionary, but it's close to it. The difference is usually +-1 or 2, in rare cases may be 9, 29 and more. -## bgl_length (0x43) +bgl_length (0x43) +----------------- -The length of the substring match in a term. 
-For example, if your glossary contains the term "Dog" and the substring length is 2, -search of the substrings "Do" or "og" will retrieve the term dog. -Use substring length 0 for exact match. +The length of the substring match in a term. For example, if your glossary contains the term "Dog" and the substring length is 2, search of the substrings "Do" or "og" will retrieve the term dog. Use substring length 0 for exact match. -## bgl_contractions (0x3b) +bgl_contractions (0x3b) +----------------------- Contains a value like this: @@ -20,10 +19,10 @@ V-0#Verb|V-0.0#|V-0.1#Infinitive|V-0.1.1#|V-1.0#|V-1.1#|V-1.1.1#Present Simple|V Value format: `( "#" [] "|")+` -The value is in second language, that is for `Babylon Russian-English.BGL` the value in russian. -For `Babylon English-Spanish.BGL` the value is spanish (I guess), etc. +The value is in second language, that is for `Babylon Russian-English.BGL` the value in russian. For `Babylon English-Spanish.BGL` the value is spanish (I guess), etc. -## bgl_about: Glossary manual file (0x41) +bgl_about: Glossary manual file (0x41) +-------------------------------------- Additional information about the dictionary in `.txt` format this may be short info like this: @@ -38,15 +37,14 @@ English biological articles to fluent Farsi Copyright (c) 2009 All rights reserved. ``` -In `.pdf` format this may be a quite large document (about 30 pages), -an introduction into the dictionary. It describing structure of an article, -editors, how to use the dictionary. +In `.pdf` format this may be a quite large document (about 30 pages), an introduction into the dictionary. It describing structure of an article, editors, how to use the dictionary. Format: ` "\x00" ` File extension may be: ".txt", ".pdf" -## bgl_purchaseLicenseMsg (0x2c) +bgl_purchaseLicenseMsg (0x2c) +----------------------------- Contains a value like this: @@ -55,7 +53,8 @@ In order to view this glossary, you must purchase a license.
Click here to purchase. ``` -## bgl_licenseExpiredMsg (0x2d) +bgl_licenseExpiredMsg (0x2d) +---------------------------- Contains a value like this: @@ -65,7 +64,8 @@ In order to view this glossary, you must have a valid license.
Renew your license today. ``` -## bgl_purchaseAddress (0x2e) +bgl_purchaseAddress (0x2e) +-------------------------- Contains a value like this: diff --git a/doc/entry-filters.md b/doc/entry-filters.md index eee7b2d1f..93efc7c1a 100644 --- a/doc/entry-filters.md +++ b/doc/entry-filters.md @@ -1,7 +1,8 @@ -## Entry Filters +Entry Filters +------------- | Name | Default Enabled | Command Flags | Description | -| ---------------------------- | --------------- | ------------------------------------ | --------------------------------------------------------------------------- | +|------------------------------|-----------------|--------------------------------------|-----------------------------------------------------------------------------| | `trim_whitespaces` | Yes | | Remove leading/trailing whitespaces from word(s) and definition | | `non_empty_word` | Yes | | Skip entries with empty word | | `skip_resources` | No | `--skip-resources` | Skip resources / data files | diff --git a/doc/lzo.md b/doc/lzo.md index 8643801ea..74aecc4ee 100644 --- a/doc/lzo.md +++ b/doc/lzo.md @@ -1,18 +1,19 @@ -## Install `python-lzo` +Install `python-lzo` +-------------------- -- **On Linux** +- **On Linux** - - Make sure `liblzo2-dev` or `liblzo2-devel` is installed. - - Run `sudo pip3 install python-lzo` + - Make sure `liblzo2-dev` or `liblzo2-devel` is installed. 
+ - Run `sudo pip3 install python-lzo` -- **On Android with Termux** +- **On Android with Termux** - - `apt install liblzo` - - `pip install python-lzo` + - `apt install liblzo` + - `pip install python-lzo` -- **On Windows**: +- **On Windows**: - - Open this page: https://www.lfd.uci.edu/~gohlke/pythonlibs/#python-lzo - - If you are using Python 3.7 (32 bit) for example, click on `python_lzo‑1.12‑cp37‑cp37m‑win32.whl` - - Open Start -> type Command -> right-click on Command Prompt -> Run as administrator - - Run `pip install C:\....\python_lzo‑1.12‑cp37‑cp37m‑win32.whl` command, giving the path of downloaded file + - Open this page: https://www.lfd.uci.edu/~gohlke/pythonlibs/#python-lzo + - If you are using Python 3.7 (32 bit) for example, click on `python_lzo‑1.12‑cp37‑cp37m‑win32.whl` + - Open Start -> type Command -> right-click on Command Prompt -> Run as administrator + - Run `pip install C:\....\python_lzo‑1.12‑cp37‑cp37m‑win32.whl` command, giving the path of downloaded file diff --git a/doc/octopus_mdict/README.md b/doc/octopus_mdict/README.md index 4639ca3dd..9beeafdd8 100644 --- a/doc/octopus_mdict/README.md +++ b/doc/octopus_mdict/README.md @@ -1,22 +1,27 @@ -# An Analysis of MDX/MDD File Format +An Analysis of MDX/MDD File Format +================================== > MDict is a multi-platform open dictionary -which are both questionable. It is not available for every platform, e.g. OS X, Linux. Its dictionary file format is not open. But this has not hindered its popularity, and many dictionaries have been created for it. +which are both questionable. It is not available for every platform, e.g. OS X, Linux. Its dictionary file format is not open. But this has not hindered its popularity, and many dictionaries have been created for it. This is an attempt to reveal MDX/MDD file format, so that my favarite dictionaries, created by MDict users, could be used elsewhere. -# MDict Files +MDict Files +=========== MDict stores the dictionary definitions, i.e. 
(key word, explanation) in MDX file and the dictionary reference data, e.g. images, pronunciations, stylesheets in MDD file. Although holding different contents, these two file formats share the same structure. -# MDX and MDD File Formats +MDX and MDD File Formats +======================== See [MDX.svgz](./MDX.svgz) and [MDD.svgz](./MDD.svgz) -# Example Programs +Example Programs +================ -## readmdict.py +readmdict.py +------------ readmdict.py is an example implementation in Python. This program can read/extract mdx/mdd files. @@ -51,8 +56,7 @@ Out[4]: '.........') ``` -`mdx` is an object having all info from a MDX file. `items` is an iterator producing 2-item tuples. -Of each tuple, the first element is the entry text and the second is the explanation. Both are UTF-8 encoded strings. +`mdx` is an object having all info from a MDX file. `items` is an iterator producing 2-item tuples. Of each tuple, the first element is the entry text and the second is the explanation. Both are UTF-8 encoded strings. Read MDD file and print the first entry:: @@ -67,6 +71,4 @@ Out[7]: '\xff\xd8\xff\xe0\x00\x10JFIF...........') ``` -`mdd` is an object having all info from a MDD file. `items` is an iterator producing 2-item tuples. -Of each tuple, the first element is the file name and the second element is the corresponding file content. -The file name is encoded in UTF-8. The file content is a plain bytes array. +`mdd` is an object having all info from a MDD file. `items` is an iterator producing 2-item tuples. Of each tuple, the first element is the file name and the second element is the corresponding file content. The file name is encoded in UTF-8. The file content is a plain bytes array. 
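A minimal sketch of consuming the two `items` iterators described in the README text above. It assumes that under Python 3 each tuple element arrives as `bytes`; the helper names `decode_mdx_items` and `decode_mdd_items` and the sample tuples are hypothetical, made up for illustration, and are not part of `readmdict.py`.

```python
from typing import Iterable, Iterator, Tuple


def decode_mdx_items(
    items: Iterable[Tuple[bytes, bytes]],
) -> Iterator[Tuple[str, str]]:
    """Yield (entry, explanation) pairs as str, decoding both from UTF-8."""
    for key, value in items:
        yield key.decode("utf-8"), value.decode("utf-8")


def decode_mdd_items(
    items: Iterable[Tuple[bytes, bytes]],
) -> Iterator[Tuple[str, bytes]]:
    """Yield (file name, content) pairs; only the file name is text."""
    for name, content in items:
        # The content is raw resource data (image, audio, CSS), so it stays bytes.
        yield name.decode("utf-8"), content


# Hypothetical stand-in data shaped like the tuples the README describes.
sample_mdx = [(b"dog", b"<b>dog</b> a domestic animal")]
sample_mdd = [(b"\\img\\dog.jpg", b"\xff\xd8\xff\xe0\x00\x10JFIF")]

for entry, explanation in decode_mdx_items(sample_mdx):
    print(entry, "=>", explanation)

for name, content in decode_mdd_items(sample_mdd):
    print(name, len(content), "bytes")
```

In real use the sample lists would be replaced by the `items` iterators obtained from the `mdx` and `mdd` objects. Keeping the MDD content as `bytes` avoids corrupting binary resources by decoding them as text.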
diff --git a/doc/p/__index__.md b/doc/p/__index__.md index a8f3e037b..684bfed98 100644 --- a/doc/p/__index__.md +++ b/doc/p/__index__.md @@ -1,5 +1,5 @@ | Description | Name | Doc Link | -| ------------------------------ | --------------- | ---------------------------------------------- | +|--------------------------------|-----------------|------------------------------------------------| | Aard 2 (.slob) | Aard2Slob | [aard2_slob.md](./aard2_slob.md) | | ABC Medical Notes (SQLite3) | ABCMedicalNotes | [abc_medical_notes.md](./abc_medical_notes.md) | | Almaany.com (SQLite3) | Almaany | [almaany.md](./almaany.md) | diff --git a/doc/p/aard2_slob.md b/doc/p/aard2_slob.md index df0b67f8e..8023608b0 100644 --- a/doc/p/aard2_slob.md +++ b/doc/p/aard2_slob.md @@ -1,9 +1,10 @@ -## Aard 2 (.slob) +Aard 2 (.slob) +-------------- ### General Information | Attribute | Value | -| --------------- | -------------------------------------------------------- | +|-----------------|----------------------------------------------------------| | Name | Aard2Slob | | snake_case_name | aard2_slob | | Description | Aard 2 (.slob) | @@ -11,16 +12,16 @@ | Read support | Yes | | Write support | Yes | | Single-file | Yes | -| Kind | πŸ”’ binary | +| Kind | πŸ”’ binary | | Sort-on-write | default_no | -| Sort key | (`headword_lower`) | +| Sort key | \(`headword_lower`\) | | Wiki | [@itkach/slob/wiki](https://github.com/itkach/slob/wiki) | | Website | [aarddict.org](http://aarddict.org/) | ### Write options | Name | Default | Type | Comment | -| ---------------------------------- | ------- | ---- | --------------------------------------------------------------- | +|------------------------------------|---------|------|-----------------------------------------------------------------| | compression | `zlib` | str | Compression Algorithm | | content_type | | str | Content Type | | file_size_approx | `0` | int | split up by given approximate file size
examples: 100m, 1g | @@ -47,6 +48,6 @@ See [doc/pyicu.md](../pyicu.md) file for more detailed instructions on how to in ### Dictionary Applications/Tools | Name & Website | Source code | License | Platforms | Language | -| ------------------------------------------ | ---------------------------------------------------------------- | ------- | --------- | -------- | +|--------------------------------------------|------------------------------------------------------------------|---------|-----------|----------| | [Aard 2 for Android](http://aarddict.org/) | [@itkach/aard2-android](https://github.com/itkach/aard2-android) | GPL | Android | Java | | [Aard2 for Web](http://aarddict.org/) | [@itkach/aard2-web](https://github.com/itkach/aard2-web) | MPL | Web | Java | diff --git a/doc/p/abc_medical_notes.md b/doc/p/abc_medical_notes.md index 99d720d51..41e41ebfc 100644 --- a/doc/p/abc_medical_notes.md +++ b/doc/p/abc_medical_notes.md @@ -1,9 +1,10 @@ -## ABC Medical Notes (SQLite3) +ABC Medical Notes (SQLite3) +--------------------------- ### General Information | Attribute | Value | -| --------------- | ---------------------------------------------------------------------------------------------------------------------- | +|-----------------|------------------------------------------------------------------------------------------------------------------------| | Name | ABCMedicalNotes | | snake_case_name | abc_medical_notes | | Description | ABC Medical Notes (SQLite3) | @@ -11,14 +12,14 @@ | Read support | Yes | | Write support | No | | Single-file | Yes | -| Kind | πŸ”’ binary | +| Kind | πŸ”’ binary | | Sort-on-write | default_no | -| Sort key | (`headword_lower`) | +| Sort key | \(`headword_lower`\) | | Wiki | ― | | Website | [ABC Medical Notes 2021 - Google Play](https://play.google.com/store/apps/details?id=com.pocketmednotes2014.secondapp) | ### Dictionary Applications/Tools | Name & Website | Source code | License | Platforms | Language | -| 
-------------------------------------------------------------------------------------------------------- | ----------- | ------- | --------- | -------- | +|----------------------------------------------------------------------------------------------------------|-------------|---------|-----------|----------| | [ABC Medical Notes 2020](https://play.google.com/store/apps/details?id=com.pocketmednotes2014.secondapp) | ― | Unknown | Android | | diff --git a/doc/p/almaany.md b/doc/p/almaany.md index 1d38aa6b0..922169e68 100644 --- a/doc/p/almaany.md +++ b/doc/p/almaany.md @@ -1,9 +1,10 @@ -## Almaany.com (SQLite3) +Almaany.com (SQLite3) +--------------------- ### General Information | Attribute | Value | -| --------------- | ------------------------------------------------------------------------------------------------------------- | +|-----------------|---------------------------------------------------------------------------------------------------------------| | Name | Almaany | | snake_case_name | almaany | | Description | Almaany.com (SQLite3) | @@ -11,14 +12,14 @@ | Read support | Yes | | Write support | No | | Single-file | Yes | -| Kind | πŸ”’ binary | +| Kind | πŸ”’ binary | | Sort-on-write | default_no | -| Sort key | (`headword_lower`) | +| Sort key | \(`headword_lower`\) | | Wiki | ― | | Website | [Almaany.com Arabic Dictionary - Google Play](https://play.google.com/store/apps/details?id=com.almaany.arar) | ### Dictionary Applications/Tools | Name & Website | Source code | License | Platforms | Language | -| ----------------------------------------------------------------------------------------------- | ----------- | ------- | --------- | -------- | +|-------------------------------------------------------------------------------------------------|-------------|---------|-----------|----------| | [Almaany.com Arabic Dictionary](https://play.google.com/store/apps/details?id=com.almaany.arar) | ― | Unknown | Android | | diff --git a/doc/p/appledict.md 
b/doc/p/appledict.md index 74d907675..f79dd2b30 100644 --- a/doc/p/appledict.md +++ b/doc/p/appledict.md @@ -1,9 +1,10 @@ -## AppleDict Source +AppleDict Source +---------------- ### General Information | Attribute | Value | -| --------------- | --------------------------------------------------------------------------------------------- | +|-----------------|-----------------------------------------------------------------------------------------------| | Name | AppleDict | | snake_case_name | appledict | | Description | AppleDict Source | @@ -11,16 +12,16 @@ | Read support | No | | Write support | Yes | | Single-file | No | -| Kind | πŸ“ directory | +| Kind | πŸ“ directory | | Sort-on-write | default_no | -| Sort key | (`headword_lower`) | +| Sort key | \(`headword_lower`\) | | Wiki | ― | | Website | [Dictionary User Guide for Mac](https://support.apple.com/en-gu/guide/dictionary/welcome/mac) | ### Write options | Name | Default | Type | Comment | -| ----------------- | ------- | ---- | ---------------------------------------- | +|-------------------|---------|------|------------------------------------------| | clean_html | `True` | bool | use BeautifulSoup parser | | css | | str | custom .css file path | | xsl | | str | custom XSL transformations file path | @@ -47,5 +48,5 @@ See [doc/apple.md](../apple.md) for additional AppleDict instructions. 
### Dictionary Applications/Tools | Name & Website | Source code | License | Platforms | Language | -| ------------------------------------------------------------------------------------------- | ----------- | ------- | --------- | -------- | +|---------------------------------------------------------------------------------------------|-------------|---------|-----------|----------| | [Dictionary Development Kit](https://github.com/SebastianSzturo/Dictionary-Development-Kit) | ― | Unknown | Mac | | diff --git a/doc/p/appledict_bin.md b/doc/p/appledict_bin.md index 69396cd80..75f3c01bd 100644 --- a/doc/p/appledict_bin.md +++ b/doc/p/appledict_bin.md @@ -1,9 +1,10 @@ -## AppleDict Binary +AppleDict Binary +---------------- ### General Information | Attribute | Value | -| --------------- | --------------------------------------------------------------------------------------------- | +|-----------------|-----------------------------------------------------------------------------------------------| | Name | AppleDictBin | | snake_case_name | appledict_bin | | Description | AppleDict Binary | @@ -11,16 +12,16 @@ | Read support | Yes | | Write support | No | | Single-file | Yes | -| Kind | πŸ”’ binary | +| Kind | πŸ”’ binary | | Sort-on-write | default_no | -| Sort key | (`headword_lower`) | +| Sort key | \(`headword_lower`\) | | Wiki | ― | | Website | [Dictionary User Guide for Mac](https://support.apple.com/en-gu/guide/dictionary/welcome/mac) | ### Read options | Name | Default | Type | Comment | -| --------- | ------- | ---- | --------------------------------------------------- | +|-----------|---------|------|-----------------------------------------------------| | html | `True` | bool | Entries are HTML | | html_full | `True` | bool | Turn every entry's definition into an HTML document | @@ -37,5 +38,5 @@ pip3 install lxml biplist ### Dictionary Applications/Tools | Name & Website | Source code | License | Platforms | Language | -| 
-------------------------------------------------------------------------------- | ----------- | ----------- | --------- | -------- | +|----------------------------------------------------------------------------------|-------------|-------------|-----------|----------| | [Apple Dictionary](https://support.apple.com/en-gu/guide/dictionary/welcome/mac) | ― | Proprietary | Mac | | diff --git a/doc/p/ayandict_sqlite.md b/doc/p/ayandict_sqlite.md index 7e8d64752..e33a0fc2a 100644 --- a/doc/p/ayandict_sqlite.md +++ b/doc/p/ayandict_sqlite.md @@ -1,9 +1,10 @@ -## AyanDict SQLite +AyanDict SQLite +--------------- ### General Information | Attribute | Value | -| --------------- | --------------------------------------------------- | +|-----------------|-----------------------------------------------------| | Name | AyanDictSQLite | | snake_case_name | ayandict_sqlite | | Description | AyanDict SQLite | @@ -11,14 +12,14 @@ | Read support | Yes | | Write support | Yes | | Single-file | Yes | -| Kind | πŸ”’ binary | +| Kind | πŸ”’ binary | | Sort-on-write | default_no | -| Sort key | (`headword_lower`) | +| Sort key | \(`headword_lower`\) | | Wiki | ― | | Website | [ilius/ayandict](https://github.com/ilius/ayandict) | ### Write options | Name | Default | Type | Comment | -| ----- | ------- | ---- | ------------------------ | +|-------|---------|------|--------------------------| | fuzzy | `True` | bool | Create fuzzy search data | diff --git a/doc/p/babylon_bgl.md b/doc/p/babylon_bgl.md index f9df7be22..20a2544f0 100644 --- a/doc/p/babylon_bgl.md +++ b/doc/p/babylon_bgl.md @@ -1,26 +1,27 @@ -## Babylon (.BGL) +Babylon (.BGL) +-------------- ### General Information -| Attribute | Value | -| --------------- | ------------------ | -| Name | BabylonBgl | -| snake_case_name | babylon_bgl | -| Description | Babylon (.BGL) | -| Extensions | `.bgl` | -| Read support | Yes | -| Write support | No | -| Single-file | Yes | -| Kind | πŸ”’ binary | -| Sort-on-write | default_no | -| Sort 
key | (`headword_lower`) | -| Wiki | ― | -| Website | ― | +| Attribute | Value | +|-----------------|----------------------| +| Name | BabylonBgl | +| snake_case_name | babylon_bgl | +| Description | Babylon (.BGL) | +| Extensions | `.bgl` | +| Read support | Yes | +| Write support | No | +| Single-file | Yes | +| Kind | πŸ”’ binary | +| Sort-on-write | default_no | +| Sort key | \(`headword_lower`\) | +| Wiki | ― | +| Website | ― | ### Read options | Name | Default | Type | Comment | -| --------------------------- | -------- | ---- | ------------------------------------------- | +|-----------------------------|----------|------|---------------------------------------------| | default_encoding_overwrite | | str | Default encoding (overwrite) | | source_encoding_overwrite | | str | Source encoding (overwrite) | | target_encoding_overwrite | | str | Target encoding (overwrite) | @@ -33,7 +34,7 @@ ### Dictionary Applications/Tools | Name & Website | Source code | License | Platforms | Language | -| ------------------------------------------------------- | ----------- | ----------- | -------------- | -------- | +|---------------------------------------------------------|-------------|-------------|----------------|----------| | [Babylon Translator](https://www.babylon-software.com/) | ― | Freemium | Windows | | | [GoldenDict](http://goldendict.org/) | ― | GPL | Linux, Windows | | | [GoldenDict Mobile (Free)](http://goldendict.mobi/) | ― | Freeware | Android | | diff --git a/doc/p/cc_kedict.md b/doc/p/cc_kedict.md index 50d9ca8fa..ad2724541 100644 --- a/doc/p/cc_kedict.md +++ b/doc/p/cc_kedict.md @@ -1,9 +1,10 @@ -## cc-kedict +cc-kedict +--------- ### General Information | Attribute | Value | -| --------------- | -------------------------------------------------------------- | +|-----------------|----------------------------------------------------------------| | Name | cc-kedict | | snake_case_name | cc_kedict | | Description | cc-kedict | @@ -11,9 +12,9 @@ | Read 
support | Yes | | Write support | No | | Single-file | Yes | -| Kind | πŸ“ text | +| Kind | πŸ“ text | | Sort-on-write | default_no | -| Sort key | (`headword_lower`) | +| Sort key | \(`headword_lower`\) | | Wiki | ― | | Website | [@mhagiwara/cc-kedict](https://github.com/mhagiwara/cc-kedict) | diff --git a/doc/p/crawler_dir.md b/doc/p/crawler_dir.md index f55eac9b7..a7f9448c3 100644 --- a/doc/p/crawler_dir.md +++ b/doc/p/crawler_dir.md @@ -1,24 +1,25 @@ -## Crawler Directory +Crawler Directory +----------------- ### General Information -| Attribute | Value | -| --------------- | ------------------ | -| Name | CrawlerDir | -| snake_case_name | crawler_dir | -| Description | Crawler Directory | -| Extensions | `.crawler` | -| Read support | Yes | -| Write support | Yes | -| Single-file | Yes | -| Kind | πŸ“ directory | -| Sort-on-write | default_no | -| Sort key | (`headword_lower`) | -| Wiki | ― | -| Website | ― | +| Attribute | Value | +|-----------------|----------------------| +| Name | CrawlerDir | +| snake_case_name | crawler_dir | +| Description | Crawler Directory | +| Extensions | `.crawler` | +| Read support | Yes | +| Write support | Yes | +| Single-file | Yes | +| Kind | πŸ“ directory | +| Sort-on-write | default_no | +| Sort key | \(`headword_lower`\) | +| Wiki | ― | +| Website | ― | ### Write options | Name | Default | Type | Comment | -| ----------- | ------- | ---- | --------------------- | +|-------------|---------|------|-----------------------| | compression | | str | Compression Algorithm | diff --git a/doc/p/csv.md b/doc/p/csv.md index 4ea9b1992..bdeb4531a 100644 --- a/doc/p/csv.md +++ b/doc/p/csv.md @@ -1,9 +1,10 @@ -## CSV (.csv) +CSV (.csv) +---------- ### General Information | Attribute | Value | -| --------------- | ------------------------------------------------------------------------------ | +|-----------------|--------------------------------------------------------------------------------| | Name | Csv | | snake_case_name | csv | 
| Description | CSV (.csv) | @@ -11,16 +12,16 @@ | Read support | Yes | | Write support | Yes | | Single-file | Yes | -| Kind | πŸ“ text | +| Kind | πŸ“ text | | Sort-on-write | default_no | -| Sort key | (`headword_lower`) | +| Sort key | \(`headword_lower`\) | | Wiki | [Comma-separated values](https://en.wikipedia.org/wiki/Comma-separated_values) | | Website | ― | ### Read options | Name | Default | Type | Comment | -| --------- | ------- | ---- | ---------------- | +|-----------|---------|------|------------------| | encoding | `utf-8` | str | Encoding/charset | | newline | `\n` | str | Newline string | | delimiter | `,` | str | Column delimiter | @@ -28,7 +29,7 @@ ### Write options | Name | Default | Type | Comment | -| --------------- | ------- | ---- | ---------------------------------------------- | +|-----------------|---------|------|------------------------------------------------| | encoding | `utf-8` | str | Encoding/charset | | newline | `\n` | str | Newline string | | resources | `True` | bool | Enable resources / data files | @@ -40,6 +41,6 @@ ### Dictionary Applications/Tools | Name & Website | Source code | License | Platforms | Language | -| ---------------------------------------------------------------------- | ----------- | ----------- | ------------------- | -------- | +|------------------------------------------------------------------------|-------------|-------------|---------------------|----------| | [LibreOffice Calc](https://www.libreoffice.org/discover/calc/) | ― | MPL/GPL | Linux, Windows, Mac | | | [Microsoft Excel](https://www.microsoft.com/en-us/microsoft-365/excel) | ― | Proprietary | Windows | | diff --git a/doc/p/dicformids.md b/doc/p/dicformids.md index 608bda35d..d8a7ef266 100644 --- a/doc/p/dicformids.md +++ b/doc/p/dicformids.md @@ -1,9 +1,10 @@ -## DictionaryForMIDs +DictionaryForMIDs +----------------- ### General Information | Attribute | Value | -| --------------- | 
------------------------------------------------------------------------ | +|-----------------|--------------------------------------------------------------------------| | Name | Dicformids | | snake_case_name | dicformids | | Description | DictionaryForMIDs | @@ -11,7 +12,7 @@ | Read support | Yes | | Write support | Yes | | Single-file | No | -| Kind | πŸ“ directory | +| Kind | πŸ“ directory | | Sort-on-write | always | | Sort key | `dicformids` | | Wiki | ― | @@ -20,5 +21,5 @@ ### Dictionary Applications/Tools | Name & Website | Source code | License | Platforms | Language | -| ---------------------------------------------------------- | ----------- | ------- | --------------------------------- | -------- | +|------------------------------------------------------------|-------------|---------|-----------------------------------|----------| | [DictionaryForMIDs](http://dictionarymid.sourceforge.net/) | ― | GPL | Android, Web, Windows, Linux, Mac | Java | diff --git a/doc/p/dict_cc.md b/doc/p/dict_cc.md index 5059f4054..a53642a4c 100644 --- a/doc/p/dict_cc.md +++ b/doc/p/dict_cc.md @@ -1,9 +1,10 @@ -## Dict.cc (SQLite3) +Dict.cc (SQLite3) +----------------- ### General Information | Attribute | Value | -| --------------- | ------------------------------------------------------------------------------------------------ | +|-----------------|--------------------------------------------------------------------------------------------------| | Name | Dictcc | | snake_case_name | dict_cc | | Description | Dict.cc (SQLite3) | @@ -11,14 +12,14 @@ | Read support | Yes | | Write support | No | | Single-file | Yes | -| Kind | πŸ”’ binary | +| Kind | πŸ”’ binary | | Sort-on-write | default_no | -| Sort key | (`headword_lower`) | +| Sort key | \(`headword_lower`\) | | Wiki | [Dict.cc](https://en.wikipedia.org/wiki/Dict.cc) | | Website | [dict.cc dictionary - Google Play](https://play.google.com/store/apps/details?id=cc.dict.dictcc) | ### Dictionary Applications/Tools | 
Name & Website | Source code | License | Platforms | Language | -| ---------------------------------------------------------------------------------- | ----------- | ----------- | --------- | -------- | +|------------------------------------------------------------------------------------|-------------|-------------|-----------|----------| | [dict.cc dictionary](https://play.google.com/store/apps/details?id=cc.dict.dictcc) | ― | Proprietary | Android | | diff --git a/doc/p/dict_cc_split.md b/doc/p/dict_cc_split.md index 1567ee786..f2768fb20 100644 --- a/doc/p/dict_cc_split.md +++ b/doc/p/dict_cc_split.md @@ -1,9 +1,10 @@ -## Dict.cc (SQLite3) - Split +Dict.cc (SQLite3) - Split +------------------------- ### General Information | Attribute | Value | -| --------------- | ------------------------------------------------------------------------------------------------ | +|-----------------|--------------------------------------------------------------------------------------------------| | Name | Dictcc_split | | snake_case_name | dict_cc_split | | Description | Dict.cc (SQLite3) - Split | @@ -11,14 +12,14 @@ | Read support | Yes | | Write support | No | | Single-file | Yes | -| Kind | πŸ”’ binary | +| Kind | πŸ”’ binary | | Sort-on-write | default_no | -| Sort key | (`headword_lower`) | +| Sort key | \(`headword_lower`\) | | Wiki | [Dict.cc](https://en.wikipedia.org/wiki/Dict.cc) | | Website | [dict.cc dictionary - Google Play](https://play.google.com/store/apps/details?id=cc.dict.dictcc) | ### Dictionary Applications/Tools | Name & Website | Source code | License | Platforms | Language | -| ---------------------------------------------------------------------------------- | ----------- | ----------- | --------- | -------- | +|------------------------------------------------------------------------------------|-------------|-------------|-----------|----------| | [dict.cc dictionary](https://play.google.com/store/apps/details?id=cc.dict.dictcc) | ― | Proprietary | 
Android | | diff --git a/doc/p/dict_org.md b/doc/p/dict_org.md index e654dc3c4..e0d316b57 100644 --- a/doc/p/dict_org.md +++ b/doc/p/dict_org.md @@ -1,9 +1,10 @@ -## DICT.org file format (.index) +DICT.org file format (.index) +----------------------------- ### General Information | Attribute | Value | -| --------------- | ---------------------------------------------------------------------------- | +|-----------------|------------------------------------------------------------------------------| | Name | DictOrg | | snake_case_name | dict_org | | Description | DICT.org file format (.index) | @@ -11,23 +12,23 @@ | Read support | Yes | | Write support | Yes | | Single-file | No | -| Kind | πŸ“ directory | +| Kind | πŸ“ directory | | Sort-on-write | default_no | -| Sort key | (`headword_lower`) | +| Sort key | \(`headword_lower`\) | | Wiki | [DICT#DICT file format](https://en.wikipedia.org/wiki/DICT#DICT_file_format) | | Website | [The DICT Development Group](http://dict.org/bin/Dict) | ### Write options | Name | Default | Type | Comment | -| ------- | ------- | ---- | --------------------------------------- | +|---------|---------|------|-----------------------------------------| | dictzip | `False` | bool | Compress .dict file to .dict.dz | | install | `True` | bool | Install dictionary to /usr/share/dictd/ | ### Dictionary Applications/Tools | Name & Website | Source code | License | Platforms | Language | -| --------------------------------------------------------------- | ----------- | ------- | --------- | -------- | +|-----------------------------------------------------------------|-------------|---------|-----------|----------| | [Dictd](https://directory.fsf.org/wiki/Dictd) | ― | GPL | Linux | | | [GNOME Dictionary](https://wiki.gnome.org/Apps/Dictionary) | ― | GPL | Linux | | | [Xfce4 Dictionary](https://docs.xfce.org/apps/xfce4-dict/start) | ― | GPL | linux | | diff --git a/doc/p/dict_org_source.md b/doc/p/dict_org_source.md index e1f9935a4..9a075c07b 
100644 --- a/doc/p/dict_org_source.md +++ b/doc/p/dict_org_source.md @@ -1,9 +1,10 @@ -## DICT.org dictfmt source file +DICT.org dictfmt source file +---------------------------- ### General Information | Attribute | Value | -| --------------- | -------------------------------------------------- | +|-----------------|----------------------------------------------------| | Name | DictOrgSource | | snake_case_name | dict_org_source | | Description | DICT.org dictfmt source file | @@ -11,20 +12,20 @@ | Read support | No | | Write support | Yes | | Single-file | Yes | -| Kind | πŸ“ text | +| Kind | πŸ“ text | | Sort-on-write | default_no | -| Sort key | (`headword_lower`) | +| Sort key | \(`headword_lower`\) | | Wiki | [DICT](https://en.wikipedia.org/wiki/DICT) | | Website | [@cheusov/dictd](https://github.com/cheusov/dictd) | ### Write options | Name | Default | Type | Comment | -| --------------- | ------- | ---- | -------------------- | +|-----------------|---------|------|----------------------| | remove_html_all | `True` | bool | Remove all HTML tags | ### Dictionary Applications/Tools | Name & Website | Source code | License | Platforms | Language | -| ---------------------------------------------- | ----------- | ------- | --------- | -------- | +|------------------------------------------------|-------------|---------|-----------|----------| | [dictfmt](https://linux.die.net/man/1/dictfmt) | ― | GPL | Linux | | diff --git a/doc/p/dictunformat.md b/doc/p/dictunformat.md index 91e5598b4..66e898d20 100644 --- a/doc/p/dictunformat.md +++ b/doc/p/dictunformat.md @@ -1,9 +1,10 @@ -## dictunformat output file +dictunformat output file +------------------------ ### General Information | Attribute | Value | -| --------------- | ---------------------------------------------------------------------------------------------------------- | +|-----------------|------------------------------------------------------------------------------------------------------------| | 
Name | Dictunformat | | snake_case_name | dictunformat | | Description | dictunformat output file | @@ -11,21 +12,21 @@ | Read support | Yes | | Write support | No | | Single-file | Yes | -| Kind | πŸ“ text | +| Kind | πŸ“ text | | Sort-on-write | default_no | -| Sort key | (`headword_lower`) | +| Sort key | \(`headword_lower`\) | | Wiki | [Dictd](https://directory.fsf.org/wiki/Dictd) | | Website | [dictd/dictunformat.1.in - @cheusov/dictd](https://github.com/cheusov/dictd/blob/master/dictunformat.1.in) | ### Read options | Name | Default | Type | Comment | -| ------------------ | ------- | ---- | ------------------------------------- | +|--------------------|---------|------|---------------------------------------| | encoding | `utf-8` | str | Encoding/charset | -| headword_separator | `; ` | str | separator for headword and alternates | +| headword_separator | `;` | str | separator for headword and alternates | ### Dictionary Applications/Tools | Name & Website | Source code | License | Platforms | Language | -| -------------------------------------------------------- | ----------- | ------- | --------- | -------- | +|----------------------------------------------------------|-------------|---------|-----------|----------| | [dictunformat](https://linux.die.net/man/1/dictunformat) | ― | GPL | Linux | | diff --git a/doc/p/digitalnk.md b/doc/p/digitalnk.md index 6524ebb1d..cf28cb588 100644 --- a/doc/p/digitalnk.md +++ b/doc/p/digitalnk.md @@ -1,9 +1,10 @@ -## DigitalNK (SQLite3, N-Korean) +DigitalNK (SQLite3, N-Korean) +----------------------------- ### General Information | Attribute | Value | -| --------------- | -------------------------------------------------------- | +|-----------------|----------------------------------------------------------| | Name | DigitalNK | | snake_case_name | digitalnk | | Description | DigitalNK (SQLite3, N-Korean) | @@ -11,14 +12,14 @@ | Read support | Yes | | Write support | No | | Single-file | Yes | -| Kind | πŸ”’ binary | +| 
Kind | 🔢 binary | | Sort-on-write | default_no | -| Sort key | (`headword_lower`) | +| Sort key | \(`headword_lower`\) | | Wiki | ― | | Website | [@digitalprk/dicrs](https://github.com/digitalprk/dicrs) | ### Dictionary Applications/Tools | Name & Website | Source code | License | Platforms | Language | -| --------------------------------------------- | ----------- | ------------ | --------- | -------- | +|-----------------------------------------------|-------------|--------------|-----------|----------| | [Dic.rs](https://github.com/digitalprk/dicrs) | ― | BSD-2-Clause | Linux | | diff --git a/doc/p/dikt_json.md b/doc/p/dikt_json.md index b44a46b22..766dcab9b 100644 --- a/doc/p/dikt_json.md +++ b/doc/p/dikt_json.md @@ -1,9 +1,10 @@ -## DIKT JSON (.json) +DIKT JSON (.json) +----------------- ### General Information | Attribute | Value | -| --------------- | ------------------------------------ | +|-----------------|--------------------------------------| | Name | DiktJson | | snake_case_name | dikt_json | | Description | DIKT JSON (.json) | @@ -11,16 +12,16 @@ | Read support | No | | Write support | Yes | | Single-file | Yes | -| Kind | 📝 text | +| Kind | 📝 text | | Sort-on-write | default_no | -| Sort key | (`headword_lower`) | +| Sort key | \(`headword_lower`\) | | Wiki | ― | | Website | https://github.com/maxim-saplin/dikt | ### Write options | Name | Default | Type | Comment | -| ----------- | ------- | ---- | ---------------------------------------------- | +|-------------|---------|------|------------------------------------------------| | encoding | `utf-8` | str | Encoding/charset | | enable_info | `True` | bool | Enable glossary info / metedata | | resources | `True` | bool | Enable resources / data files | diff --git a/doc/p/dsl.md b/doc/p/dsl.md index 8061d73a6..1209170df 100644 --- a/doc/p/dsl.md +++ b/doc/p/dsl.md @@ -1,9 +1,10 @@ -## ABBYY Lingvo DSL (.dsl) +ABBYY Lingvo DSL (.dsl) +----------------------- ### General Information | Attribute 
| Value | -| --------------- | ---------------------------------------------------------- | +|-----------------|------------------------------------------------------------| | Name | ABBYYLingvoDSL | | snake_case_name | dsl | | Description | ABBYY Lingvo DSL (.dsl) | @@ -11,23 +12,23 @@ | Read support | Yes | | Write support | No | | Single-file | Yes | -| Kind | 📝 text | +| Kind | 📝 text | | Sort-on-write | default_no | -| Sort key | (`headword_lower`) | +| Sort key | \(`headword_lower`\) | | Wiki | [ABBYY Lingvo](https://ru.wikipedia.org/wiki/ABBYY_Lingvo) | | Website | [www.lingvo.ru](https://www.lingvo.ru/) | ### Read options -| Name | Default | Type | Comment | -| ------------- | ----------- | ---- | ---------------------------------------------- | -| encoding | | str | Encoding/charset | -| audio | `True` | bool | Enable audio objects | -| example_color | `steelblue` | str | Examples color | -| abbrev | `hover` | str | Load and apply abbreviation file (`_abrv.dsl`) | +| Name | Default | Type | Comment | +|---------------|-------------|------|-------------------------------------------------| +| encoding | | str | Encoding/charset | +| audio | `True` | bool | Enable audio objects | +| example_color | `steelblue` | str | Examples color | +| abbrev | `hover` | str | Load and apply abbreviation file (`_abrv.dsl`\) | ### Dictionary Applications/Tools | Name & Website | Source code | License | Platforms | Language | -| -------------------------------------- | ----------- | ----------- | --------------------------------------------------- | -------- | +|----------------------------------------|-------------|-------------|-----------------------------------------------------|----------| | [ABBYY Lingvo](https://www.lingvo.ru/) | ― | Proprietary | Windows, Mac, Android, iOS, Windows Mobile, Symbian | | diff --git a/doc/p/edict2.md b/doc/p/edict2.md index 308fca5d5..93ac40ac9 100644 --- a/doc/p/edict2.md +++ b/doc/p/edict2.md @@ -1,9 +1,10 @@ -## EDICT2 (CEDICT) 
(.u8) +EDICT2 (CEDICT) (.u8) +--------------------- ### General Information | Attribute | Value | -| --------------- | ---------------------------------------------- | +|-----------------|------------------------------------------------| | Name | EDICT2 | | snake_case_name | edict2 | | Description | EDICT2 (CEDICT) (.u8) | @@ -11,16 +12,16 @@ | Read support | Yes | | Write support | No | | Single-file | Yes | -| Kind | 📝 text | +| Kind | 📝 text | | Sort-on-write | default_no | -| Sort key | (`headword_lower`) | +| Sort key | \(`headword_lower`\) | | Wiki | [CEDICT](https://en.wikipedia.org/wiki/CEDICT) | | Website | ― | ### Read options | Name | Default | Type | Comment | -| ----------------- | ------- | ---- | --------------------------------------------- | +|-------------------|---------|------|-----------------------------------------------| | encoding | `utf-8` | str | Encoding/charset | | traditional_title | `False` | bool | Use traditional Chinese for entry titles/keys | | colorize_tones | `True` | bool | Set to false to disable tones coloring | diff --git a/doc/p/edlin.md b/doc/p/edlin.md index ba38b1439..6663c78c0 100644 --- a/doc/p/edlin.md +++ b/doc/p/edlin.md @@ -1,31 +1,32 @@ -## EDLIN +EDLIN +----- ### General Information -| Attribute | Value | -| --------------- | ------------------ | -| Name | Edlin | -| snake_case_name | edlin | -| Description | EDLIN | -| Extensions | `.edlin` | -| Read support | Yes | -| Write support | Yes | -| Single-file | No | -| Kind | 📁 directory | -| Sort-on-write | default_no | -| Sort key | (`headword_lower`) | -| Wiki | ― | -| Website | ― | +| Attribute | Value | +|-----------------|----------------------| +| Name | Edlin | +| snake_case_name | edlin | +| Description | EDLIN | +| Extensions | `.edlin` | +| Read support | Yes | +| Write support | Yes | +| Single-file | No | +| Kind | 📁 directory | +| Sort-on-write | default_no | +| Sort key | \(`headword_lower`\) | +| Wiki | ― | +| Website | ― | ### Read 
options | Name | Default | Type | Comment | -| -------- | ------- | ---- | ---------------- | +|----------|---------|------|------------------| | encoding | `utf-8` | str | Encoding/charset | ### Write options | Name | Default | Type | Comment | -| --------- | ------- | ---- | ----------------------------- | +|-----------|---------|------|-------------------------------| | encoding | `utf-8` | str | Encoding/charset | | prev_link | `True` | bool | Enable link to previous entry | diff --git a/doc/p/epub2.md b/doc/p/epub2.md index 20d4282c9..1c682f4a8 100644 --- a/doc/p/epub2.md +++ b/doc/p/epub2.md @@ -1,9 +1,10 @@ -## EPUB-2 E-Book +EPUB-2 E-Book +------------- ### General Information | Attribute | Value | -| --------------- | ------------------------------------------ | +|-----------------|--------------------------------------------| | Name | Epub2 | | snake_case_name | epub2 | | Description | EPUB-2 E-Book | @@ -11,7 +12,7 @@ | Read support | No | | Write support | Yes | | Single-file | Yes | -| Kind | 📦 package | +| Kind | 📦 package | | Sort-on-write | always | | Sort key | `ebook` | | Wiki | [EPUB](https://en.wikipedia.org/wiki/EPUB) | @@ -20,7 +21,7 @@ ### Write options | Name | Default | Type | Comment | -| ---------------------- | ------- | ---- | -------------------------- | +|------------------------|---------|------|----------------------------| | keep | `False` | bool | Keep temp files | | group_by_prefix_length | `2` | int | Prefix length for grouping | | include_index_page | `False` | bool | Include index page | @@ -31,7 +32,7 @@ ### Dictionary Applications/Tools | Name & Website | Source code | License | Platforms | Language | -| -------------------------------------------------------------------------- | ----------- | ----------- | ------------------- | -------- | +|----------------------------------------------------------------------------|-------------|-------------|---------------------|----------| | [calibre](https://calibre-ebook.com/) | 
― | GPL | Linux, Windows, Mac | | | [Okular](https://okular.kde.org/) | ― | GPL | Linux, Windows, Mac | | | [Book Reader](https://f-droid.org/en/packages/com.github.axet.bookreader/) | ― | GPL | Android | | diff --git a/doc/p/freedict.md b/doc/p/freedict.md index 03a145d7b..91cb671e3 100644 --- a/doc/p/freedict.md +++ b/doc/p/freedict.md @@ -1,9 +1,10 @@ -## FreeDict (.tei) +FreeDict (.tei) +--------------- ### General Information | Attribute | Value | -| --------------- | ---------------------------------------------------------------------------------- | +|-----------------|------------------------------------------------------------------------------------| | Name | FreeDict | | snake_case_name | freedict | | Description | FreeDict (.tei) | @@ -11,20 +12,20 @@ | Read support | Yes | | Write support | No | | Single-file | Yes | -| Kind | 📝 text | +| Kind | 📝 text | | Sort-on-write | default_no | -| Sort key | (`headword_lower`) | +| Sort key | \(`headword_lower`\) | | Wiki | [@freedict/fd-dictionaries/wiki](https://github.com/freedict/fd-dictionaries/wiki) | | Website | [FreeDict.org](https://freedict.org/) | ### Read options | Name | Default | Type | Comment | -| --------------- | ------- | ---- | ------------------------------------------------ | +|-----------------|---------|------|--------------------------------------------------| | discover | `False` | bool | Find and show unsupported tags | | auto_rtl | `None` | bool | Auto-detect and mark Right-to-Left text | | auto_comma | `True` | bool | Auto-detect comma sign based on text | -| comma | `, ` | str | Comma sign (following space) to use as separator | +| comma | `,` | str | Comma sign (following space) to use as separator | | word_title | `False` | bool | Add headwords title to beginning of definition | | pron_color | `gray` | str | Pronunciation color | | gram_color | `green` | str | Grammar color | diff --git a/doc/p/gettext_po.md b/doc/p/gettext_po.md index bb8a94a5c..dfa2dca31 100644 --- 
a/doc/p/gettext_po.md +++ b/doc/p/gettext_po.md @@ -1,9 +1,10 @@ -## Gettext Source (.po) +Gettext Source (.po) +-------------------- ### General Information | Attribute | Value | -| --------------- | ------------------------------------------------------------- | +|-----------------|---------------------------------------------------------------| | Name | GettextPo | | snake_case_name | gettext_po | | Description | Gettext Source (.po) | @@ -11,16 +12,16 @@ | Read support | Yes | | Write support | Yes | | Single-file | Yes | -| Kind | 📝 text | +| Kind | 📝 text | | Sort-on-write | default_no | -| Sort key | (`headword_lower`) | +| Sort key | \(`headword_lower`\) | | Wiki | [Gettext](https://en.wikipedia.org/wiki/Gettext) | | Website | [gettext - GNU Project](https://www.gnu.org/software/gettext) | ### Write options | Name | Default | Type | Comment | -| --------- | ------- | ---- | ----------------------------- | +|-----------|---------|------|-------------------------------| | resources | `True` | bool | Enable resources / data files | ### Dependencies for reading and writing @@ -36,6 +37,6 @@ pip3 install polib ### Dictionary Applications/Tools | Name & Website | Source code | License | Platforms | Language | -| ------------------------------------------------ | ----------- | --------------- | ------------------- | -------- | +|--------------------------------------------------|-------------|-----------------|---------------------|----------| | [gettext](https://www.gnu.org/software/gettext/) | ― | GPL | Linux, Windows | | | [poEdit](https://github.com/vslavik/poedit) | ― | MIT / Shareware | Linux, Windows, Mac | | diff --git a/doc/p/html_dir.md b/doc/p/html_dir.md index fd8dc5fd1..197f6c0fe 100644 --- a/doc/p/html_dir.md +++ b/doc/p/html_dir.md @@ -1,26 +1,27 @@ -## HTML Directory +HTML Directory +-------------- ### General Information -| Attribute | Value | -| --------------- | ------------------ | -| Name | HtmlDir | -| snake_case_name | html_dir | -| 
Description | HTML Directory | -| Extensions | `.hdir` | -| Read support | No | -| Write support | Yes | -| Single-file | No | -| Kind | 📁 directory | -| Sort-on-write | default_no | -| Sort key | (`headword_lower`) | -| Wiki | ― | -| Website | ― | +| Attribute | Value | +|-----------------|----------------------| +| Name | HtmlDir | +| snake_case_name | html_dir | +| Description | HTML Directory | +| Extensions | `.hdir` | +| Read support | No | +| Write support | Yes | +| Single-file | No | +| Kind | 📁 directory | +| Sort-on-write | default_no | +| Sort key | \(`headword_lower`\) | +| Wiki | ― | +| Website | ― | ### Write options | Name | Default | Type | Comment | -| --------------- | -------------- | ---- | ---------------------------------------------- | +|-----------------|----------------|------|------------------------------------------------| | encoding | `utf-8` | str | Encoding/charset | | resources | `True` | bool | Enable resources / data files | | max_file_size | `102400` | int | Maximum file size in bytes | diff --git a/doc/p/info.md b/doc/p/info.md index 4a6f01ebd..48c6ea0de 100644 --- a/doc/p/info.md +++ b/doc/p/info.md @@ -1,9 +1,10 @@ -## Glossary Info (.info) +Glossary Info (.info) +--------------------- ### General Information | Attribute | Value | -| --------------- | --------------------- | +|-----------------|-----------------------| | Name | Info | | snake_case_name | info | | Description | Glossary Info (.info) | @@ -11,8 +12,8 @@ | Read support | Yes | | Write support | Yes | | Single-file | Yes | -| Kind | 📝 text | +| Kind | 📝 text | | Sort-on-write | default_no | -| Sort key | (`headword_lower`) | +| Sort key | \(`headword_lower`\) | | Wiki | ― | | Website | ― | diff --git a/doc/p/iupac_goldbook.md b/doc/p/iupac_goldbook.md index e92a8c50c..af45f3f4d 100644 --- a/doc/p/iupac_goldbook.md +++ b/doc/p/iupac_goldbook.md @@ -1,9 +1,10 @@ -## IUPAC goldbook (.xml) +IUPAC goldbook (.xml) +--------------------- ### General 
Information | Attribute | Value | -| --------------- | --------------------------- | +|-----------------|-----------------------------| | Name | IUPACGoldbook | | snake_case_name | iupac_goldbook | | Description | IUPAC goldbook (.xml) | @@ -11,9 +12,9 @@ | Read support | Yes | | Write support | No | | Single-file | Yes | -| Kind | 📝 text | +| Kind | 📝 text | | Sort-on-write | default_no | -| Sort key | (`headword_lower`) | +| Sort key | \(`headword_lower`\) | | Wiki | ― | | Website | https://goldbook.iupac.org/ | diff --git a/doc/p/jmdict.md b/doc/p/jmdict.md index cfb95c483..56e194bab 100644 --- a/doc/p/jmdict.md +++ b/doc/p/jmdict.md @@ -1,9 +1,10 @@ -## JMDict (xml) +JMDict (xml) +------------ ### General Information | Attribute | Value | -| --------------- | ---------------------------------------------------------------- | +|-----------------|------------------------------------------------------------------| | Name | JMDict | | snake_case_name | jmdict | | Description | JMDict (xml) | @@ -11,16 +12,16 @@ | Read support | Yes | | Write support | No | | Single-file | Yes | -| Kind | 📝 text | +| Kind | 📝 text | | Sort-on-write | default_no | -| Sort key | (`headword_lower`) | +| Sort key | \(`headword_lower`\) | | Wiki | [JMdict](https://en.wikipedia.org/wiki/JMdict) | | Website | [The JMDict Project](https://www.edrdg.org/jmdict/j_jmdict.html) | ### Read options | Name | Default | Type | Comment | -| --------------- | ------- | ---- | -------------------------------------- | +|-----------------|---------|------|----------------------------------------| | example_padding | `10` | int | Padding for examples (in px) | | example_color | | str | Examples color | | translitation | `False` | bool | Add translitation (romaji) of keywords | diff --git a/doc/p/jmnedict.md b/doc/p/jmnedict.md index cd15d4942..5a17d7089 100644 --- a/doc/p/jmnedict.md +++ b/doc/p/jmnedict.md @@ -1,9 +1,10 @@ -## JMnedict +JMnedict +-------- ### General Information | Attribute | 
Value | -| --------------- | ------------------------------------------------------------ | +|-----------------|--------------------------------------------------------------| | Name | JMnedict | | snake_case_name | jmnedict | | Description | JMnedict | @@ -11,9 +12,9 @@ | Read support | Yes | | Write support | No | | Single-file | Yes | -| Kind | 📝 text | +| Kind | 📝 text | | Sort-on-write | default_no | -| Sort key | (`headword_lower`) | +| Sort key | \(`headword_lower`\) | | Wiki | [JMdict](https://en.wikipedia.org/wiki/JMdict) | | Website | [EDRDG Wiki](https://www.edrdg.org/wiki/index.php/Main_Page) | diff --git a/doc/p/json.md b/doc/p/json.md index 8822441e8..6574949e2 100644 --- a/doc/p/json.md +++ b/doc/p/json.md @@ -1,9 +1,10 @@ -## JSON (.json) +JSON (.json) +------------ ### General Information | Attribute | Value | -| --------------- | ------------------------------------------------- | +|-----------------|---------------------------------------------------| | Name | Json | | snake_case_name | json | | Description | JSON (.json) | @@ -11,16 +12,16 @@ | Read support | No | | Write support | Yes | | Single-file | Yes | -| Kind | 📝 text | +| Kind | 📝 text | | Sort-on-write | default_no | -| Sort key | (`headword_lower`) | +| Sort key | \(`headword_lower`\) | | Wiki | [JSON](https://en.wikipedia.org/wiki/JSON) | | Website | [www.json.org](https://www.json.org/json-en.html) | ### Write options | Name | Default | Type | Comment | -| ----------- | ------- | ---- | ---------------------------------------------- | +|-------------|---------|------|------------------------------------------------| | encoding | `utf-8` | str | Encoding/charset | | enable_info | `True` | bool | Enable glossary info / metedata | | resources | `True` | bool | Enable resources / data files | diff --git a/doc/p/kobo.md b/doc/p/kobo.md index 3617ad9e6..179e15047 100644 --- a/doc/p/kobo.md +++ b/doc/p/kobo.md @@ -1,9 +1,10 @@ -## Kobo E-Reader Dictionary +Kobo E-Reader 
Dictionary +------------------------ ### General Information | Attribute | Value | -| --------------- | ---------------------------------------------------------- | +|-----------------|------------------------------------------------------------| | Name | Kobo | | snake_case_name | kobo | | Description | Kobo E-Reader Dictionary | @@ -11,9 +12,9 @@ | Read support | No | | Write support | Yes | | Single-file | No | -| Kind | 📦 package | +| Kind | 📦 package | | Sort-on-write | never | -| Sort key | (`headword_lower`) | +| Sort key | \(`headword_lower`\) | | Wiki | [Kobo eReader](https://en.wikipedia.org/wiki/Kobo_eReader) | | Website | [www.kobo.com](https://www.kobo.com) | @@ -30,5 +31,5 @@ pip3 install marisa-trie ### Dictionary Applications/Tools | Name & Website | Source code | License | Platforms | Language | -| ------------------------------------ | ----------- | ----------- | ------------ | -------- | +|--------------------------------------|-------------|-------------|--------------|----------| | [Kobo eReader](https://www.kobo.com) | ― | Proprietary | Kobo eReader | | diff --git a/doc/p/kobo_dictfile.md b/doc/p/kobo_dictfile.md index 91a9e459b..9c01ce269 100644 --- a/doc/p/kobo_dictfile.md +++ b/doc/p/kobo_dictfile.md @@ -1,9 +1,10 @@ -## Kobo E-Reader Dictfile (.df) +Kobo E-Reader Dictfile (.df) +---------------------------- ### General Information | Attribute | Value | -| --------------- | --------------------------------------------------------------------------- | +|-----------------|-----------------------------------------------------------------------------| | Name | Dictfile | | snake_case_name | kobo_dictfile | | Description | Kobo E-Reader Dictfile (.df) | @@ -11,23 +12,23 @@ | Read support | Yes | | Write support | Yes | | Single-file | Yes | -| Kind | 📝 text | +| Kind | 📝 text | | Sort-on-write | default_no | -| Sort key | (`headword_lower`) | +| Sort key | \(`headword_lower`\) | | Wiki | ― | | Website | [dictgen - 
dictutil](https://pgaskin.net/dictutil/dictgen/#dictfile-format) | ### Read options | Name | Default | Type | Comment | -| --------------------- | ------- | ---- | --------------------- | +|-----------------------|---------|------|-----------------------| | encoding | `utf-8` | str | Encoding/charset | | extract_inline_images | `True` | bool | Extract inline images | ### Write options | Name | Default | Type | Comment | -| -------- | ------- | ---- | ---------------- | +|----------|---------|------|------------------| | encoding | `utf-8` | str | Encoding/charset | ### Dependencies for reading @@ -43,5 +44,5 @@ pip3 install mistune==3.0.1 ### Dictionary Applications/Tools | Name & Website | Source code | License | Platforms | Language | -| ------------------------------------------------ | ----------- | ------- | ------------------- | -------- | +|--------------------------------------------------|-------------|---------|---------------------|----------| | [dictgen](https://pgaskin.net/dictutil/dictgen/) | ― | MIT | Linux, Windows, Mac | | diff --git a/doc/p/lingoes_ldf.md b/doc/p/lingoes_ldf.md index 27a1c6667..3b0e1d1e0 100644 --- a/doc/p/lingoes_ldf.md +++ b/doc/p/lingoes_ldf.md @@ -1,9 +1,10 @@ -## Lingoes Source (.ldf) +Lingoes Source (.ldf) +--------------------- ### General Information | Attribute | Value | -| --------------- | ------------------------------------------------------------------- | +|-----------------|---------------------------------------------------------------------| | Name | LingoesLDF | | snake_case_name | lingoes_ldf | | Description | Lingoes Source (.ldf) | @@ -11,27 +12,27 @@ | Read support | Yes | | Write support | Yes | | Single-file | Yes | -| Kind | 📝 text | +| Kind | 📝 text | | Sort-on-write | default_no | -| Sort key | (`headword_lower`) | +| Sort key | \(`headword_lower`\) | | Wiki | [Lingoes](https://en.wikipedia.org/wiki/Lingoes) | | Website | [Lingoes.net](http://www.lingoes.net/en/dictionary/dict_format.php) | ### 
Read options | Name | Default | Type | Comment | -| -------- | ------- | ---- | ---------------- | +|----------|---------|------|------------------| | encoding | `utf-8` | str | Encoding/charset | ### Write options | Name | Default | Type | Comment | -| --------- | ------- | ---- | ----------------------------- | +|-----------|---------|------|-------------------------------| | newline | `\n` | str | Newline string | | resources | `True` | bool | Enable resources / data files | ### Dictionary Applications/Tools | Name & Website | Source code | License | Platforms | Language | -| ---------------------------------------------------------------------------------- | ----------- | ------- | --------- | -------- | +|------------------------------------------------------------------------------------|-------------|---------|-----------|----------| | [Lingoes Dictionary Creator](http://www.lingoes.net/en/dictionary/dict_format.php) | ― | Unknown | | | diff --git a/doc/p/mobi.md b/doc/p/mobi.md index aaa7beb1f..515a6cca0 100644 --- a/doc/p/mobi.md +++ b/doc/p/mobi.md @@ -1,9 +1,10 @@ -## Mobipocket (.mobi) E-Book +Mobipocket (.mobi) E-Book +------------------------- ### General Information | Attribute | Value | -| --------------- | ------------------------------------------------------ | +|-----------------|--------------------------------------------------------| | Name | Mobi | | snake_case_name | mobi | | Description | Mobipocket (.mobi) E-Book | @@ -11,7 +12,7 @@ | Read support | No | | Write support | Yes | | Single-file | No | -| Kind | 📦 package | +| Kind | 📦 package | | Sort-on-write | default_yes | | Sort key | `ebook` | | Wiki | [Mobipocket](https://en.wikipedia.org/wiki/Mobipocket) | @@ -20,7 +21,7 @@ ### Write options | Name | Default | Type | Comment | -| ---------------------- | -------- | ---- | -------------------------------------------------------------- | 
+|------------------------|----------|------|----------------------------------------------------------------| | keep | `False` | bool | Keep temp files | | group_by_prefix_length | `2` | int | Prefix length for grouping | | css | | str | Path to css file | @@ -38,7 +39,7 @@ Install [KindleGen](https://wiki.mobileread.com/wiki/KindleGen) for creating Mob ### Dictionary Applications/Tools | Name & Website | Source code | License | Platforms | Language | -| -------------------------------------------------------------------------- | ----------- | ----------- | ------------------- | -------- | +|----------------------------------------------------------------------------|-------------|-------------|---------------------|----------| | [Amazon Kindle](https://www.amazon.com/kindle) | ― | Proprietary | Amazon Kindle | | | [calibre](https://calibre-ebook.com/) | ― | GPL | Linux, Windows, Mac | | | [Okular](https://okular.kde.org/) | ― | GPL | Linux, Windows, Mac | | diff --git a/doc/p/octopus_mdict.md b/doc/p/octopus_mdict.md index f60219408..6dbde208a 100644 --- a/doc/p/octopus_mdict.md +++ b/doc/p/octopus_mdict.md @@ -1,26 +1,27 @@ -## Octopus MDict (.mdx) +Octopus MDict (.mdx) +-------------------- ### General Information -| Attribute | Value | -| --------------- | --------------------------------------------------------------------- | -| Name | OctopusMdict | -| snake_case_name | octopus_mdict | -| Description | Octopus MDict (.mdx) | -| Extensions | `.mdx` | -| Read support | Yes | -| Write support | No | -| Single-file | No | -| Kind | 🔢 binary | -| Sort-on-write | default_no | -| Sort key | (`headword_lower`) | -| Wiki | ― | -| Website | [Download \| MDict.cn](https://www.mdict.cn/wp/?page_id=5325&lang=en) | +| Attribute | Value | +|-----------------|----------------------------------------------------------------------| +| Name | OctopusMdict | +| snake_case_name | octopus_mdict | +| Description | Octopus MDict (.mdx) | +| Extensions | `.mdx` | +| Read support 
| Yes | +| Write support | No | +| Single-file | No | +| Kind | 🔢 binary | +| Sort-on-write | default_no | +| Sort key | \(`headword_lower`\) | +| Wiki | ― | +| Website | [Download | MDict.cn](https://www.mdict.cn/wp/?page_id=5325&lang=en) | ### Read options | Name | Default | Type | Comment | -| ------------------- | ------- | ---- | ----------------------------------- | +|---------------------|---------|------|-------------------------------------| | encoding | | str | Encoding/charset | | substyle | `True` | bool | Enable substyle | | same_dir_data_files | `False` | bool | Read data files from same directory | @@ -28,11 +29,10 @@ ### `python-lzo` is required for **some** MDX glossaries. -First try converting your MDX file, if failed (`AssertionError` probably), -then try to install [LZO library and Python binding](../lzo.md). +First try converting your MDX file, if failed (`AssertionError` probably), then try to install [LZO library and Python binding](../lzo.md). ### Dictionary Applications/Tools | Name & Website | Source code | License | Platforms | Language | -| ------------------------------ | ----------- | ----------- | -------------------------- | -------- | +|--------------------------------|-------------|-------------|----------------------------|----------| | [MDict](https://www.mdict.cn/) | ― | Proprietary | Android, iOS, Windows, Mac | | diff --git a/doc/p/quickdic6.md b/doc/p/quickdic6.md index 58bdf4c44..f2032d9fb 100644 --- a/doc/p/quickdic6.md +++ b/doc/p/quickdic6.md @@ -1,9 +1,10 @@ -## QuickDic version 6 (.quickdic) +QuickDic version 6 (.quickdic) +------------------------------ ### General Information | Attribute | Value | -| --------------- | ------------------------------------------------------------------------------ | +|-----------------|--------------------------------------------------------------------------------| | Name | QuickDic6 | | snake_case_name | quickdic6 | | Description | QuickDic version 6 (.quickdic) | @@ -11,16 +12,16 
@@ | Read support | Yes | | Write support | Yes | | Single-file | Yes | -| Kind | 🔢 binary | +| Kind | 🔢 binary | | Sort-on-write | never | -| Sort key | (`headword_lower`) | +| Sort key | \(`headword_lower`\) | | Wiki | ― | | Website | [github.com/rdoeffinger/Dictionary](https://github.com/rdoeffinger/Dictionary) | ### Write options | Name | Default | Type | Comment | -| ---------------- | ------- | ---- | --------------------------------------------- | +|------------------|---------|------|-----------------------------------------------| | normalizer_rules | | str | ICU normalizer rules to use for index sorting | ### Dependencies for reading @@ -36,6 +37,6 @@ pip3 install PyICU ### Dictionary Applications/Tools | Name & Website | Source code | License | Platforms | Language | -| ---------------------------------------------------------------------------------------- | ------------------------------------------------------------------------ | ------------------ | --------- | -------- | +|------------------------------------------------------------------------------------------|--------------------------------------------------------------------------|--------------------|-----------|----------| | [Dictionary](https://play.google.com/store/apps/details?id=de.reimardoeffinger.quickdic) | [@rdoeffinger/Dictionary](https://github.com/rdoeffinger/Dictionary) | Apache License 2.0 | Android | Java | | [DictionaryPC](https://github.com/rdoeffinger/DictionaryPC) | [@rdoeffinger/DictionaryPC](https://github.com/rdoeffinger/DictionaryPC) | Apache License 2.0 | Windows | Java | diff --git a/doc/p/sql.md b/doc/p/sql.md index 17aaab5f2..55dd8861d 100644 --- a/doc/p/sql.md +++ b/doc/p/sql.md @@ -1,9 +1,10 @@ -## SQL (.sql) +SQL (.sql) +---------- ### General Information | Attribute | Value | -| --------------- | ---------------------------------------- | +|-----------------|------------------------------------------| | Name | Sql | | snake_case_name | sql | | Description | 
SQL (.sql) | @@ -11,16 +12,16 @@ | Read support | No | | Write support | Yes | | Single-file | Yes | -| Kind | 📝 text | +| Kind | 📝 text | | Sort-on-write | default_no | -| Sort key | (`headword_lower`) | +| Sort key | \(`headword_lower`\) | | Wiki | [SQL](https://en.wikipedia.org/wiki/SQL) | | Website | ― | ### Write options | Name | Default | Type | Comment | -| -------------- | ------- | ---- | ---------------------------- | +|----------------|---------|------|------------------------------| | encoding | `utf-8` | str | Encoding/charset | | info_keys | `None` | list | List of dbinfo table columns | | add_extra_info | `True` | bool | Create dbinfo_extra table | diff --git a/doc/p/stardict.md b/doc/p/stardict.md index 27073967f..90062e19f 100644 --- a/doc/p/stardict.md +++ b/doc/p/stardict.md @@ -1,9 +1,10 @@ -## StarDict (.ifo) +StarDict (.ifo) +--------------- ### General Information | Attribute | Value | -| --------------- | ---------------------------------------------------- | +|-----------------|------------------------------------------------------| | Name | Stardict | | snake_case_name | stardict | | Description | StarDict (.ifo) | @@ -11,7 +12,7 @@ | Read support | Yes | | Write support | Yes | | Single-file | No | -| Kind | 📁 directory | +| Kind | 📁 directory | | Sort-on-write | always | | Sort key | `stardict` | | Wiki | [StarDict](https://en.wikipedia.org/wiki/StarDict) | @@ -20,7 +21,7 @@ ### Read options | Name | Default | Type | Comment | -| -------------- | -------- | ---- | --------------------------------------- | +|----------------|----------|------|-----------------------------------------| | xdxf_to_html | `True` | bool | Convert XDXF entries to HTML | | xsl | `False` | bool | Use XSL transformation | | unicode_errors | `strict` | str | What to do with Unicode decoding errors | @@ -28,7 +29,7 @@ ### Write options | Name | Default | Type | Comment | -| ---------------- | ------- | ---- | 
----------------------------------------------- | +|------------------|---------|------|-------------------------------------------------| | large_file | `False` | bool | Use idxoffsetbits=64 bits, for large files only | | dictzip | `True` | bool | Compress .dict file to .dict.dz | | sametypesequence | | str | Definition format: h=html, m=plaintext, x=xdxf | @@ -41,7 +42,7 @@ ### Dictionary Applications/Tools | Name & Website | Source code | License | Platforms | Language | -| ----------------------------------------------------------------------------------------- | ------------------------------------------------------------------------ | ----------- | ----------------------------------------------------------- | -------- | +|-------------------------------------------------------------------------------------------|--------------------------------------------------------------------------|-------------|-------------------------------------------------------------|----------| | [AyanDict](https://github.com/ilius/ayandict) | [@ilius/ayandict](https://github.com/ilius/ayandict) | GPL | Linux, Windows, Mac | Go | | [The Next Generation GoldenDict](https://github.com/xiaoyifang/goldendict-ng) | [@xiaoyifang/goldendict-ng](https://github.com/xiaoyifang/goldendict-ng) | GPL | Linux, Windows, Mac | C++ | | [GoldenDict](http://goldendict.org/) | [@goldendict/goldendict](https://github.com/goldendict/goldendict) | GPL | Linux, Windows, Mac | C++ | diff --git a/doc/p/stardict_textual.md b/doc/p/stardict_textual.md index 6f1a829f0..cfc982b21 100644 --- a/doc/p/stardict_textual.md +++ b/doc/p/stardict_textual.md @@ -1,9 +1,10 @@ -## StarDict Textual File (.xml) +StarDict Textual File (.xml) +---------------------------- ### General Information | Attribute | Value | -| --------------- | ------------------------------------------------------------------------------------------------------------------------ | 
+|-----------------|--------------------------------------------------------------------------------------------------------------------------| | Name | StardictTextual | | snake_case_name | stardict_textual | | Description | StarDict Textual File (.xml) | @@ -11,7 +12,7 @@ | Read support | Yes | | Write support | Yes | | Single-file | Yes | -| Kind | 📝 text | +| Kind | 📝 text | | Sort-on-write | default_no | | Sort key | `stardict` | | Wiki | ― | @@ -20,14 +21,14 @@ ### Read options | Name | Default | Type | Comment | -| ------------ | ------- | ---- | ---------------------------- | +|--------------|---------|------|------------------------------| | encoding | `utf-8` | str | Encoding/charset | | xdxf_to_html | `True` | bool | Convert XDXF entries to HTML | ### Write options | Name | Default | Type | Comment | -| -------- | ------- | ---- | ---------------- | +|----------|---------|------|------------------| | encoding | `utf-8` | str | Encoding/charset | ### Dependencies for reading and writing @@ -43,5 +44,5 @@ pip3 install lxml ### Dictionary Applications/Tools | Name & Website | Source code | License | Platforms | Language | -| -------------------------------------------------------------------------------------------- | ------------------------------------------------------------------ | ------- | ------------------- | -------- | +|----------------------------------------------------------------------------------------------|--------------------------------------------------------------------|---------|---------------------|----------| | [StarDict-Editor (Tools)](https://github.com/huzheng001/stardict-3/blob/master/tools/README) | [@huzheng001/stardict-3](https://github.com/huzheng001/stardict-3) | GPL | Linux, Windows, Mac | C | diff --git a/doc/p/tabfile.md b/doc/p/tabfile.md index 7c62ceb03..6b78c1f02 100644 --- a/doc/p/tabfile.md +++ b/doc/p/tabfile.md @@ -1,9 +1,10 @@ -## Tabfile (.txt, .dic) +Tabfile (.txt, .dic) +-------------------- ### General 
Information | Attribute | Value | -| --------------- | -------------------------------------------------------------------------- | +|-----------------|----------------------------------------------------------------------------| | Name | Tabfile | | snake_case_name | tabfile | | Description | Tabfile (.txt, .dic) | @@ -11,22 +12,22 @@ | Read support | Yes | | Write support | Yes | | Single-file | Yes | -| Kind | 📝 text | +| Kind | 📝 text | | Sort-on-write | default_no | -| Sort key | (`headword_lower`) | +| Sort key | \(`headword_lower`\) | | Wiki | [Tab-separated values](https://en.wikipedia.org/wiki/Tab-separated_values) | | Website | ― | ### Read options | Name | Default | Type | Comment | -| -------- | ------- | ---- | ---------------- | +|----------|---------|------|------------------| | encoding | `utf-8` | str | Encoding/charset | ### Write options | Name | Default | Type | Comment | -| ---------------- | ------- | ---- | --------------------------------------------------------------- | +|------------------|---------|------|-----------------------------------------------------------------| | encoding | `utf-8` | str | Encoding/charset | | enable_info | `True` | bool | Enable glossary info / metedata | | resources | `True` | bool | Enable resources / data files | @@ -36,5 +37,5 @@ ### Dictionary Applications/Tools | Name & Website | Source code | License | Platforms | Language | -| -------------------------------------------------------------------------------------------- | ------------------------------------------------------------------ | ------- | ------------------- | -------- | +|----------------------------------------------------------------------------------------------|--------------------------------------------------------------------|---------|---------------------|----------| | [StarDict-Editor (Tools)](https://github.com/huzheng001/stardict-3/blob/master/tools/README) | [@huzheng001/stardict-3](https://github.com/huzheng001/stardict-3) 
| GPL | Linux, Windows, Mac | C | diff --git a/doc/p/wiktextract.md b/doc/p/wiktextract.md index b723ddbd4..3f412bc6b 100644 --- a/doc/p/wiktextract.md +++ b/doc/p/wiktextract.md @@ -1,9 +1,10 @@ -## Wiktextract (.jsonl) +Wiktextract (.jsonl) +-------------------- ### General Information | Attribute | Value | -| --------------- | -------------------------------------------------------------------- | +|-----------------|----------------------------------------------------------------------| | Name | Wiktextract | | snake_case_name | wiktextract | | Description | Wiktextract (.jsonl) | @@ -11,16 +12,16 @@ | Read support | Yes | | Write support | No | | Single-file | Yes | -| Kind | 📝 text | +| Kind | 📝 text | | Sort-on-write | default_no | -| Sort key | (`headword_lower`) | +| Sort key | \(`headword_lower`\) | | Wiki | ― | | Website | [@tatuylonen/wiktextract](https://github.com/tatuylonen/wiktextract) | ### Read options | Name | Default | Type | Comment | -| --------------- | ---------------- | ---- | ---------------------------------------------- | +|-----------------|------------------|------|------------------------------------------------| | word_title | `False` | bool | Add headwords title to beginning of definition | | pron_color | `gray` | str | Pronunciation color | | gram_color | `green` | str | Grammar color | diff --git a/doc/p/wordnet.md b/doc/p/wordnet.md index 1c57b11fa..9773dcd68 100644 --- a/doc/p/wordnet.md +++ b/doc/p/wordnet.md @@ -1,9 +1,10 @@ -## WordNet +WordNet +------- ### General Information | Attribute | Value | -| --------------- | -------------------------------------------------------------------------- | +|-----------------|----------------------------------------------------------------------------| | Name | Wordnet | | snake_case_name | wordnet | | Description | WordNet | @@ -11,8 +12,8 @@ | Read support | Yes | | Write support | No | | Single-file | No | -| Kind | 📁 directory | +| Kind | 📁 directory | | Sort-on-write | 
default_no | -| Sort key | (`headword_lower`) | +| Sort key | \(`headword_lower`\) | | Wiki | [WordNet](https://en.wikipedia.org/wiki/WordNet) | | Website | [WordNet - A Lexical Database for English](https://wordnet.princeton.edu/) | diff --git a/doc/p/wordset.md b/doc/p/wordset.md index e6087c5b2..012974da7 100644 --- a/doc/p/wordset.md +++ b/doc/p/wordset.md @@ -1,9 +1,10 @@ -## Wordset.org JSON directory +Wordset.org JSON directory +-------------------------- ### General Information | Attribute | Value | -| --------------- | ---------------------------------------------------------------------------- | +|-----------------|------------------------------------------------------------------------------| | Name | Wordset | | snake_case_name | wordset | | Description | Wordset.org JSON directory | @@ -11,14 +12,14 @@ | Read support | Yes | | Write support | No | | Single-file | No | -| Kind | 📁 directory | +| Kind | 📁 directory | | Sort-on-write | default_no | -| Sort key | (`headword_lower`) | +| Sort key | \(`headword_lower`\) | | Wiki | ― | | Website | [@wordset/wordset-dictionary](https://github.com/wordset/wordset-dictionary) | ### Read options | Name | Default | Type | Comment | -| -------- | ------- | ---- | ---------------- | +|----------|---------|------|------------------| | encoding | `utf-8` | str | Encoding/charset | diff --git a/doc/p/xdxf.md b/doc/p/xdxf.md index 6c00901bd..a96ade4b3 100644 --- a/doc/p/xdxf.md +++ b/doc/p/xdxf.md @@ -1,9 +1,10 @@ -## XDXF (.xdxf) +XDXF (.xdxf) +------------ ### General Information | Attribute | Value | -| --------------- | -------------------------------------------------------------------------------------------------------------- | +|-----------------|----------------------------------------------------------------------------------------------------------------| | Name | Xdxf | | snake_case_name | xdxf | | Description | XDXF (.xdxf) | @@ -11,16 +12,16 @@ | Read support | Yes | | Write support | No | | 
Single-file | Yes | -| Kind | 📝 text | +| Kind | 📝 text | | Sort-on-write | default_no | -| Sort key | (`headword_lower`) | +| Sort key | \(`headword_lower`\) | | Wiki | [XDXF](https://en.wikipedia.org/wiki/XDXF) | | Website | [XDXF standard - @soshial/xdxf_makedict](https://github.com/soshial/xdxf_makedict/tree/master/format_standard) | ### Read options | Name | Default | Type | Comment | -| ---- | ------- | ---- | ---------------------- | +|------|---------|------|------------------------| | html | `True` | bool | Entries are HTML | | xsl | `False` | bool | Use XSL transformation | @@ -37,7 +38,7 @@ pip3 install lxml ### Dictionary Applications/Tools | Name & Website | Source code | License | Platforms | Language | -| -------------------------------------------------------------------- | ------------------------------------------------------------------ | ----------- | ---------------------------- | -------- | +|----------------------------------------------------------------------|--------------------------------------------------------------------|-------------|------------------------------|----------| | [GoldenDict by xiaoyifang](https://github.com/xiaoyifang/goldendict) | [@xiaoyifang/goldendict](https://github.com/xiaoyifang/goldendict) | GPL | Linux, Windows, Mac | C++ | | [GoldenDict](http://goldendict.org/) | [@goldendict/goldendict](https://github.com/goldendict/goldendict) | GPL | Linux, Windows, Mac | C++ | | [QTranslate](https://quest-app.appspot.com/) | ― | Proprietary | Windows | C++ | diff --git a/doc/p/xdxf_css.md b/doc/p/xdxf_css.md index 59592fcf3..1fb298f3e 100644 --- a/doc/p/xdxf_css.md +++ b/doc/p/xdxf_css.md @@ -1,9 +1,10 @@ -## XDXF with CSS and JS +XDXF with CSS and JS +-------------------- ### General Information | Attribute | Value | -| --------------- | -------------------------------------------------------------------------------------------------------------- | 
+|-----------------|----------------------------------------------------------------------------------------------------------------| | Name | XdxfCss | | snake_case_name | xdxf_css | | Description | XDXF with CSS and JS | @@ -11,16 +12,16 @@ | Read support | Yes | | Write support | No | | Single-file | Yes | -| Kind | 📝 text | +| Kind | 📝 text | | Sort-on-write | default_no | -| Sort key | (`headword_lower`) | +| Sort key | \(`headword_lower`\) | | Wiki | [XDXF](https://en.wikipedia.org/wiki/XDXF) | | Website | [XDXF standard - @soshial/xdxf_makedict](https://github.com/soshial/xdxf_makedict/tree/master/format_standard) | ### Read options | Name | Default | Type | Comment | -| ---- | ------- | ---- | ---------------- | +|------|---------|------|------------------| | html | `True` | bool | Entries are HTML | ### Dependencies for reading diff --git a/doc/p/xdxf_lax.md b/doc/p/xdxf_lax.md index 735497a26..8176f9949 100644 --- a/doc/p/xdxf_lax.md +++ b/doc/p/xdxf_lax.md @@ -1,9 +1,10 @@ -## XDXF Lax (.xdxf) +XDXF Lax (.xdxf) +---------------- ### General Information | Attribute | Value | -| --------------- | -------------------------------------------------------------------------------------------------------------- | +|-----------------|----------------------------------------------------------------------------------------------------------------| | Name | XdxfLax | | snake_case_name | xdxf_lax | | Description | XDXF Lax (.xdxf) | @@ -11,16 +12,16 @@ | Read support | Yes | | Write support | No | | Single-file | Yes | -| Kind | 📝 text | +| Kind | 📝 text | | Sort-on-write | default_no | -| Sort key | (`headword_lower`) | +| Sort key | \(`headword_lower`\) | | Wiki | [XDXF](https://en.wikipedia.org/wiki/XDXF) | | Website | [XDXF standard - @soshial/xdxf_makedict](https://github.com/soshial/xdxf_makedict/tree/master/format_standard) | ### Read options | Name | Default | Type | Comment | -| ---- | ------- | ---- | ---------------------- | 
+|------|---------|------|------------------------| | html | `True` | bool | Entries are HTML | | xsl | `False` | bool | Use XSL transformation | diff --git a/doc/p/yomichan.md b/doc/p/yomichan.md index 3b6a336a0..b660cecdb 100644 --- a/doc/p/yomichan.md +++ b/doc/p/yomichan.md @@ -1,9 +1,10 @@ -## Yomichan (.zip) +Yomichan (.zip) +--------------- ### General Information | Attribute | Value | -| --------------- | ----------------------------------------------------- | +|-----------------|-------------------------------------------------------| | Name | Yomichan | | snake_case_name | yomichan | | Description | Yomichan (.zip) | @@ -11,7 +12,7 @@ | Read support | No | | Write support | Yes | | Single-file | Yes | -| Kind | πŸ“¦ package | +| Kind | πŸ“¦ package | | Sort-on-write | always | | Sort key | `headword` | | Wiki | ― | @@ -19,20 +20,20 @@ ### Write options -| Name | Default | Type | Comment | -| ---------------------------- | ------- | ---- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -| term_bank_size | `10000` | int | The number of terms in each term bank json file. | -| term_from_headword_only | `True` | bool | If set to true, only create a term for the headword for each entry, as opposed to create one term for each alternate word. If the headword is ignored by the `ignore_word_with_pattern` option, the next word in the alternate list that is not ignored is used as headword. 
| -| no_term_from_reading | `True` | bool | When there are multiple alternate words, don't create term for the one that is the same as the the reading form, which is chosen to be the first alternate forms that consists solely of Hiragana and Katakana. For example, an entry could contain both 'γ γ„γŒγ' and '倧学' as alternate words. Setting this option to true would prevent a term to be created for the former. | -| delete_word_pattern | | str | When given, all non-overlapping matches of this regular expression are removed from word strings. For example, if an entry has word 'あま·い', setting the pattern to `Β·` removes all center dots, or more precisely use `Β·(?=[\u3040-\u309F])` to only remove center dots that precede Hiragana characters. Either way, the original word is replaced with 'あまい'. | +| Name | Default | Type | Comment | +|------------------------------|---------|------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| +| term_bank_size | `10000` | int | The number of terms in each term bank json file. | +| term_from_headword_only | `True` | bool | If set to true, only create a term for the headword for each entry, as opposed to create one term for each alternate word. If the headword is ignored by the `ignore_word_with_pattern` option, the next word in the alternate list that is not ignored is used as headword. 
| +| no_term_from_reading | `True` | bool | When there are multiple alternate words, don't create term for the one that is the same as the the reading form, which is chosen to be the first alternate forms that consists solely of Hiragana and Katakana. For example, an entry could contain both 'だいがく' and '大学' as alternate words. Setting this option to true would prevent a term to be created for the former. | +| delete_word_pattern | | str | When given, all non-overlapping matches of this regular expression are removed from word strings. For example, if an entry has word 'あま·い', setting the pattern to `·` removes all center dots, or more precisely use `·(?=[\u3040-\u309F])` to only remove center dots that precede Hiragana characters. Either way, the original word is replaced with 'あまい'. | | ignore_word_with_pattern | | str | When given, don't create terms for a word if any of its substrings matches this regular expression. For example, an entry could contain both 'だいがく【大学】' and '大学' as alternate words. Setting this option with value `r'【.+】'` would prevent a term to be created for the former. | -| alternates_from_word_pattern | | str | When given, the regular expression is used to find additional alternate words for the same entry from matching substrings in the original words. 
If there are no capturing groups in the regular expression, then all matched substrings are added to the list of alternate words. If there are capturing groups, then substrings matching the groups are added to the alternate words list instead. For example, if an entry has 'だいがく【大学】' as a word, then `\w+(?=【)` adds 'だいがく' as an additional word, while `(\w+)【(\w+)】` adds both 'だいがく' and '大学'. | +| alternates_from_word_pattern | | str | When given, the regular expression is used to find additional alternate words for the same entry from matching substrings in the original words. If there are no capturing groups in the regular expression, then all matched substrings are added to the list of alternate words. If there are capturing groups, then substrings matching the groups are added to the alternate words list instead. For example, if an entry has 'だいがく【大学】' as a word, then `\w+(?=【)` adds 'だいがく' as an additional word, while `(\w+)【(\w+)】` adds both 'だいがく' and '大学'. | | alternates_from_defi_pattern | | str | When given, the regular expression is used to find additional alternate words for the same entry from matching substrings in the definition. `^` and `$` can be used to match start and end of lines, respectively. If there are no capturing groups in the regular expression, then all matched substrings are added to the list of alternate words. If there are capturing groups, then substrings matching the groups are added to the alternate words list instead. For example, if an entry has 'だいがく【大学】' in its definition, then `\w+【(\w+)】` adds '大学' as an additional word. | -| rule_v1_defi_pattern | | str | When given, if any substring of an entry's definition matches this regular expression, then the term(s) created from entry are labeled as ichidan verb. Yomichan uses this information to match conjugated forms of words. `^` and `$` can be used to match start and end of lines, respectively. 
For example, setting this option to `^\(動[上下]一\)$` identifies entries where there's a line of '(動上一)' or '(動下一)'. | -| rule_v5_defi_pattern | | str | When given, if any substring of an entry's definition matches this regular expression, then the term(s) created from entry are labeled as godan verb. Yomichan uses this information to match conjugated forms of words. `^` and `$` can be used to match start and end of lines, respectively. For example, setting this option to `^\(動五\)$` identifies entries where there's a line of '(動五)'. 
| -| rule_vs_defi_pattern | | str | When given, if any substring of an entry's definition matches this regular expression, then the term(s) created from entry are labeled as suru verb. Yomichan uses this information to match conjugated forms of words. `^` and `$` can be used to match start and end of lines, respectively. For example, setting this option to `^スル$` identifies entries where there's a line of 'スル'. | -| rule_vk_defi_pattern | | str | When given, if any substring of an entry's definition matches this regular expression, then the term(s) created from entry are labeled as kuru verb. Yomichan uses this information to match conjugated forms of words. `^` and `$` can be used to match start and end of lines, respectively. For example, setting this option to `^\(動カ変\)$` identifies entries where there's a line of '(動カ変)'. | -| rule_adji_defi_pattern | | str | When given, if any substring of an entry's definition matches this regular expression, then the term(s) created from entry are labeled as i-adjective. Yomichan uses this information to match conjugated forms of words. `^` and `$` can be used to match start and end of lines, respectively. For example, setting this option to `r'^\(形\)$'` identify entries where there's a line of '(形)'. | +| rule_v1_defi_pattern | | str | When given, if any substring of an entry's definition matches this regular expression, then the term(s) created from entry are labeled as ichidan verb. Yomichan uses this information to match conjugated forms of words. `^` and `$` can be used to match start and end of lines, respectively. For example, setting this option to `^\(動[上下]一\)$` identifies entries where there's a line of '(動上一)' or '(動下一)'. | +| rule_v5_defi_pattern | | str | When given, if any substring of an entry's definition matches this regular expression, then the term(s) created from entry are labeled as godan verb. Yomichan uses this information to match conjugated forms of words. 
`^` and `$` can be used to match start and end of lines, respectively. For example, setting this option to `^\(動五\)$` identifies entries where there's a line of '(動五)'. | +| rule_vs_defi_pattern | | str | When given, if any substring of an entry's definition matches this regular expression, then the term(s) created from entry are labeled as suru verb. Yomichan uses this information to match conjugated forms of words. `^` and `$` can be used to match start and end of lines, respectively. For example, setting this option to `^スル$` identifies entries where there's a line of 'スル'. | +| rule_vk_defi_pattern | | str | When given, if any substring of an entry's definition matches this regular expression, then the term(s) created from entry are labeled as kuru verb. Yomichan uses this information to match conjugated forms of words. `^` and `$` can be used to match start and end of lines, respectively. For example, setting this option to `^\(動カ変\)$` identifies entries where there's a line of '(動カ変)'. | +| rule_adji_defi_pattern | | str | When given, if any substring of an entry's definition matches this regular expression, then the term(s) created from entry are labeled as i-adjective. Yomichan uses this information to match conjugated forms of words. `^` and `$` can be used to match start and end of lines, respectively. For example, setting this option to `r'^\(形\)$'` identify entries where there's a line of '(形)'. 
| ### Dependencies for writing diff --git a/doc/p/zim.md b/doc/p/zim.md index 2cf99c9ac..9eeef22ce 100644 --- a/doc/p/zim.md +++ b/doc/p/zim.md @@ -1,26 +1,27 @@ -## Zim (.zim, for Kiwix) +Zim (.zim, for Kiwix) +--------------------- ### General Information -| Attribute | Value | -| --------------- | ---------------------------------------------------------------------- | -| Name | Zim | -| snake_case_name | zim | -| Description | Zim (.zim, for Kiwix) | -| Extensions | `.zim` | -| Read support | Yes | -| Write support | No | -| Single-file | Yes | -| Kind | 🔢 binary | -| Sort-on-write | default_no | -| Sort key | (`headword_lower`) | -| Wiki | [ZIM (file format)]() | -| Website | [OpenZIM](https://wiki.openzim.org/wiki/OpenZIM) | +| Attribute | Value | +|-----------------|----------------------------------------------------------------------| +| Name | Zim | +| snake_case_name | zim | +| Description | Zim (.zim, for Kiwix) | +| Extensions | `.zim` | +| Read support | Yes | +| Write support | No | +| Single-file | Yes | +| Kind | 🔢 binary | +| Sort-on-write | default_no | +| Sort key | \(`headword_lower`\) | +| Wiki | [ZIM (file format)](https://en.wikipedia.org/wiki/ZIM_(file_format)) | +| Website | [OpenZIM](https://wiki.openzim.org/wiki/OpenZIM) | ### Read options | Name | Default | Type | Comment | -| ------------------- | -------- | ---- | ------------------------------------------------------------------- | +|---------------------|----------|------|---------------------------------------------------------------------| | text_unicode_errors | `strict` | str | Unicode Errors for plaintext, values: `strict`, `ignore`, `replace` | | html_unicode_errors | `strict` | str | Unicode Errors for HTML, values: `strict`, `ignore`, `replace` | @@ -37,7 +38,7 @@ pip3 install libzim>=1.0 ### Dictionary Applications/Tools | Name & Website | Source code | License | Platforms | Language | -| ----------------------------------------------------------- | ----------- | 
------- | -------------- | -------- | +|-------------------------------------------------------------|-------------|---------|----------------|----------| | [Kiwix Desktop](https://github.com/kiwix/kiwix-desktop) | ― | GPL | Linux, Windows | | | [Kiwix JS](https://github.com/kiwix/kiwix-js) | ― | GPL | Windows | | | [Kiwix Serve](https://github.com/kiwix/kiwix-tools) | ― | GPL | Linux, Windows | | diff --git a/doc/pyicu.md b/doc/pyicu.md index d3b5e2d66..3444f0e1b 100644 --- a/doc/pyicu.md +++ b/doc/pyicu.md @@ -1,21 +1,25 @@ -# [PyICU](https://pyicu.org) +[PyICU](https://pyicu.org) +========================== -## Installation on Linux +Installation on Linux +--------------------- -- Debian `sudo apt-get install python3-icu` -- Ubuntu: `sudo apt install pyicu` -- openSUSE: `sudo zypper install python3-PyICU` -- Fedora: `sudo dnf install python3-pyicu` -- Other distros: - - Install [ICU](https://icu.unicode.org/) >= 4.8 - - Run `sudo pip3 install PyICU` or `pip3 install PyICU --user` +- Debian `sudo apt-get install python3-icu` +- Ubuntu: `sudo apt install pyicu` +- openSUSE: `sudo zypper install python3-PyICU` +- Fedora: `sudo dnf install python3-pyicu` +- Other distros: + - Install [ICU](https://icu.unicode.org/) >= 4.8 + - Run `sudo pip3 install PyICU` or `pip3 install PyICU --user` -## Installation on Android with Termux +Installation on Android with Termux +----------------------------------- -- Run `pkg install libicu` -- Run `pip install PyICU` +- Run `pkg install libicu` +- Run `pip install PyICU` -## Installation on Mac OS +Installation on Mac OS +---------------------- ```sh brew install pkg-config icu4c @@ -27,22 +31,23 @@ unset CC CXX python3 -m pip install PyICU ``` -## Installation on Windows +Installation on Windows +----------------------- -- Open https://www.lfd.uci.edu/~gohlke/pythonlibs/#pyicu +- Open https://www.lfd.uci.edu/~gohlke/pythonlibs/#pyicu -- Download latest file that matches your system: +- Download latest file that matches your 
system: - - `cp39` for Python 3.9, `cp38` for Python 3.8, etc. - - `win_amd64` for Windows 64-bit, `win32` for Windows 32-bit. + - `cp39` for Python 3.9, `cp38` for Python 3.8, etc. + - `win_amd64` for Windows 64-bit, `win32` for Windows 32-bit. - For example: +For example: - - `PyICU‑2.6‑cp39‑cp39‑win_amd64.whl` for 64-bit with Python 3.9 - - `PyICU‑2.6‑cp39‑cp39‑win32.whl` for 32-bit with Python 3.9 +- `PyICU‑2.6‑cp39‑cp39‑win_amd64.whl` for 64-bit with Python 3.9 +- `PyICU‑2.6‑cp39‑cp39‑win32.whl` for 32-bit with Python 3.9 -- Open Start -> type Command -> right-click on Command Prompt -> Run as administrator +- Open Start -> type Command -> right-click on Command Prompt -> Run as administrator -- Type `pip install ` +- Type `pip install` - then drag-and-drop downloaded file into Command Prompt and press Enter. +then drag-and-drop downloaded file into Command Prompt and press Enter. diff --git a/doc/releases/3.0.0.md b/doc/releases/3.0.0.md index bbe747c1f..f0569a69c 100644 --- a/doc/releases/3.0.0.md +++ b/doc/releases/3.0.0.md @@ -1,101 +1,110 @@ -# Changes since version 2016.03.18 +Changes since version 2016.03.18 +================================ -## New versioning +New versioning +-------------- -- Using *date* as the version was a mistake I made 7 years ago -- From now on, versions are in **X.Y.Z** format (*major.minor.patch*) -- While X, Y and Z are digits(0-9) for simplicity (version strings can be compared alphabetically) -- Starting from 3.0.0 - - Take it for migrating to Python 3.x, or Gtk 3.x, or being alphabetically larger than previous versions (date string) +- Using *date* as the version was a mistake I made 7 years ago +- From now on, versions are in **X.Y.Z** format (*major.minor.patch*\) +- While X, Y and Z are digits(0-9) for simplicity (version strings can be compared alphabetically) +- Starting from 3.0.0 + - Take it for migrating to Python 3.x, or Gtk 3.x, or being alphabetically larger than previous versions (date string) Since I believe 
this is the first *standard version*, I'm not sure which code revision should I compare it with. So I just write the most important recent changes, in both application-view and library-view. -## Breaking Compatibility - -- **Config migration** - - Config file becomes a **config directory** containing config file - - Config file format changes from Python (loaded by `exec`) to **JSON** - - Remove some obsolete / unused config parameters, and rename some - - Remove permanent `sort` boolean flag - - Must give `--sort` in command line to enable sorting for most of output formats - - Load user-defined plugins from a directory named `plugins` inside config directory -- **Glossary class** - - Remove some obsolete / unused method - - `copy`, `attach`, `merge`, `deepMerge`, `takeWords`, `getInputList`, `getOutputList` - - Rename some methods: - - `reverseDic` -> `reverse` - - Make some public attributes private: - - `data` -> `_data` - - `info` -> `_info` - - `filename` -> `_filename` - - Clear (reset) the Glossary instance (data, info, etc) after `write` operation - - Glossary class is for converting from file(s) to file, not keeping data in memory - - New methods: - - `convert`: - - `convert` method is added to be used instead of `read` and then `write` - - Not just for convenience, but it's also recommended, - - and let's Glossary class to have a better default behavior - - for example it enables *direct* mode by default (stay tuned) if sorting is not enabled (by user or plugin) - - all UI modules (Command line, Gtk3, Tkinter) use Glossary.convert method now - - Sorting policy - - `sort` boolean flag is now an argument to `write` method - - sort=True if user gives `--sort` in command line - - sort=False if user gives `--no-sort` in command line - - sort=None if user does not give either, so `write` method itself decides what to do - - Now we allow plugins to specify sorting policy based on output format - - By `sortOnWrite` variable in plugin, with allowed values: - - 
`ALWAYS`: force sorting even if sort=False (user gives `--no-sort`), used only for writing StarDict - - `DEFAULT_YES`: enable sorting unless sort=False (user gives `--no-sort`) - - `DEFAULT_NO`: disable sorting unless sort=True (user gives `--sort`) - - `NEVER`: disable sorting even if sort=True (user gives `--sort`) - - The default and common value is: `sortOnWrite = DEFAULT_NO` - - Plugin can also have a global `sortKey` function to be used for sorting - - (like the `key` argument to `list.sort` method, See `pydoc list.sort`) - - New way of interacting with Glossary instance in plugins: - - `glos.data.append((word, defi))` -> `glos.addEntry(word, defi)` - - `for item in glos.data:` -> `for entry in glos:` - - `for key, value in glos.info.items():` -> `for key, value in glos.iterInfo():` - -## Gtk2 to Gtk3 - -- Replace obsolete PyGTK-based interface with a simpler PyGI-based (Gtk3) interface - -## Migrating to Python 3 - -- Even though `master` branch was based on Python 3 since 2016 Apr 29, there was some problem that are fixed in this release -- If you are still forced need to use Python 2.7, you can use branch `python2.7` - -## Introducing Direct mode - -- `--direct` command line option -- reads and writes at the same time, without loading the whole data into memory -- Partial sorting is supported - - `--sort` in command line - - `--sort-cache-size=1000` is optional -- If plugin defines sortOnWrite=ALWAYS, it means output format requires full sorting, so direct mode will be disabled -- As mentioned above (using `Glossary.convert` method), direct mode is enabled by default if sorting is not enabled (by user or plugin) -- Of course user can manually disable direct mode by giving `--indirect` option in command line - -## Progress Bar +Breaking Compatibility +---------------------- + +- **Config migration** + - Config file becomes a **config directory** containing config file + - Config file format changes from Python (loaded by `exec`) to **JSON** + - Remove some 
obsolete / unused config parameters, and rename some + - Remove permanent `sort` boolean flag + - Must give `--sort` in command line to enable sorting for most of output formats + - Load user-defined plugins from a directory named `plugins` inside config directory +- **Glossary class** + - Remove some obsolete / unused method + - `copy`, `attach`, `merge`, `deepMerge`, `takeWords`, `getInputList`, `getOutputList` + - Rename some methods: + - `reverseDic` -> `reverse` + - Make some public attributes private: + - `data` -> `_data` + - `info` -> `_info` + - `filename` -> `_filename` + - Clear (reset) the Glossary instance (data, info, etc) after `write` operation + - Glossary class is for converting from file(s) to file, not keeping data in memory + - New methods: + - `convert`: + - `convert` method is added to be used instead of `read` and then `write` + - Not just for convenience, but it's also recommended, + - and let's Glossary class to have a better default behavior + - for example it enables *direct* mode by default (stay tuned) if sorting is not enabled (by user or plugin) + - all UI modules (Command line, Gtk3, Tkinter) use Glossary.convert method now + - Sorting policy + - `sort` boolean flag is now an argument to `write` method + - sort=True if user gives `--sort` in command line + - sort=False if user gives `--no-sort` in command line + - sort=None if user does not give either, so `write` method itself decides what to do + - Now we allow plugins to specify sorting policy based on output format + - By `sortOnWrite` variable in plugin, with allowed values: + - `ALWAYS`: force sorting even if sort=False (user gives `--no-sort`), used only for writing StarDict + - `DEFAULT_YES`: enable sorting unless sort=False (user gives `--no-sort`\) + - `DEFAULT_NO`: disable sorting unless sort=True (user gives `--sort`\) + - `NEVER`: disable sorting even if sort=True (user gives `--sort`\) + - The default and common value is: `sortOnWrite = DEFAULT_NO` + - Plugin can also 
have a global `sortKey` function to be used for sorting + - (like the `key` argument to `list.sort` method, See `pydoc list.sort`\) + - New way of interacting with Glossary instance in plugins: + - `glos.data.append((word, defi))` -> `glos.addEntry(word, defi)` + - `for item in glos.data:` -> `for entry in glos:` + - `for key, value in glos.info.items():` -> `for key, value in glos.iterInfo():` + +Gtk2 to Gtk3 +------------ + +- Replace obsolete PyGTK-based interface with a simpler PyGI-based (Gtk3) interface + +Migrating to Python 3 +--------------------- + +- Even though `master` branch was based on Python 3 since 2016 Apr 29, there was some problem that are fixed in this release +- If you are still forced need to use Python 2.7, you can use branch `python2.7` + +Introducing Direct mode +----------------------- + +- `--direct` command line option +- reads and writes at the same time, without loading the whole data into memory +- Partial sorting is supported + - `--sort` in command line + - `--sort-cache-size=1000` is optional +- If plugin defines sortOnWrite=ALWAYS, it means output format requires full sorting, so direct mode will be disabled +- As mentioned above (using `Glossary.convert` method), direct mode is enabled by default if sorting is not enabled (by user or plugin) +- Of course user can manually disable direct mode by giving `--indirect` option in command line + +Progress Bar +------------ Automatic command line Progress Bar for all input / output formats is now supported -- Implemented based on plugins Reader classes -- Works both for direct mode and indirect mode - - Only one progress bar for direct mode - - Two progress bars for indirect mode (one while reading, one while writing) -- Plugins must not update the progress bar anymore -- Still no progress bar when both `--direct` and `--sort` flags are given, will be fixed later -- User can disable progress bar by giving `--no-progress-bar` option (recommended for Windows users) +- Implemented based 
on plugins Reader classes +- Works both for direct mode and indirect mode + - Only one progress bar for direct mode + - Two progress bars for indirect mode (one while reading, one while writing) +- Plugins must not update the progress bar anymore +- Still no progress bar when both `--direct` and `--sort` flags are given, will be fixed later +- User can disable progress bar by giving `--no-progress-bar` option (recommended for Windows users) -## BGL Plugin +BGL Plugin +---------- -- BGL plugin works better now (compared to the latest Python 2.7 code), and it's much cleaner too -- I totally refactored the code, made it fully Python3-compatible, and much easier to understand -- This fixes bytes/str bugs (like Bug [#54](https://github.com/ilius/pyglossary/issues/54)), and CRC check problem for some glossaries (Bug [#55](https://github.com/ilius/pyglossary/issues/55)) -- I'm a fan of micro-commits and I usually hate single-commit refactoring, but this time I had no choice! +- BGL plugin works better now (compared to the latest Python 2.7 code), and it's much cleaner too +- I totally refactored the code, made it fully Python3-compatible, and much easier to understand +- This fixes bytes/str bugs (like Bug [#54](https://github.com/ilius/pyglossary/issues/54)), and CRC check problem for some glossaries (Bug [#55](https://github.com/ilius/pyglossary/issues/55)\) +- I'm a fan of micro-commits and I usually hate single-commit refactoring, but this time I had no choice! 
-## Other Changes +Other Changes +------------- **Feature**: Add `encoding` option to read and write drivers of some plain-text formats @@ -103,13 +112,13 @@ Automatic command line Progress Bar for all input / output formats is now suppor **New format** invented and implemented for *later implementation of a Glossary Editor* -- `edlin.py` (*Editable Linked List of Entries*) is optimized for adding/modifying/removing one entry at a time -- while we can save the changes instantly after each modification -- Using the ideas of Doubly Linked List, and Git's hash-based object database +- `edlin.py` (*Editable Linked List of Entries*) is optimized for adding/modifying/removing one entry at a time +- while we can save the changes instantly after each modification +- Using the ideas of Doubly Linked List, and Git's hash-based object database Rewrite non-working **Reverse** functionality -- The old code was messy, not working by default, slow, and language-dependent -- It's much faster and cleaner now +- The old code was messy, not working by default, slow, and language-dependent +- It's much faster and cleaner now -Improve and complete command line help (`-h` or `--help`) +Improve and complete command line help (`-h` or `--help`\) diff --git a/doc/releases/3.0.1.md b/doc/releases/3.0.1.md index 7bd9f0439..f6a8c6915 100644 --- a/doc/releases/3.0.1.md +++ b/doc/releases/3.0.1.md @@ -1,5 +1,6 @@ -# Changes since [3.0.0](./3.0.0.md) +Changes since [3.0.0](./3.0.0.md) +================================= -- Fix some minor bugs in Glossary class -- Fix wrong exit status in command line from `pyglossary.pyw` -- Fix exception in BGL plugin +- Fix some minor bugs in Glossary class +- Fix wrong exit status in command line from `pyglossary.pyw` +- Fix exception in BGL plugin diff --git a/doc/releases/3.0.2.md b/doc/releases/3.0.2.md index e6c4cee15..ad6547af5 100644 --- a/doc/releases/3.0.2.md +++ b/doc/releases/3.0.2.md @@ -1,7 +1,8 @@ -# Changes since [3.0.1](./3.0.1.md) +Changes 
since [3.0.1](./3.0.1.md) +================================= -- Fix a bug in `setup.py` that prevented it from working -- Fix a bug in logger class, occurring when pyglossary is imported as a library -- Fix a few bugs in Octopus MDict reader -- Fix a minor bug in BGL reader -- Update README.md +- Fix a bug in `setup.py` that prevented it from working +- Fix a bug in logger class, occurring when pyglossary is imported as a library +- Fix a few bugs in Octopus MDict reader +- Fix a minor bug in BGL reader +- Update README.md diff --git a/doc/releases/3.0.3.md b/doc/releases/3.0.3.md index 4e0ed43ea..6fed211c6 100644 --- a/doc/releases/3.0.3.md +++ b/doc/releases/3.0.3.md @@ -1,8 +1,9 @@ -# Changes since [3.0.2](./3.0.2.md) +Changes since [3.0.2](./3.0.2.md) +================================= -- Fixes in AppleDict plugin -- Improve Tkinter interface: fix Not Responding bug, make window icon colorful -- Fix visual bug in command line Progress Bar (percentage did not become 100.0%) -- BGL reader: add support for `Python < 3.5`, with a warning to install Python 3.5 -- Fixes in Reverse feature -- Update README.md +- Fixes in AppleDict plugin +- Improve Tkinter interface: fix Not Responding bug, make window icon colorful +- Fix visual bug in command line Progress Bar (percentage did not become 100.0%) +- BGL reader: add support for `Python < 3.5`, with a warning to install Python 3.5 +- Fixes in Reverse feature +- Update README.md diff --git a/doc/releases/3.0.4.md b/doc/releases/3.0.4.md index 569240e81..af3585c79 100644 --- a/doc/releases/3.0.4.md +++ b/doc/releases/3.0.4.md @@ -1,35 +1,36 @@ -# Changes since [3.0.3](./3.0.3.md) +Changes since [3.0.3](./3.0.3.md) +================================= -## Changes in `Glossary` code base +Changes in `Glossary` code base +------------------------------- -- Fix critical bug in Glossary: `ZeroDivisionError` if `wordCount < 500`, [#61](https://github.com/ilius/pyglossary/issues/61) -- Bug fix in Glossary.progress: make sure ui.progress is not 
called with a number more than 1.0 -- Fix non-working write to SQL, [#67](https://github.com/ilius/pyglossary/issues/67) -- Bug fix & Feature: add newline argument to `Glossary.writeTxt` - Because Python's `open` converts (modifies) newlines automatically, [#66](https://github.com/ilius/pyglossary/issues/66) -- Break compatibility about using `Glossary.writeTxt` method - Replace argument `sep` which was a tuple of length two, with two mandatory arguments: `sep1` and `sep2` +- Fix critical bug in Glossary: `ZeroDivisionError` if `wordCount < 500`, [#61](https://github.com/ilius/pyglossary/issues/61) +- Bug fix in Glossary.progress: make sure ui.progress is not called with a number more than 1.0 +- Fix non-working write to SQL, [#67](https://github.com/ilius/pyglossary/issues/67) +- Bug fix & Feature: add newline argument to `Glossary.writeTxt` Because Python's `open` converts (modifies) newlines automatically, [#66](https://github.com/ilius/pyglossary/issues/66) +- Break compatibility about using `Glossary.writeTxt` method Replace argument `sep` which was a tuple of length two, with two mandatory arguments: `sep1` and `sep2` -## Changes in plugins +Changes in plugins +------------------ -- Fix in StarDict plugin: fix some Python3-related errors, [#71](https://github.com/ilius/pyglossary/issues/71) -- Fix in Dict.org plugin: `install` was not working -- Fix in DSL plugin: replace backslash at the end of line with `
`, [#61](https://github.com/ilius/pyglossary/issues/61) -- Fix in SQL plugin: specify `encoding='utf-8'` while opening file for write, [#67](https://github.com/ilius/pyglossary/issues/67) -- Fix in Octopus Mdict Source plugin: specify `encoding='utf-8'` while opening file for read, [#78](https://github.com/ilius/pyglossary/issues/78) -- Fix (probable) bugs of bad newlines in 4 plugins (use `newline` argument to `Glossary.writeTxt`), [#66](https://github.com/ilius/pyglossary/issues/66) - - Octopus MDict Source - - Babylon Source (gls) - - Lingoes Source (LDF) - - Sdictionary Source (sdct) -- Feature in Lingoes Source plugin: add `newline` write option -- Minor fix in AppleDict plugin: fix beautifulsoup4 error message, [#72](https://github.com/ilius/pyglossary/issues/72) -- BGL plugin: better compatibility with Python 3.4 - Fix `CRC check failed` error for some (rare) glossaries with Python 3.4 +- Fix in StarDict plugin: fix some Python3-related errors, [#71](https://github.com/ilius/pyglossary/issues/71) +- Fix in Dict.org plugin: `install` was not working +- Fix in DSL plugin: replace backslash at the end of line with `
`, [#61](https://github.com/ilius/pyglossary/issues/61) +- Fix in SQL plugin: specify `encoding='utf-8'` while opening file for write, [#67](https://github.com/ilius/pyglossary/issues/67) +- Fix in Octopus Mdict Source plugin: specify `encoding='utf-8'` while opening file for read, [#78](https://github.com/ilius/pyglossary/issues/78) +- Fix (probable) bugs of bad newlines in 4 plugins (use `newline` argument to `Glossary.writeTxt`), [#66](https://github.com/ilius/pyglossary/issues/66) + - Octopus MDict Source + - Babylon Source (gls) + - Lingoes Source (LDF) + - Sdictionary Source (sdct) +- Feature in Lingoes Source plugin: add `newline` write option +- Minor fix in AppleDict plugin: fix beautifulsoup4 error message, [#72](https://github.com/ilius/pyglossary/issues/72) +- BGL plugin: better compatibility with Python 3.4 Fix `CRC check failed` error for some (rare) glossaries with Python 3.4 -## Other Changes +Other Changes +------------- -- Bug fix in parsing command line read options`--read-options` and `--write-options` (happened in very rare cases) -- Fix wrong shebang line in setup.py: must run with python3, fix [#75](https://github.com/ilius/pyglossary/issues/75) -- Update `pyglossary.spec` -- Change Categories for `pyglossary.desktop` +- Bug fix in parsing command line read options`--read-options` and `--write-options` (happened in very rare cases) +- Fix wrong shebang line in setup.py: must run with python3, fix [#75](https://github.com/ilius/pyglossary/issues/75) +- Update `pyglossary.spec` +- Change Categories for `pyglossary.desktop` diff --git a/doc/releases/3.1.0.md b/doc/releases/3.1.0.md index 15dd311b3..a5883b64d 100644 --- a/doc/releases/3.1.0.md +++ b/doc/releases/3.1.0.md @@ -1,33 +1,34 @@ -# Changes since [3.0.4](./3.0.4.md) +Changes since [3.0.4](./3.0.4.md) +================================= -- Refactor StarDict plugin, and improve the performance -- Detect HTML definitions when reading, and mark them as HTML when converting to StarDict -- Fix 
[#135](https://github.com/ilius/pyglossary/issues/135) in StarDict writer: - - Alternates were pointing at a wrong word in case there are resource/image files -- Refactor AppleDict plugin -- Refactor and improve BGL plugin -- Style fixes including pep-8 fixes -- Change indentations to tabs, and single quote to double quotes -- Allow `--ui=none` flag -- Allow `--skip-resources` flag -- SQL plugin: add `encoding` write option -- Octopus MDict Source plugin: add `encoding` read option -- Drop sqlite3 support, xFarDic support, and read support for Omnidic -- Improvement and cleaning in the code base and different plugins -- Introduce DataEntry - - Allowing to access resource files when iterating over entries (words) of Glossary -- Glossary: `write` and `convert` methods return absolute path of output file, or None -- Changes in master branch since [3.0.4](./3.0.4.md): - - Update README.md - - Update pyglossary.spec - - Fixes in setup.py - - BGL: add `gzip_no_crc.py` for Python 36 (required for some non-standard BGL files) - - AppleDict: give `encoding='utf8'` while opening xml file, fix for [#84](https://github.com/ilius/pyglossary/issues/84) - - Avoid lines that require trailing backslash, to avoid bugs like [#67](https://github.com/ilius/pyglossary/issues/67) - - babylon_source.py: remove extra %s, fix [#92](https://github.com/ilius/pyglossary/issues/92) - - AppleDict: force encoding="utf-8" for plist file, fix [#94](https://github.com/ilius/pyglossary/issues/94) - - Fix str/bytes bug in stardict.py (fix [#98](https://github.com/ilius/pyglossary/issues/98)) and some renames for clarification - - Fix [#102](https://github.com/ilius/pyglossary/issues/102): exception in dict_org.py - - Fix wrong path of static files when running from dist-packages - - readmdict.py: change by Xiaoqiang Wang: no encryption if Encrypted is not in header - - Fix [#118](https://github.com/ilius/pyglossary/issues/118), SyntaxError (`return` with argument inside generator) in Glossary.reverse 
with Python 3.6 +- Refactor StarDict plugin, and improve the performance +- Detect HTML definitions when reading, and mark them as HTML when converting to StarDict +- Fix [#135](https://github.com/ilius/pyglossary/issues/135) in StarDict writer: + - Alternates were pointing at a wrong word in case there are resource/image files +- Refactor AppleDict plugin +- Refactor and improve BGL plugin +- Style fixes including pep-8 fixes +- Change indentations to tabs, and single quote to double quotes +- Allow `--ui=none` flag +- Allow `--skip-resources` flag +- SQL plugin: add `encoding` write option +- Octopus MDict Source plugin: add `encoding` read option +- Drop sqlite3 support, xFarDic support, and read support for Omnidic +- Improvement and cleaning in the code base and different plugins +- Introduce DataEntry + - Allowing to access resource files when iterating over entries (words) of Glossary +- Glossary: `write` and `convert` methods return absolute path of output file, or None +- Changes in master branch since [3.0.4](./3.0.4.md): + - Update README.md + - Update pyglossary.spec + - Fixes in setup.py + - BGL: add `gzip_no_crc.py` for Python 36 (required for some non-standard BGL files) + - AppleDict: give `encoding='utf8'` while opening xml file, fix for [#84](https://github.com/ilius/pyglossary/issues/84) + - Avoid lines that require trailing backslash, to avoid bugs like [#67](https://github.com/ilius/pyglossary/issues/67) + - babylon_source.py: remove extra %s, fix [#92](https://github.com/ilius/pyglossary/issues/92) + - AppleDict: force encoding="utf-8" for plist file, fix [#94](https://github.com/ilius/pyglossary/issues/94) + - Fix str/bytes bug in stardict.py (fix [#98](https://github.com/ilius/pyglossary/issues/98)) and some renames for clarification + - Fix [#102](https://github.com/ilius/pyglossary/issues/102): exception in dict_org.py + - Fix wrong path of static files when running from dist-packages + - readmdict.py: change by Xiaoqiang Wang: no 
encryption if Encrypted is not in header + - Fix [#118](https://github.com/ilius/pyglossary/issues/118), SyntaxError (`return` with argument inside generator) in Glossary.reverse with Python 3.6 diff --git a/doc/releases/3.2.0.md b/doc/releases/3.2.0.md index 24a047783..3546590dc 100644 --- a/doc/releases/3.2.0.md +++ b/doc/releases/3.2.0.md @@ -1,20 +1,21 @@ -## Changes since [3.1.0](./3.1.0.md) +Changes since [3.1.0](./3.1.0.md) +--------------------------------- -- Add read support for CC-CEDICT plugin +- Add read support for CC-CEDICT plugin - - Pull request [#140](https://github.com/ilius/pyglossary/pull/140), with some fixes and improvements by me + - Pull request [#140](https://github.com/ilius/pyglossary/pull/140), with some fixes and improvements by me -- Fixes in DSL (ABBYY Lingvo) plugin: +- Fixes in DSL (ABBYY Lingvo) plugin: - - Fix [#136](https://github.com/ilius/pyglossary/issues/136), removing one extra character after `#CONTENTS_LANGUAGE:` - - Fix [#137](https://github.com/ilius/pyglossary/issues/137), regexp for re_lang_open + - Fix [#136](https://github.com/ilius/pyglossary/issues/136), removing one extra character after `#CONTENTS_LANGUAGE:` + - Fix [#137](https://github.com/ilius/pyglossary/issues/137), regexp for re_lang_open -- Improvement in Gtk interface: +- Improvement in Gtk interface: - - Avoid changing Format combobox based on file extension if a format is already selected, [#141](https://github.com/ilius/pyglossary/issues/141) + - Avoid changing Format combobox based on file extension if a format is already selected, [#141](https://github.com/ilius/pyglossary/issues/141) -- Fix encoding problem with non-UTF-8 system locales +- Fix encoding problem with non-UTF-8 system locales - - Fix [#147](https://github.com/ilius/pyglossary/issues/147), give encoding="utf-8" when opening text files, for non-UTF-8 system locales + - Fix [#147](https://github.com/ilius/pyglossary/issues/147), give encoding="utf-8" when opening text files, for 
non-UTF-8 system locales -- Improvements in `Glossary` class +- Improvements in `Glossary` class diff --git a/doc/releases/3.2.1.md b/doc/releases/3.2.1.md index ee4e3ccd8..ab96eaa57 100644 --- a/doc/releases/3.2.1.md +++ b/doc/releases/3.2.1.md @@ -1,16 +1,17 @@ -# Changes since [3.2.0](./3.2.0.md) +Changes since [3.2.0](./3.2.0.md) +================================= -- Changes in StarDict plugin: - - Add sametypesequence write option (PR [#162](https://github.com/ilius/pyglossary/pull/162)) - - Fix some bugs - - Cleaning -- Disable gzip CRC check for BGL files with Python 3.7 -- Fix a bug in octopus_mdict.py -- Fix Gtk warnings in ui_gtk -- Allow seeing/customizing warnings by setting environment variable WARNINGS -- Fix not being able to run the program when installed inside virtualenv ([#168](https://github.com/ilius/pyglossary/issues/168)) -- Show a tip about -h when no UI were found, [#169](https://github.com/ilius/pyglossary/issues/169) -- octopus_mdict_source.py: fix [#68](https://github.com/ilius/pyglossary/issues/68), add support for inconsecutive links with --read-options=links=True -- Auto-detect UTF-16 encoding of DSL files -- Update README.md (fix Archlinux pkg name, add AUR, add instructions for installing python-lzo on Windows, etc) -- Some clean up +- Changes in StarDict plugin: + - Add sametypesequence write option (PR [#162](https://github.com/ilius/pyglossary/pull/162)\) + - Fix some bugs + - Cleaning +- Disable gzip CRC check for BGL files with Python 3.7 +- Fix a bug in octopus_mdict.py +- Fix Gtk warnings in ui_gtk +- Allow seeing/customizing warnings by setting environment variable WARNINGS +- Fix not being able to run the program when installed inside virtualenv ([#168](https://github.com/ilius/pyglossary/issues/168)\) +- Show a tip about -h when no UI were found, [#169](https://github.com/ilius/pyglossary/issues/169) +- octopus_mdict_source.py: fix [#68](https://github.com/ilius/pyglossary/issues/68), add support for inconsecutive links 
with --read-options=links=True +- Auto-detect UTF-16 encoding of DSL files +- Update README.md (fix Archlinux pkg name, add AUR, add instructions for installing python-lzo on Windows, etc) +- Some clean up diff --git a/doc/releases/3.3.0.md b/doc/releases/3.3.0.md index 880c0daae..ead61ff63 100644 --- a/doc/releases/3.3.0.md +++ b/doc/releases/3.3.0.md @@ -1,111 +1,112 @@ -# Changes since [3.2.1](./3.2.1.md) +Changes since [3.2.1](./3.2.1.md) +================================= -- Require Python 3.6 or higher (mainly because of f-strings) +- Require Python 3.6 or higher (mainly because of f-strings) -- New format support +- New format support - - Add support to write Kobo dictionary, [#205](https://github.com/ilius/pyglossary/issues/205) - - Add support to write EPUB-2 - - Add support to read AppleDict Binary (.dictionary) - - Add support to read and write Aard 2 (slob), [#116](https://github.com/ilius/pyglossary/issues/116) + - Add support to write Kobo dictionary, [#205](https://github.com/ilius/pyglossary/issues/205) + - Add support to write EPUB-2 + - Add support to read AppleDict Binary (.dictionary) + - Add support to read and write Aard 2 (slob), [#116](https://github.com/ilius/pyglossary/issues/116) -- Glossary: detect and load Writer class from plugins +- Glossary: detect and load Writer class from plugins - - Remove write function from plugin if it has Writer class + - Remove write function from plugin if it has Writer class -- Glossary: call `gc.collect()` on indirect mode after reading/writing each 128 entries +- Glossary: call `gc.collect()` on indirect mode after reading/writing each 128 entries - - To free up memory and avoid running out of RAM for large glossaries + - To free up memory and avoid running out of RAM for large glossaries -- Glossary: remove empty and duplicate alternate words when converting, using Entry Filter, [#188](https://github.com/ilius/pyglossary/issues/188) +- Glossary: remove empty and duplicate alternate words when 
converting, using Entry Filter, [#188](https://github.com/ilius/pyglossary/issues/188) -- Add command line options to remove html tags: +- Add command line options to remove html tags: - - `--remove-html=tag1,tag2,tag3` - - `--remove-html-all` + - `--remove-html=tag1,tag2,tag3` + - `--remove-html-all` -- Re-design format-specific options +- Re-design format-specific options - - Allow specifying format-specific read/write options in ui_gtk and ui_tk - - Add much better and cleaner codebase for handling options in `option.py` - - Implement validation of options in command line, GTK and Tkinter interfaces - - Add tests for `option.py` in `option_test.py` - - Avoid using None as default value of option argument - - Check default value of plugin options and show warning if invalid - - Add IntOption class, use it in Omnidic plugin - - Add DictOption, use it for appledict defaultPrefs - - And `optionsProp` to all plugins - - Containing value type, allowed values and optional comment - - Remove `readOptions` and `writeOptions` from all plugins - - Detect options from functions' signature and `optionsProp` variables - - Avoid using `**kwargs` in plugin `read`, `Reader.open` or `write` functions + - Allow specifying format-specific read/write options in ui_gtk and ui_tk + - Add much better and cleaner codebase for handling options in `option.py` + - Implement validation of options in command line, GTK and Tkinter interfaces + - Add tests for `option.py` in `option_test.py` + - Avoid using None as default value of option argument + - Check default value of plugin options and show warning if invalid + - Add IntOption class, use it in Omnidic plugin + - Add DictOption, use it for appledict defaultPrefs + - And `optionsProp` to all plugins + - Containing value type, allowed values and optional comment + - Remove `readOptions` and `writeOptions` from all plugins + - Detect options from functions' signature and `optionsProp` variables + - Avoid using `**kwargs` in plugin `read`, 
`Reader.open` or `write` functions -- Add `depends` variable to plugins +- Add `depends` variable to plugins - - To let GUI install plugin dependencies - - Type: `dict`, keys are module names, values are pip's package name - - Add `Glossary.formatsDepends` + - To let GUI install plugin dependencies + - Type: `dict`, keys are module names, values are pip's package name + - Add `Glossary.formatsDepends` -- Minor fixes and improvements in Glossary class: +- Minor fixes and improvements in Glossary class: - - Return with error if output file path is an existing directory - - Fix empty zip when creating `DIRECTORY.zip` as output glossary - - Do not uncompress gz/bz2/zip input files automatically - - Ignore "read" function of plugin if "Reader" class is present - - Cleaning: Add Glossary.init() classmethod to initialize the class, can be called multiple times - - Some refactoring and cleaning, and add some logs - - Small optimization: `index % 100` -> `index & 0x7f` - - Allow having progressbar by position in file and size of file - - use for `appledict_bin.py` - - Do not write resource file names as entries to text file in `Glossary.writeTxt` + - Return with error if output file path is an existing directory + - Fix empty zip when creating `DIRECTORY.zip` as output glossary + - Do not uncompress gz/bz2/zip input files automatically + - Ignore "read" function of plugin if "Reader" class is present + - Cleaning: Add Glossary.init() classmethod to initialize the class, can be called multiple times + - Some refactoring and cleaning, and add some logs + - Small optimization: `index % 100` -> `index & 0x7f` + - Allow having progressbar by position in file and size of file + - use for `appledict_bin.py` + - Do not write resource file names as entries to text file in `Glossary.writeTxt` -- StarDict plugin +- StarDict plugin - - Always open `.ifo` file as UTF-8 - - Fix output filenames without .ifo extension creating hidden files, 
[#187](https://github.com/ilius/pyglossary/issues/187) + - Always open `.ifo` file as UTF-8 + - Fix output filenames without .ifo extension creating hidden files, [#187](https://github.com/ilius/pyglossary/issues/187) -- Babylon BGL plugin +- Babylon BGL plugin - - Fix bytes metadata values `b'...'` and some refactoring in readType3 - - Skip empty info values - - Fix non-string info values written as empty - - Prefix 3 info keys with `bgl_` - - Fix NameError in debug mode in `stripHtmlTags` - - Some refactoring + - Fix bytes metadata values `b'...'` and some refactoring in readType3 + - Skip empty info values + - Fix non-string info values written as empty + - Prefix 3 info keys with `bgl_` + - Fix NameError in debug mode in `stripHtmlTags` + - Some refactoring -- Octopus MDict plugin +- Octopus MDict plugin - - Fix Python 3 bug in `readmdict.py`: https://bitbucket.org/xwang/mdict-analysis/commits/8f66c30 - - Support multiple mdd files ([#203](https://github.com/ilius/pyglossary/issues/203)) + - Fix Python 3 bug in `readmdict.py`: https://bitbucket.org/xwang/mdict-analysis/commits/8f66c30 + - Support multiple mdd files ([#203](https://github.com/ilius/pyglossary/issues/203)\) -- Change yes/no options in AppleDict and ABBYY Lingvo DSL plugins to boolean +- Change yes/no options in AppleDict and ABBYY Lingvo DSL plugins to boolean - - To keep compatibility of command line flags, fix yes/no manually in ui_cmd.py + - To keep compatibility of command line flags, fix yes/no manually in ui_cmd.py -- AppleDict plugin: +- AppleDict plugin: - - Fix `echo` problem in `Makefile` ([#177](https://github.com/ilius/pyglossary/issues/177)) - - Add dark mode support for AppleDict output ([#177](https://github.com/ilius/pyglossary/issues/177)) - - Add comments for `optionsProp` - - Use keyword argument `features=` and fix a warning about from_encoding= + - Fix `echo` problem in `Makefile` ([#177](https://github.com/ilius/pyglossary/issues/177)\) + - Add dark mode support for 
AppleDict output ([#177](https://github.com/ilius/pyglossary/issues/177)\) + - Add comments for `optionsProp` + - Use keyword argument `features=` and fix a warning about from_encoding= -- Fix misspelled "extension" (as "extension") in plugins +- Fix misspelled "extension" (as "extension") in plugins -- Detect entries with `span` tag as html, [#193](https://github.com/ilius/pyglossary/issues/193) +- Detect entries with `span` tag as html, [#193](https://github.com/ilius/pyglossary/issues/193) -- Refactoring in ui_gtk and ui_tk +- Refactoring in ui_gtk and ui_tk -- Fix some deprecated API in ui_gtk +- Fix some deprecated API in ui_gtk -- Fix minor bugs and improvements in ui_tk and ui_gtk +- Fix minor bugs and improvements in ui_tk and ui_gtk -- Update setup.py to adapt packaging with wheel, [#189](https://github.com/ilius/pyglossary/issues/189) +- Update setup.py to adapt packaging with wheel, [#189](https://github.com/ilius/pyglossary/issues/189) -- Add type hints to codebase and plugins +- Add type hints to codebase and plugins -- Refactoring and style changes: +- Refactoring and style changes: - - rename `pyglossary.pyw` to main.py, add a small `pyglossary.pyw` for compatibility - - Switch to f-strings in glossary.py and freedict.py - - main.py: replace single quotes with double quotes - - PEP-8 style fixes + - rename `pyglossary.pyw` to main.py, add a small `pyglossary.pyw` for compatibility + - Switch to f-strings in glossary.py and freedict.py + - main.py: replace single quotes with double quotes + - PEP-8 style fixes diff --git a/doc/releases/4.0.0.md b/doc/releases/4.0.0.md index 8004cb042..382eddb50 100644 --- a/doc/releases/4.0.0.md +++ b/doc/releases/4.0.0.md @@ -1,287 +1,291 @@ -# Changes since [3.3.0](./3.3.0.md) +Changes since [3.3.0](./3.3.0.md) +================================= -- Require Python 3.7 or 3.8, drop support for Python 3.4, 3.5 and 3.6 +- Require Python 3.7 or 3.8, drop support for Python 3.4, 3.5 and 3.6 -- Fix / rewrite `setup.py` +- 
Fix / rewrite `setup.py` - - Fix `python3 setup.py sdist bdist_wheel`, and pypi package - - Had to move `ui/` directory into `pyglossary/` - - Switch from `distutils` to `setuptools` - - Remove `py2exe` + - Fix `python3 setup.py sdist bdist_wheel`, and pypi package + - Had to move `ui/` directory into `pyglossary/` + - Switch from `distutils` to `setuptools` + - Remove `py2exe` -- Add interactive command line user interface +- Add interactive command line user interface - - Automatically selected if input & output file arguments are not passed **and** one of these: - - On Linux and `$DISPLAY` is not set - - On Mac and no `tkinter` module is found - - `--ui=cmd` flag is passed + - Automatically selected if input & output file arguments are not passed **and** one of these: + - On Linux and `$DISPLAY` is not set + - On Mac and no `tkinter` module is found + - `--ui=cmd` flag is passed -- New format support: +- New format support: - - Add read support for FreeDict, [#206](https://github.com/ilius/pyglossary/issues/206) - - Add read support for Zim (Kiwix) - - Add read and write support for Kobo E-Reader Dictfile (.df) - - Add write support for DICT.org `dictfmt` source file - - Add read support for [dictunformat](https://linux.die.net/man/1/dictunformat) output file - - Add write support for JSON - - Add read support for Dict.cc (SQLite3) - - Add read support for [JMDict](https://www.edrdg.org/jmdict/j_jmdict.html), [#239](https://github.com/ilius/pyglossary/issues/239) - - Add basic read support for Wiktionary Dump (.xml) - - Add read support for [cc-kedict](https://github.com/mhagiwara/cc-kedict) - - Add read support for [DigitalNK](https://github.com/digitalprk/dicrs) (SQLite3) - - Add read support for [Wordset.org](https://github.com/wordset/wordset-dictionary) JSON directory + - Add read support for FreeDict, [#206](https://github.com/ilius/pyglossary/issues/206) + - Add read support for Zim (Kiwix) + - Add read and write support for Kobo E-Reader Dictfile 
(.df) + - Add write support for DICT.org `dictfmt` source file + - Add read support for [dictunformat](https://linux.die.net/man/1/dictunformat) output file + - Add write support for JSON + - Add read support for Dict.cc (SQLite3) + - Add read support for [JMDict](https://www.edrdg.org/jmdict/j_jmdict.html), [#239](https://github.com/ilius/pyglossary/issues/239) + - Add basic read support for Wiktionary Dump (.xml) + - Add read support for [cc-kedict](https://github.com/mhagiwara/cc-kedict) + - Add read support for [DigitalNK](https://github.com/digitalprk/dicrs) (SQLite3) + - Add read support for [Wordset.org](https://github.com/wordset/wordset-dictionary) JSON directory -- Remove Omnidic write support (Unmaintained J2ME dictionary) +- Remove Omnidic write support (Unmaintained J2ME dictionary) -- Remove Octopus MDict Source plugin +- Remove Octopus MDict Source plugin -- Remove Babylon Source plugin +- Remove Babylon Source plugin -- BGL Reader: improvements +- BGL Reader: improvements -- DictionaryForMIDs Writer: fix non-working code +- DictionaryForMIDs Writer: fix non-working code -- Gettext Source (po) Writer: fix info header +- Gettext Source (po) Writer: fix info header -- MOBI E-Book Writer: fix sort order, fix and test kindlegen codes, add `kindlegen_path` option, [#112](https://github.com/ilius/pyglossary/issues/112) +- MOBI E-Book Writer: fix sort order, fix and test kindlegen codes, add `kindlegen_path` option, [#112](https://github.com/ilius/pyglossary/issues/112) -- EPUB-2 E-Book Writer: fix sort order +- EPUB-2 E-Book Writer: fix sort order -- XDXF Reader: rewrite with `etree.iterparse` to avoid using too much RAM +- XDXF Reader: rewrite with `etree.iterparse` to avoid using too much RAM -- Lingoes Source (LDF) Reader: fix ignoring info/metadata header +- Lingoes Source (LDF) Reader: fix ignoring info/metadata header -- dict_org.py: rewrite broken plugin (Reader and Writer) +- dict_org.py: rewrite broken plugin (Reader and Writer) -- DSL Reader: fix
losing metadata/info +- DSL Reader: fix losing metadata/info -- Aard 2 (slob) Reader: +- Aard 2 (slob) Reader: - - Fix adding css/js files as normal entries - - Add `bword://` prefix to entry links - - Fix duplicate entries issue by keeping a set of blob IDs, [#224](https://github.com/ilius/pyglossary/issues/224) - - Detect and pass defiFormat + - Fix adding css/js files as normal entries + - Add `bword://` prefix to entry links + - Fix duplicate entries issue by keeping a set of blob IDs, [#224](https://github.com/ilius/pyglossary/issues/224) + - Detect and pass defiFormat -- Aard 2 (slob) Writer: +- Aard 2 (slob) Writer: - - Fix content_type detection - - Remove `bword://` prefix from entry links - - Add resource files / data entries, [#243](https://github.com/ilius/pyglossary/issues/243) - - Fix replacing image paths - - Show log events from `slob.py` in debug mode - - Change default `compression` to `zlib` - - Allow passing empty `compression` + - Fix content_type detection + - Remove `bword://` prefix from entry links + - Add resource files / data entries, [#243](https://github.com/ilius/pyglossary/issues/243) + - Fix replacing image paths + - Show log events from `slob.py` in debug mode + - Change default `compression` to `zlib` + - Allow passing empty `compression` -- Octopus MDict Reader: +- Octopus MDict Reader: - - Read MDX file twice to load links - - Count data entries as part of `len(reader)` for progressbar + - Read MDX file twice to load links + - Count data entries as part of `len(reader)` for progressbar -- StarDict Writer: +- StarDict Writer: - - Copy "copyright" and "publisher" values to "description" - - Add source and target language codes to the end of bookname - - Add write-option `stardict_client: bool` - Set `True` to make glossary more compatible with StarDict 3.x - - Fix broken result when `sametypesequence` option is given and a definitions contains `|` - - Allow `sametypesequence=x` for xdxf - - Add `merge_syns` option - - Allow 
`sametypesequence=None` option + - Copy "copyright" and "publisher" values to "description" + - Add source and target language codes to the end of bookname + - Add write-option `stardict_client: bool` Set `True` to make glossary more compatible with StarDict 3.x + - Fix broken result when `sametypesequence` option is given and a definitions contains `|` + - Allow `sametypesequence=x` for xdxf + - Add `merge_syns` option + - Allow `sametypesequence=None` option -- XDXF Reader: +- XDXF Reader: - - Fix/improve xdxf to html transformation + - Fix/improve xdxf to html transformation -- Kobo Writer: +- Kobo Writer: - - Fix get_prefix algorithm and sorting order, with tests, [#219](https://github.com/ilius/pyglossary/issues/219) - - Replace ` "Generator[None, BaseEntry, None]"` - - - Entries must be fetched with `entry = yield` in a `while True` loop: - - ```python - while True: - entry = yield - if entry is None: - break - # process and write entry into file(s) - ``` + - Replace `entry://` with `bword://` in MDX Reader instead of AppleDict Writer + - Fix internal `href="x:"` and `href="d:"` links + - Fix `file://` in images path, fix [#243](https://github.com/ilius/pyglossary/issues/243) - - `finish(self)` - - Read options and write options must be set to their default values as class attributes - - See `pyglossary/plugins/csv_pyg.py` plugin for example - - `sortKey` must be an instance method of Writer, instead of a function outside any class - - Only for plugins that need sorting before write +- User Interface improvements and fixes: -- Refactor and cleanup `Glossary` class + - ui_gtk: add About tab and more improvements + - ui_tk: replace About dialog with About tab and more improvements + - ui_cmd: improvements in progressbar + - ui_cmd: allow "=" in value of read/write options - - Removed or replaced most of class/static attributes of `Glossary` - - To see the diff, run `git diff [3.3.0](./3.3.0.md)..master -- pyglossary/glossary.py` - - Removed `glos.addEntry` 
method - - If you use it in your program, replace with `glos.addEntryObj(glos.newEntry(word, defi, defiFormat))` - - Removed instance methods: - - `getMostUsedDefiFormats` - - `iterEntryBuckets` - - `zipOutDir` and `archiveOutDir` - - Moved to `pyglossary/glossary_utils.py` - - `archiveOutDir` renamed to `compressOutDir` - - `writeDict` - - `iterSqlLines` -> moved to `pyglossary/plugins/sql.py` - - `reverse`, `takeOutputWords`, `searchWordInDef` -> moved to `pyglossary/reverse.py` - - Values of `Glossary.plugins` is changed to `plugin_prop.PluginProp` instances - - Change `glos.writeTxt` arguments - - Replace `sep1` and `sep2` with `entryFmt` - - Replace `rplList` with `defiEscapeFunc`, `wordEscapeFunc` and `tail` - - Remove `iterEntries`, `entryFilterFunc` - - Method returns `Generator[None, BaseEntry, None]` instead of `bool` - - See for usage example: - - `pyglossary/glossary.py` -> `def writeTabfile` - - `pyglossary/plugins/dict_org_source.py` - - `pyglossary/plugins/json_plugin.py` - - `pyglossary/plugins/lingoes_ldf.py` - - `pyglossary/plugins/sdict_source.py` +- Add a list of 208 languages and ~40 writing systems -- Refactor, cleanup and fixes in `Entry` and `DataEntry` classes + - Detect `sourceLang` and `targetLang` from glossary name/title + - Auto-select between `` and `` tags depending on writing system + - Using `glos.titleElement` method, used in FreeDict, JMDict and Dict.cc writers + - `glos.sourceLang` and `glos.targetLang` properties (with setters) as `Lang` objects + - `glos.sourceLangName` and `glos.targetLangName` properties (with setters) as `str` + - Used in several plugins - - Replace `entry.getWord()` with `entry.word` - - Replace `entry.getWords()` with `entry.l_word` - - Replace `entry.getDefi()` with `entry.defi` - - Remove `entry.getDefis()` - - Drop handling alternate definitions in `Entry` objects - - Replace `entry.getDefiFormat()` with `entry.defiFormat` - - Add `entry.b_word` and `entry.b_defi` shortcuts that give `bytes` (UTF-8) - 
- Replace `dataEntry.getData()` with `dataEntry.data` - - Add `__slots__` to Entry and DataEntry classes - - Fix `DataEntry` in indirect mode - - Mistaken for Entry with defi=DATA, and file content discarded - - Save resource files in user's cache directory when loading input glossary into memory - - Move file to output glossary on `dataEntry.save(...)` - - Fix `Entry.getRawEntrySortKey` not being alternates-aware, broke StarDict Writer - - `DataEntry`: save: use `shutil.copy` if has `_tmpPath`, and set `_tmpPath` +- Break compatibility of plugins -- New features of `Entry` - - - `entry.stripFullHtml()`, remove `......` - - Used in Kobo and Kobo Dictfile writers - - Add tests - -- Fix `glos.writeTabfile`: - - - Remove `\r` from definitions and info values - - Fix not escaping word - -- Fix/improve html detection in definitions - -- Switch to lazy imports of non-standard modules in plugins - -- Optimize RAM usage of indirect conversion - - - To write StarDict, EPUB and DictionaryForMIDs glossaries, we need to load all entries into RAM to sort them - -- Other new features of Glossary class - - - `glos.getAuthor()` to get "author", or "publisher" (as fallback) - - `glos.removeHtmlTagsAll()` method, can be called by plugins' writer - - `glos.collectDefiFormat(maxCount)` extract defiFormat counts - - by reading first `maxCount` entries. 
(then iterator will be reset) - - Used in StarDict Writer - - Show memory usage in trace mode + - Drop support for read and write functions (outside a class) + - Now we only support Reader class and Writer class + - Reader class must have these methods + - `__init__(self, glos)` + - `open(self, filename)` + - Here glossary info must be read from file and set with `glos.setInfo` + - `__len__(self) -> int` + - Should return the number or entries, or zero if it's too costly + - `__iter__(self) -> "Iterator[BaseEntry]"` + - Can be a generator + - `close(self)` + - Writer class must have these methods + - `__init__(self, glos)` -- Bug fixes and improvements in code base + - `open(self, filename)` - - Apply entry filter when iterating over reader, fix [#251](https://github.com/ilius/pyglossary/issues/251) + - Here glossary info must be read from `glos.getInfo` or `glos.iterInfo` and written to file - - Fixes wrong sort order for some glossaries (converting to StarDict or other formats that need sort) + - `write(self) -> "Generator[None, BaseEntry, None]"` - - Fixes and improvements in `TextGlossaryReader` class + - Entries must be fetched with `entry = yield` in a `while True` loop: - - Fix ignoring glossary defaultDefiFormat + ```python + while True: + entry = yield + if entry is None: + break + # process and write entry into file(s) + ``` - - Fix evaluating `None` value in read/write options + - `finish(self)` -- Support reading multi-file Tabfile or other text formats - - - Example: `file.txt`, `file.txt.1`, `file.txt.2` - - Need to add `file_count` info key, for example: `##file_count 3` + - Read options and write options must be set to their default values as class attributes -- Fixes in Tabfile Writer + - See `pyglossary/plugins/csv_pyg.py` plugin for example - - Fix not escaping "" + - `sortKey` must be an instance method of Writer, instead of a function outside any class -- Add/update documentation + - Only for plugins that need sorting before write - - Update 
README.md - - Add Termux guides in `doc/termux.md` - - Move AppleDict guides to `doc/apple.md` - - Move LZO notes to `doc/lzo.md` - - Minify and compress `.svg` files in `doc/` folder +- Refactor and cleanup `Glossary` class -- Switch to f-strings, pep8 fixes, add types, style changes and refactoring + - Removed or replaced most of class/static attributes of `Glossary` + - To see the diff, run `git diff [3.3.0](./3.3.0.md)..master -- pyglossary/glossary.py` + - Removed `glos.addEntry` method + - If you use it in your program, replace with `glos.addEntryObj(glos.newEntry(word, defi, defiFormat))` + - Removed instance methods: + - `getMostUsedDefiFormats` + - `iterEntryBuckets` + - `zipOutDir` and `archiveOutDir` + - Moved to `pyglossary/glossary_utils.py` + - `archiveOutDir` renamed to `compressOutDir` + - `writeDict` + - `iterSqlLines` -> moved to `pyglossary/plugins/sql.py` + - `reverse`, `takeOutputWords`, `searchWordInDef` -> moved to `pyglossary/reverse.py` + - Values of `Glossary.plugins` is changed to `plugin_prop.PluginProp` instances + - Change `glos.writeTxt` arguments + - Replace `sep1` and `sep2` with `entryFmt` + - Replace `rplList` with `defiEscapeFunc`, `wordEscapeFunc` and `tail` + - Remove `iterEntries`, `entryFilterFunc` + - Method returns `Generator[None, BaseEntry, None]` instead of `bool` + - See for usage example: + - `pyglossary/glossary.py` -> `def writeTabfile` + - `pyglossary/plugins/dict_org_source.py` + - `pyglossary/plugins/json_plugin.py` + - `pyglossary/plugins/lingoes_ldf.py` + - `pyglossary/plugins/sdict_source.py` -- New command line flags: +- Refactor, cleanup and fixes in `Entry` and `DataEntry` classes - - `--log-time` to show datetime in logs (override `log_time` in config.json) - - `--no-alts` to disable alternates handling - - `--normalize-html` to lowercase tags (for now) - - `--cleanup` and `--no-cleanup` - - `--info` to save `.info` file alongside output file + - Replace `entry.getWord()` with `entry.word` + - Replace 
`entry.getWords()` with `entry.l_word` + - Replace `entry.getDefi()` with `entry.defi` + - Remove `entry.getDefis()` + - Drop handling alternate definitions in `Entry` objects + - Replace `entry.getDefiFormat()` with `entry.defiFormat` + - Add `entry.b_word` and `entry.b_defi` shortcuts that give `bytes` (UTF-8) + - Replace `dataEntry.getData()` with `dataEntry.data` + - Add `__slots__` to Entry and DataEntry classes + - Fix `DataEntry` in indirect mode + - Mistaken for Entry with defi=DATA, and file content discarded + - Save resource files in user's cache directory when loading input glossary into memory + - Move file to output glossary on `dataEntry.save(...)` + - Fix `Entry.getRawEntrySortKey` not being alternates-aware, broke StarDict Writer + - `DataEntry`: save: use `shutil.copy` if has `_tmpPath`, and set `_tmpPath` + +- New features of `Entry` + + - `entry.stripFullHtml()`, remove `......` + - Used in Kobo and Kobo Dictfile writers + - Add tests + +- Fix `glos.writeTabfile`: + + - Remove `\r` from definitions and info values + - Fix not escaping word + +- Fix/improve html detection in definitions + +- Switch to lazy imports of non-standard modules in plugins + +- Optimize RAM usage of indirect conversion + + - To write StarDict, EPUB and DictionaryForMIDs glossaries, we need to load all entries into RAM to sort them + +- Other new features of Glossary class + + - `glos.getAuthor()` to get "author", or "publisher" (as fallback) + - `glos.removeHtmlTagsAll()` method, can be called by plugins' writer + - `glos.collectDefiFormat(maxCount)` extract defiFormat counts + - by reading first `maxCount` entries. 
(then iterator will be reset) + - Used in StarDict Writer + - Show memory usage in trace mode + +- Bug fixes and improvements in code base + + - Apply entry filter when iterating over reader, fix [#251](https://github.com/ilius/pyglossary/issues/251) + + - Fixes wrong sort order for some glossaries (converting to StarDict or other formats that need sort) + + - Fixes and improvements in `TextGlossaryReader` class + + - Fix ignoring glossary defaultDefiFormat + + - Fix evaluating `None` value in read/write options + +- Support reading multi-file Tabfile or other text formats + + - Example: `file.txt`, `file.txt.1`, `file.txt.2` + - Need to add `file_count` info key, for example: `##file_count 3` + +- Fixes in Tabfile Writer + + - Fix not escaping "" + +- Add/update documentation + + - Update README.md + - Add Termux guides in `doc/termux.md` + - Move AppleDict guides to `doc/apple.md` + - Move LZO notes to `doc/lzo.md` + - Minify and compress `.svg` files in `doc/` folder + +- Switch to f-strings, pep8 fixes, add types, style changes and refactoring + +- New command line flags: + + - `--log-time` to show datetime in logs (override `log_time` in config.json) + - `--no-alts` to disable alternates handling + - `--normalize-html` to lowercase tags (for now) + - `--cleanup` and `--no-cleanup` + - `--info` to save `.info` file alongside output file diff --git a/doc/releases/4.1.0.md b/doc/releases/4.1.0.md index ba60f6216..996c0df44 100644 --- a/doc/releases/4.1.0.md +++ b/doc/releases/4.1.0.md @@ -1,302 +1,313 @@ -# Changes since [4.0.0](./4.0.0.md) +Changes since [4.0.0](./4.0.0.md) +================================= -There are a lot of changes since last release, but here is what I could gather and organize! -Please see the commit list for more! +There are a lot of changes since last release, but here is what I could gather and organize! Please see the commit list for more! 
-- Improvements in ui_gtk +- Improvements in ui_gtk -- Improvements in ui_tk +- Improvements in ui_tk -- Improvements in ui_cmd_interactive +- Improvements in ui_cmd_interactive -- Refactoring and improvements in ui-related codebase +- Refactoring and improvements in ui-related codebase -- Fix not loading config with `--ui=none` +- Fix not loading config with `--ui=none` -- Code style fixes and cleanup +- Code style fixes and cleanup -- Documentation +- Documentation - - Update most documentations. - - Add comments for read/write options. - - Generate documentation for all formats - - Placed in [doc/p](../p), linked to in `README.md` - - Generating with `scripts/plugin-doc-gen.py` script - - Read list of dictionary tools/applications from TOML files in [plugins-meta/tools](../../plugins-meta/tools) + - Update most documentations. + - Add comments for read/write options. + - Generate documentation for all formats + - Placed in [doc/p](../p), linked to in `README.md` + - Generating with `scripts/plugin-doc-gen.py` script + - Read list of dictionary tools/applications from TOML files in [plugins-meta/tools](../../plugins-meta/tools) -- Add `Dockerfile` and `run-with-docker.sh` script +- Add `Dockerfile` and `run-with-docker.sh` script -- New command-line flags: +- New command-line flags: - - `--json-read-options` and `--json-write-options` - - To allow using `;` in option values - - Example: `'--json-write-options={"delimiter": ";"}'` - - `--gtk`, `--tk` and `--cmd` as shortcut for `--ui=gtk` etc - - `--rtl` to change direction of definitions, [#268](https://github.com/ilius/pyglossary/issues/268), also added to `config.json` + - `--json-read-options` and `--json-write-options` + - To allow using `;` in option values + - Example: `'--json-write-options={"delimiter": ";"}'` + - `--gtk`, `--tk` and `--cmd` as shortcut for `--ui=gtk` etc + - `--rtl` to change direction of definitions, [#268](https://github.com/ilius/pyglossary/issues/268), also added to `config.json` -- 
Fix non-working `--remove-html` flag +- Fix non-working `--remove-html` flag -- Changes in `Glossary` class +- Changes in `Glossary` class - - Rename `glos.getPref` to `glos.getConfig` - - Change `formatsReadOptions` and `formatsWriteOptions` to `Dict[str, OrderedDict[str, Any]]` - - to include default values - - remove `glos.writeTabfile`, replace with a func in `pyglossary/text_writer.py` - - `Glossary.init`: avoid showing error if user plugin directory does not exist + - Rename `glos.getPref` to `glos.getConfig` + - Change `formatsReadOptions` and `formatsWriteOptions` to `Dict[str, OrderedDict[str, Any]]` + - to include default values + - remove `glos.writeTabfile`, replace with a func in `pyglossary/text_writer.py` + - `Glossary.init`: avoid showing error if user plugin directory does not exist -- Fixes and improvements code base +- Fixes and improvements code base - - Prevent `dataEntry.save()` from raising exception because of invalid filename or permission - - Avoid exception if removing temp file/folder failed - - Avoid `mktemp` and more improvements - - use `~/.cache/pyglossary/` directory instead of `/tmp/` - - Fixes and improvements in `runDictzip` - - Raise `RuntimeError` instead of `StopIteration` when iterating over a non-open reader - - Avoid exception if no zip command was found, fix [#294](https://github.com/ilius/pyglossary/issues/294) - - Remove directory after creating .zip, and some refactoring, [#294](https://github.com/ilius/pyglossary/issues/294) - - `DataEntry`: replace `inTmp` argument with `tmpPath` argument - - `Entry`: fix html pattern for hyperlinks, [#330](https://github.com/ilius/pyglossary/issues/330) - - Fix incorrect virtual env directory detection - - Refactor `dataDir` detection, [#307](https://github.com/ilius/pyglossary/issues/307) [#316](https://github.com/ilius/pyglossary/issues/316) - - Show warning if failed to create user plugins directory - - fix possible exception in `log.emit` - - Add support for Conda in `dataDir` 
detection, [#321](https://github.com/ilius/pyglossary/issues/321) - - Fix f-string in `StdLogHandler.emit` + - Prevent `dataEntry.save()` from raising exception because of invalid filename or permission + - Avoid exception if removing temp file/folder failed + - Avoid `mktemp` and more improvements + - use `~/.cache/pyglossary/` directory instead of `/tmp/` + - Fixes and improvements in `runDictzip` + - Raise `RuntimeError` instead of `StopIteration` when iterating over a non-open reader + - Avoid exception if no zip command was found, fix [#294](https://github.com/ilius/pyglossary/issues/294) + - Remove directory after creating .zip, and some refactoring, [#294](https://github.com/ilius/pyglossary/issues/294) + - `DataEntry`: replace `inTmp` argument with `tmpPath` argument + - `Entry`: fix html pattern for hyperlinks, [#330](https://github.com/ilius/pyglossary/issues/330) + - Fix incorrect virtual env directory detection + - Refactor `dataDir` detection, [#307](https://github.com/ilius/pyglossary/issues/307) [#316](https://github.com/ilius/pyglossary/issues/316) + - Show warning if failed to create user plugins directory + - fix possible exception in `log.emit` + - Add support for Conda in `dataDir` detection, [#321](https://github.com/ilius/pyglossary/issues/321) + - Fix f-string in `StdLogHandler.emit` -- Fixes and improvements in Windows +- Fixes and improvements in Windows - - Fix bad `dataDir` on Windows, [#307](https://github.com/ilius/pyglossary/issues/307) - - Fix `shutil.rmtree` exception on Windows - - Support creating .zip on Windows 10, [#294](https://github.com/ilius/pyglossary/issues/294) - - Check zip command before tar on Windows, [#294](https://github.com/ilius/pyglossary/issues/294) - - Show graphical error on exceptions on Windows - - Fix dataDir detection on Windows, [#323](https://github.com/ilius/pyglossary/issues/323) $324 + - Fix bad `dataDir` on Windows, [#307](https://github.com/ilius/pyglossary/issues/307) + - Fix `shutil.rmtree` 
exception on Windows + - Support creating .zip on Windows 10, [#294](https://github.com/ilius/pyglossary/issues/294) + - Check zip command before tar on Windows, [#294](https://github.com/ilius/pyglossary/issues/294) + - Show graphical error on exceptions on Windows + - Fix dataDir detection on Windows, [#323](https://github.com/ilius/pyglossary/issues/323) $324 -- Changes in Config: +- Changes in Config: - - Rename config key `skipResources` to `skip_resources` - - Add it to config.json and configDefDict - - Rename config key `utf8Check` to `utf8_check` - - User should edit ~/.pyglossary/config.json manually + - Rename config key `skipResources` to `skip_resources` + - Add it to config.json and configDefDict + - Rename config key `utf8Check` to `utf8_check` + - User should edit ~/.pyglossary/config.json manually -- Implement direct compression and uncompression, and some refactoring +- Implement direct compression and uncompression, and some refactoring - - change glos.detectInputFormat to return (filename, format, compression) or None - - remove Glossary.formatsReadFileObj and Glossary.formatsWriteFileObj - - remove `fileObj=` argument from `glos.writeTxt` - - use optional 'compressions' list/tuple from Writer or Reader classes for direct compression/uncompression - - refactoring in glossary_utils.py + - change glos.detectInputFormat to return (filename, format, compression) or None + - remove Glossary.formatsReadFileObj and Glossary.formatsWriteFileObj + - remove `fileObj=` argument from `glos.writeTxt` + - use optional 'compressions' list/tuple from Writer or Reader classes for direct compression/uncompression + - refactoring in glossary_utils.py -- Update `setup.py` +- Update `setup.py` -- Show version from 'git describe --always' on `--version` +- Show version from 'git describe --always' on `--version` -- `FileSize` option (used in many formats): +- `FileSize` option (used in many formats): - - Switch to metric (powers of 1000) for `K`, `M`, `G` units - - 
Add `KiB`, `MiB`, `GiB` for powers of 1024 + - Switch to metric (powers of 1000) for `K`, `M`, `G` units + - Add `KiB`, `MiB`, `GiB` for powers of 1024 -- Add `extensionCreate` variable (str) to plugins and plugin API +- Add `extensionCreate` variable (str) to plugins and plugin API - - Use it to improve ui_tk + - Use it to improve ui_tk -- Text-based glossary code-base (effecting Tabfile, Kobo Dictfile, LDF) +- Text-based glossary code-base (effecting Tabfile, Kobo Dictfile, LDF) - - Optimize TextGlossaryReader - - Change multi-file text glossary file names from `.N.txt` to `.txt.N` (where `N>=1`) - - Enable reading pyglossary-written multi-file text glossary by adding `file_count=-1` to metadata - - because the number of files is not known when creating the first txt file + - Optimize TextGlossaryReader + - Change multi-file text glossary file names from `.N.txt` to `.txt.N` (where `N>=1`\) + - Enable reading pyglossary-written multi-file text glossary by adding `file_count=-1` to metadata + - because the number of files is not known when creating the first txt file -- Tabfile +- Tabfile - - Rename option `writeInfo` to `enable_info` - - Reader: read resource files from `*.txt_res` directory if exists - - Add `*.txt_res` directory to \*.zip file + - Rename option `writeInfo` to `enable_info` + - Reader: read resource files from `*.txt_res` directory if exists + - Add `*.txt_res` directory to \*.zip file -- Zim Reader: +- Zim Reader: - - Migrate to libzim 1.0 - - Add mimetype `image/webp`, fix [#329](https://github.com/ilius/pyglossary/issues/329) + - Migrate to libzim 1.0 + - Add mimetype `image/webp`, fix [#329](https://github.com/ilius/pyglossary/issues/329) -- Slob and Tabfile Writer: add `file_size_approx` option to allow writing multi-part output +- Slob and Tabfile Writer: add `file_size_approx` option to allow writing multi-part output - - support values like: `5500k`, `100m`, `1.2g` + - support values like: `5500k`, `100m`, `1.2g` -- Add 
`word_title=False` option to some writers +- Add `word_title=False` option to some writers - - Slob Writer: add `word_title=False` option - - Tabfile Writer: add `word_title=False` option - - CSV Writer: add `word_title=False` option - - JSON Writer: add `word_title=False` option - - Dict.cc Reader: do not add word title - - FreeDict Reader: rename `keywords_header` option to `word_title` - - Add `glos.wordTitleStr`, used in plugins with `word_title` option - - Add `definition_has_headwords=True` info key to avoid adding the title next time we read the glossary + - Slob Writer: add `word_title=False` option + - Tabfile Writer: add `word_title=False` option + - CSV Writer: add `word_title=False` option + - JSON Writer: add `word_title=False` option + - Dict.cc Reader: do not add word title + - FreeDict Reader: rename `keywords_header` option to `word_title` + - Add `glos.wordTitleStr`, used in plugins with `word_title` option + - Add `definition_has_headwords=True` info key to avoid adding the title next time we read the glossary -- Aard2 (slob) +- Aard2 (slob) - - Writer: add option `separate_alternates=False`, [#270](https://github.com/ilius/pyglossary/issues/270) - - Writer: fix handling `content_type` option - - Writer: use `~/.cache/pyglossary/` instead of `/tmp` - - Writer: add mp3 to mime types, [#289](https://github.com/ilius/pyglossary/issues/289) - - Writer: add support for .ini data file, [#289](https://github.com/ilius/pyglossary/issues/289) - - Writer: support .webp files, [#329](https://github.com/ilius/pyglossary/issues/329) - - Writer: support .tiff and .tif files - - Reader: read glossary name/title and creation time from tags - - Reader: extract all metadata / tags - - `slob.py` library: Refactoring and cleanup + - Writer: add option `separate_alternates=False`, [#270](https://github.com/ilius/pyglossary/issues/270) + - Writer: fix handling `content_type` option + - Writer: use `~/.cache/pyglossary/` instead of `/tmp` + - Writer: add mp3 to mime
types, [#289](https://github.com/ilius/pyglossary/issues/289) + - Writer: add support for .ini data file, [#289](https://github.com/ilius/pyglossary/issues/289) + - Writer: support .webp files, [#329](https://github.com/ilius/pyglossary/issues/329) + - Writer: support .tiff and .tif files + - Reader: read glossary name/title and creation time from tags + - Reader: extract all metadata / tags + - `slob.py` library: Refactoring and cleanup -- StarDict: +- StarDict: - - Reader: add option unicode_errors for invalid UTF-8 data, [#309](https://github.com/ilius/pyglossary/issues/309) - - Writer: add bool write-option `audio_goldendict`, [#327](https://github.com/ilius/pyglossary/issues/327) - - Writer: add option `audio_icon=True`, and add option comment, [#327](https://github.com/ilius/pyglossary/issues/327) + - Reader: add option unicode_errors for invalid UTF-8 data, [#309](https://github.com/ilius/pyglossary/issues/309) + - Writer: add bool write-option `audio_goldendict`, [#327](https://github.com/ilius/pyglossary/issues/327) + - Writer: add option `audio_icon=True`, and add option comment, [#327](https://github.com/ilius/pyglossary/issues/327) -- FreeDict Reader +- FreeDict Reader - - Fix two slashes before and after `pron` - - Avoid running `unescape_unicode` by `encoding="utf-8"` arg to `ET.htmlfile` - - Fix exception if `edition` is missing in header, and few other fixes - - Support `` with `` inside it - - Support `` inside nested second-level(nested) `` - - Add `"lang"` attribute to html elements - - Add option "example_padding" - - Fix rendering ``, refactoring and improvement - - Handle `` inside `` - - Support `` in `` - - Mark external refs with `` - - Support comment in `` - - Support `` inside `` - - Implement many tags under `` - - Improvements and refactoring + - Fix two slashes before and after `pron` + - Avoid running `unescape_unicode` by `encoding="utf-8"` arg to `ET.htmlfile` + - Fix exception if `edition` is missing in header, and few other fixes
+ - Support `` with `` inside it + - Support `` inside nested second-level(nested) `` + - Add `"lang"` attribute to html elements + - Add option "example_padding" + - Fix rendering ``, refactoring and improvement + - Handle `` inside `` + - Support `` in `` + - Mark external refs with `` + - Support comment in `` + - Support `` inside `` + - Implement many tags under `` + - Improvements and refactoring -- XDXF +- XDXF - - Fix not finding `xdxf.xsl` in installed mode + - Fix not finding `xdxf.xsl` in installed mode - - Effecting XDXF and StarDict formats + - Effecting XDXF and StarDict formats - - `xdxf.xsl`: generate `` instead of `` + - `xdxf.xsl`: generate `` instead of `` - - StarDict Reader: Add `xdxf_to_html=True` option, [#258](https://github.com/ilius/pyglossary/issues/258) + - StarDict Reader: Add `xdxf_to_html=True` option, [#258](https://github.com/ilius/pyglossary/issues/258) - - StarDict Reader: Import `xdxf_transform` lazily + - StarDict Reader: Import `xdxf_transform` lazily - - Remove forced dependency to `lxml`, [#261](https://github.com/ilius/pyglossary/issues/261) + - Remove forced dependency to `lxml`, [#261](https://github.com/ilius/pyglossary/issues/261) - - XDXF plugin: fix glos.setDefaultDefiFormat call + - XDXF plugin: fix glos.setDefaultDefiFormat call - * `xdxf_transform.py`: remove warnings for , [#322](https://github.com/ilius/pyglossary/issues/322) + - `xdxf_transform.py`: remove warnings for , [#322](https://github.com/ilius/pyglossary/issues/322) - - Merge PR [#317](https://github.com/ilius/pull/issues/317) - - Parse `sr`, `gr`, `ex_orig`, `ex_transl` tags and `audio` - - Remove `None` attribute from `audio` tag - - Use unicode symbols for audio and external link - - Use another speaker symbol for audio - - Add audio controls - - Use plain link without an audio tag + - Merge PR [#317](https://github.com/ilius/pull/issues/317) -- Mobi + - Parse `sr`, `gr`, `ex_orig`, `ex_transl` tags and `audio` - - Update ebook_mobi.py and README.md, 
[#299](https://github.com/ilius/pyglossary/issues/299) - - Add PR [#335](https://github.com/ilius/pyglossary/pull/335) with some modifications + - Remove `None` attribute from `audio` tag -- Changes in `ebook_base.py` (Mobi and EPUB) + - Use unicode symbols for audio and external link - - Avoid exception if removing tmpDir failed - - Use `style.css` dataEntry, [#299](https://github.com/ilius/pyglossary/issues/299) + - Use another speaker symbol for audio -- DSL Reader: + - Add audio controls - - Strip whitespaces around language names, [#264](https://github.com/ilius/pyglossary/issues/264) - - Add progressbar support, [#264](https://github.com/ilius/pyglossary/issues/264) - - Run `html.escape` on text before adding html tags, [#265](https://github.com/ilius/pyglossary/issues/265) - - Strip and unquote glossary name - - Generate `` and `` instead of `` - - Avoid adding html comment - - Remove `\ufeff` from header lines, [#306](https://github.com/ilius/pyglossary/issues/306) + - Use plain link without an audio tag -- AppleDict Source +- Mobi - - Change path of Dictionary Development Kit, [#300](https://github.com/ilius/pyglossary/issues/300) - - Open all text files with `encoding="utf-8"` - - Some refactporing + - Update ebook_mobi.py and README.md, [#299](https://github.com/ilius/pyglossary/issues/299) + - Add PR [#335](https://github.com/ilius/pyglossary/pull/335) with some modifications - * Rename 4 options: - - cleanHTML -> clean_html - - defaultPrefs -> default_prefs - - prefsHTML -> prefs_html - - frontBackMatter -> front_back_matter +- Changes in `ebook_base.py` (Mobi and EPUB) -- AppleDict Binary + - Avoid exception if removing tmpDir failed + - Use `style.css` dataEntry, [#299](https://github.com/ilius/pyglossary/issues/299) - - Improvements, [#299](https://github.com/ilius/pyglossary/issues/299) - - Read `DefaultStyle.css` file, add as `style.css`, [#299](https://github.com/ilius/pyglossary/issues/299) - - Change default value of option: `html=True` +- DSL 
Reader: -- Octopus MDict (MDX) + - Strip whitespaces around language names, [#264](https://github.com/ilius/pyglossary/issues/264) + - Add progressbar support, [#264](https://github.com/ilius/pyglossary/issues/264) + - Run `html.escape` on text before adding html tags, [#265](https://github.com/ilius/pyglossary/issues/265) + - Strip and unquote glossary name + - Generate `` and `` instead of `` + - Avoid adding html comment + - Remove `\ufeff` from header lines, [#306](https://github.com/ilius/pyglossary/issues/306) - - Fix image links - - Do not set empty title - - Minor improvement in `readmdict.py` - - Handle exception when reading from a corrupt MDD file - - Add bool flag same_dir_data_files, [#289](https://github.com/ilius/pyglossary/issues/289) - - Add read-option: `audio=True` (default: `False`), [#327](https://github.com/ilius/pyglossary/issues/327) - - `audio`: remove extra attrs and add comments +- AppleDict Source -- DICT.org plugin: + - Change path of Dictionary Development Kit, [#300](https://github.com/ilius/pyglossary/issues/300) + - Open all text files with `encoding="utf-8"` + - Some refactporing - - `installToDictd`: skip if target directory does not exist - - Make rendering dictd files a bit clear in pure txt - - Fix indentation issue and add bword prefix as url + - Rename 4 options: -- Fixes and improvements in Dict.cc (SQLite3) plugin: + - cleanHTML -> clean_html - - Fix typo, and avoid iterating over cur, use `fetchall()`, [#296](https://github.com/ilius/pyglossary/issues/296) - - Remove gender from headword, add it to definition, [#296](https://github.com/ilius/pyglossary/issues/296) - - Avoid running `unescape_unicode` + - defaultPrefs -> default_prefs -- JMDict + - prefsHTML -> prefs_html - - Support reading compressed file directly - - Show pos before gloss (translations) - - Avoid running `unescape_unicode` + - frontBackMatter -> front_back_matter -- DigitalNK: work around Python's sqlite bug, 
[#282](https://github.com/ilius/pyglossary/issues/282) +- AppleDict Binary -- Changes in `dict_org.py` plugin, By Justin Yang + - Improvements, [#299](https://github.com/ilius/pyglossary/issues/299) + - Read `DefaultStyle.css` file, add as `style.css`, [#299](https://github.com/ilius/pyglossary/issues/299) + - Change default value of option: `html=True` - - Use
to replace newline - - Replace words with {} around to true web link +- Octopus MDict (MDX) -- CC-CEDICT Reader: + - Fix image links + - Do not set empty title + - Minor improvement in `readmdict.py` + - Handle exception when reading from a corrupt MDD file + - Add bool flag same_dir_data_files, [#289](https://github.com/ilius/pyglossary/issues/289) + - Add read-option: `audio=True` (default: `False`), [#327](https://github.com/ilius/pyglossary/issues/327) + - `audio`: remove extra attrs and add comments - - Fix import error in `conv.py` - - Switch from jinja2 to lxml - - Fix not escaping `<`, `>` and `&` - - Note: lxml inserts `&[#160](https://github.com/ilius/pyglossary/issues/160);` instead of ` ` - - Use `` instead of `` - - add option to use Traditional Chinese for entry name +- DICT.org plugin: - * Avoid colorizing if tones count does not match `len(syllables)`, [#328](https://github.com/ilius/pyglossary/issues/328) - * Add `` for each syllable in case of mismatch tones, [#328](https://github.com/ilius/pyglossary/issues/328) + - `installToDictd`: skip if target directory does not exist + - Make rendering dictd files a bit clear in pure txt + - Fix indentation issue and add bword prefix as url -- Rename read/write options: +- Fixes and improvements in Dict.cc (SQLite3) plugin: - - DSL: rename option onlyFixMarkUp to only_fix_markup - - SQL: rename 2 options: - - `infoKeys` -> `info_keys` - - `addExtraInfo` -> `add_extra_info` - - EDLIN: rename option `havePrevLink` to `prev_link` - - CSV: rename option `writeInfo` to `enable_info` - - JSON: rename option `writeInfo` to `enable_info` - - BGL: rename all read/write options (to cameCase to snake_case) + - Fix typo, and avoid iterating over cur, use `fetchall()`, [#296](https://github.com/ilius/pyglossary/issues/296) + - Remove gender from headword, add it to definition, [#296](https://github.com/ilius/pyglossary/issues/296) + - Avoid running `unescape_unicode` -- New formats: +- JMDict - - Read "ABC Medical Notes 
(SQLite3)", `plugins/abc_medical_notes.py`, [#267](https://github.com/ilius/pyglossary/issues/267) - - Read "Almaany.com (SQLite3)", `plugins/almaany.py`, [#267](https://github.com/ilius/pyglossary/issues/267) [#268](https://github.com/ilius/pyglossary/issues/268) + - Support reading compressed file directly + - Show pos before gloss (translations) + - Avoid running `unescape_unicode` -- Remove TreeDict plugin, `plugins/treedict.py` +- DigitalNK: work around Python's sqlite bug, [#282](https://github.com/ilius/pyglossary/issues/282) -- Remove FreeDict writer +- Changes in `dict_org.py` plugin, By Justin Yang + + - Use
to replace newline + - Replace words with {} around to true web link + +- CC-CEDICT Reader: + + - Fix import error in `conv.py` + - Switch from jinja2 to lxml + - Fix not escaping `<`, `>` and `&` + - Note: lxml inserts `&[#160](https://github.com/ilius/pyglossary/issues/160);` instead of ` ` + - Use `` instead of `` + - add option to use Traditional Chinese for entry name + + - Avoid colorizing if tones count does not match `len(syllables)`, [#328](https://github.com/ilius/pyglossary/issues/328) + + - Add `` for each syllable in case of mismatch tones, [#328](https://github.com/ilius/pyglossary/issues/328) + +- Rename read/write options: + + - DSL: rename option onlyFixMarkUp to only_fix_markup + - SQL: rename 2 options: + - `infoKeys` -> `info_keys` + - `addExtraInfo` -> `add_extra_info` + - EDLIN: rename option `havePrevLink` to `prev_link` + - CSV: rename option `writeInfo` to `enable_info` + - JSON: rename option `writeInfo` to `enable_info` + - BGL: rename all read/write options (to cameCase to snake_case) + +- New formats: + + - Read "ABC Medical Notes (SQLite3)", `plugins/abc_medical_notes.py`, [#267](https://github.com/ilius/pyglossary/issues/267) + - Read "Almaany.com (SQLite3)", `plugins/almaany.py`, [#267](https://github.com/ilius/pyglossary/issues/267) [#268](https://github.com/ilius/pyglossary/issues/268) + +- Remove TreeDict plugin, `plugins/treedict.py` + +- Remove FreeDict writer diff --git a/doc/releases/4.2.0.md b/doc/releases/4.2.0.md index a00bb0bf4..d176b6b80 100644 --- a/doc/releases/4.2.0.md +++ b/doc/releases/4.2.0.md @@ -1,41 +1,45 @@ -# Changes since [4.1.0](./4.1.0.md) +Changes since [4.1.0](./4.1.0.md) +================================= -- Breaking changes: +- Breaking changes: - - Replace `glos.getAuthor()` with `glos.author` - - This looks for "author" and then "publisher" keys in info/metadata - - Rename option `apply_css` to `css` for mobi and epub2 - - `glos.getInfo` and `glos.setInfo` only accept `str` as key (or a subclass of 
`str`) + - Replace `glos.getAuthor()` with `glos.author` + - This looks for "author" and then "publisher" keys in info/metadata + - Rename option `apply_css` to `css` for mobi and epub2 + - `glos.getInfo` and `glos.setInfo` only accept `str` as key (or a subclass of `str`\) -- Bug fixes: +- Bug fixes: - - Indirect mode: Fix handling '|' character in words. + - Indirect mode: Fix handling '|' character in words. - - Escape/unescape `|` in words when converting `entry` \<-> `rawEntry` + - Escape/unescape `|` in words when converting `entry` \<-> `rawEntry` - - Escape/unescape `|` in words when writing/reading text-based file formats + - Escape/unescape `|` in words when writing/reading text-based file formats - - JSON: Prevent duplicate keys in json output, [#344](https://github.com/ilius/pyglossary/issues/344) + - JSON: Prevent duplicate keys in json output, [#344](https://github.com/ilius/pyglossary/issues/344) - - Add new method `glos.preventDuplicateWords()` + - Add new method `glos.preventDuplicateWords()` -- Features and improvements +- Features and improvements - - Add SQLite mode with `--sqlite` flag for converting to StarDict. + - Add SQLite mode with `--sqlite` flag for converting to StarDict. - - Eliminates the need to load all entries into RAM, limiting RAM usage. - - You can add `--sqlite` to you command, even for running GUI. - - For example: `python3 main.py --tk --sqlite` - - See [README.md](../../README.md#sqlite-mode) for more details. + - Eliminates the need to load all entries into RAM, limiting RAM usage. - - Add `--source-lang` and `--target-lang` flags + - You can add `--sqlite` to you command, even for running GUI. - - XDXF: support more tags and improvements + - For example: `python3 main.py --tk --sqlite` - - Add unit tests for `Glossary` class, and some functions in `text_utils.py` + - See [README.md](../../README.md#sqlite-mode) for more details. 
- - Windows: change cache directory to `%LOCALAPPDATA%` + - Add `--source-lang` and `--target-lang` flags - - Some refactoring and optimization + - XDXF: support more tags and improvements - - Update, improve and re-format documentations + - Add unit tests for `Glossary` class, and some functions in `text_utils.py` + + - Windows: change cache directory to `%LOCALAPPDATA%` + + - Some refactoring and optimization + + - Update, improve and re-format documentations diff --git a/doc/releases/4.2.1.md b/doc/releases/4.2.1.md index a013d6db3..8c7867914 100644 --- a/doc/releases/4.2.1.md +++ b/doc/releases/4.2.1.md @@ -1,30 +1,31 @@ -# Changes since version [4.2.0](./4.2.0.md) +Changes since version [4.2.0](./4.2.0.md) +========================================= ### Minor bug fixes and improvements: -- `text_utils.py` +- `text_utils.py` - - Minor bug: fix legacy function `urlToPath` using `urllib.parse.unquote` - - Minor bug: `replacePostSpaceChar`: remove trailing space from the output str - - Cleanup: - - Remove unused function `isControlChar` - - Remove unused function `formatByteStr` - - Remove argument `exclude` from function `isASCII` - - Add unit tests + - Minor bug: fix legacy function `urlToPath` using `urllib.parse.unquote` + - Minor bug: `replacePostSpaceChar`: remove trailing space from the output str + - Cleanup: + - Remove unused function `isControlChar` + - Remove unused function `formatByteStr` + - Remove argument `exclude` from function `isASCII` + - Add unit tests -- `ui_cmd_interactive.py`: fix a minor bug and some small refactoring +- `ui_cmd_interactive.py`: fix a minor bug and some small refactoring -- Command line: Override input glossary info with `--source-lang` and `--target-lang` flags +- Command line: Override input glossary info with `--source-lang` and `--target-lang` flags -- Add unit tests for CSV -> Tabfile conversion +- Add unit tests for CSV -> Tabfile conversion -- CSV plugin: some refactoring, and rename the module to `csv_plugin.py` +- 
CSV plugin: some refactoring, and rename the module to `csv_plugin.py` -- Update `setup.py`: add `python_requires=">=3.7.0"`, update `extras_require` +- Update `setup.py`: add `python_requires=">=3.7.0"`, update `extras_require` -- Update README.md +- Update README.md ### Fearures: -- Command line: Add `--name` flag for changing glossary name -- `Glossary`: `convert`: add `infoOverride` optional argument +- Command line: Add `--name` flag for changing glossary name +- `Glossary`: `convert`: add `infoOverride` optional argument diff --git a/doc/releases/4.3.0.md b/doc/releases/4.3.0.md index 25ba152a2..acd663d47 100644 --- a/doc/releases/4.3.0.md +++ b/doc/releases/4.3.0.md @@ -1,52 +1,58 @@ -# Changes since [4.2.1](./4.2.1.md) +Changes since [4.2.1](./4.2.1.md) +================================= -## Bug fixes +Bug fixes +--------- -- Tabfile writer: fix replacing `\` with `\\` -- `--remove-html` flag: fix bad regex -- ui_cmd_interactive: fix a few bugs -- Lowercase word/entry links (`
`entry`. -This fixes some very edge cases involving `|` in words, but uses more RAM in indirect mode (converting to StarDict), which can be solved with `--sqlite`. +Change `rawEntry[0]` from `bytes` to `List[str]` and avoid split/join when converting `rawEntry` \<-> `entry`. This fixes some very edge cases involving `|` in words, but uses more RAM in indirect mode (converting to StarDict), which can be solved with `--sqlite`. -## Documentation +Documentation +------------- -- Replace `doc/config.md` with [doc/config.rst](../config.rst), update comments and other improvements -- Generate [doc/entry-filters.md](../entry-filters.md) -- Update plugins doc -- Update README.md +- Replace `doc/config.md` with [doc/config.rst](../config.rst), update comments and other improvements +- Generate [doc/entry-filters.md](../entry-filters.md) +- Update plugins doc +- Update README.md -## Unit testing +Unit testing +------------ **Coverage of `glossary.py`: %75** @@ -54,46 +60,48 @@ There are 2501 lines of test code in [tests](../../tests) directory. Tests for Glossary class include: -- Basic functionality -- Error handling -- Sorting and direct / indirect / SQLite modes -- Entry filter config/flags (`lower`, `rtl`, `remove_html`, `remove_html_all`) -- Resources / data entries -- Convert: Tabfile \<-> Aard2 slob -- Convert: Tabfile \<-> CSV -- Convert: Tabfile -> EPUB-2 -- Convert: Tabfile -> JSON -- Convert: Tabfile \<-> StarDict +- Basic functionality +- Error handling +- Sorting and direct / indirect / SQLite modes +- Entry filter config/flags (`lower`, `rtl`, `remove_html`, `remove_html_all`\) +- Resources / data entries +- Convert: Tabfile \<-> Aard2 slob +- Convert: Tabfile \<-> CSV +- Convert: Tabfile -> EPUB-2 +- Convert: Tabfile -> JSON +- Convert: Tabfile \<-> StarDict Other improvements: -- `glossary_test.py`: check CRC32 of downloaded test files -- `glossary_test.py`: use a new temp dir for each test method for isolation. 
-- `ebook_kobo_test.py`: split into several test methods - -## Improvements - -- Zim: make improvements, [#352](https://github.com/ilius/pyglossary/issues/352) -- Aard2 slob: add 2 mime types, [#352](https://github.com/ilius/pyglossary/issues/352) -- ui/main.py: do not allow --remove-html and --remove-html-all together -- Glossary: do not allow `glos.config` to be set twice -- Glossary: change some error logs to critical, and more improvements -- Prevent conflicting config flags together, like `--lower --no-lower` -- Disable `utf8_check` config parameter by default (not needed since `3.0.0`) - -## Refactoring and cleanup - -- Glossary: some refactoring in convert method -- Rename 3 scripts in `scripts/` directory -- Remove `DataEntry.fromFile` and improve behavior of `DataEntry.__init__` -- Refactoring in ui/ -- rename `option.cmdFlag` to `option.customFlag` -- Glossary: add `glos.rawEntryCompress` property, and use in `entry.py` -- Glossary: minor improvement in loadPlugins -- XDXF: remove useless argument in `Reader.open` -- remove unused some functions from `text_utils.py` -- `plugin_prop.py`: refactor getExtraOptions -- Avoid assigning protected attrs in `text_writer.py` and `plugins/tabfile.py` -- Fewer protected attr access in `entry_filters.py` -- Move `sortKey` and `get_prefix` implementations from `ebook_base.py` to epub and mobi plugins -- Change name of 2 entry filters to match the config param +- `glossary_test.py`: check CRC32 of downloaded test files +- `glossary_test.py`: use a new temp dir for each test method for isolation. 
+- `ebook_kobo_test.py`: split into several test methods + +Improvements +------------ + +- Zim: make improvements, [#352](https://github.com/ilius/pyglossary/issues/352) +- Aard2 slob: add 2 mime types, [#352](https://github.com/ilius/pyglossary/issues/352) +- ui/main.py: do not allow --remove-html and --remove-html-all together +- Glossary: do not allow `glos.config` to be set twice +- Glossary: change some error logs to critical, and more improvements +- Prevent conflicting config flags together, like `--lower --no-lower` +- Disable `utf8_check` config parameter by default (not needed since `3.0.0`\) + +Refactoring and cleanup +----------------------- + +- Glossary: some refactoring in convert method +- Rename 3 scripts in `scripts/` directory +- Remove `DataEntry.fromFile` and improve behavior of `DataEntry.__init__` +- Refactoring in ui/ +- rename `option.cmdFlag` to `option.customFlag` +- Glossary: add `glos.rawEntryCompress` property, and use in `entry.py` +- Glossary: minor improvement in loadPlugins +- XDXF: remove useless argument in `Reader.open` +- remove unused some functions from `text_utils.py` +- `plugin_prop.py`: refactor getExtraOptions +- Avoid assigning protected attrs in `text_writer.py` and `plugins/tabfile.py` +- Fewer protected attr access in `entry_filters.py` +- Move `sortKey` and `get_prefix` implementations from `ebook_base.py` to epub and mobi plugins +- Change name of 2 entry filters to match the config param diff --git a/doc/releases/4.4.0.md b/doc/releases/4.4.0.md index 8403ea164..8eb03b40e 100644 --- a/doc/releases/4.4.0.md +++ b/doc/releases/4.4.0.md @@ -1,85 +1,96 @@ -# Changes since [4.3.0](./4.3.0.md) +Changes since [4.3.0](./4.3.0.md) +================================= -## Breaking changes +Breaking changes +---------------- -- Remove partial sorting support (obsolete feature) +- Remove partial sorting support (obsolete feature) - - Remove `--sort-cache-size` flag in command line - - (For library users) Remove `sortCacheSize` 
argument to `glos.write` and `glos.convert` + - Remove `--sort-cache-size` flag in command line + - (For library users) Remove `sortCacheSize` argument to `glos.write` and `glos.convert` -- Re-design sorting and `sortKey` parameters +- Re-design sorting and `sortKey` parameters - - Breaking change for library users, and user plugins that need sorting (`sortOnWrite = ALWAYS`) + - Breaking change for library users, and user plugins that need sorting (`sortOnWrite = ALWAYS`\) - - Change `glos.convert` + - Change `glos.convert` - - Replace argument `sortKey` (Callable) with `sortKeyName` (`str`) - - Add argument `sortEncoding` (str) defaulting to `utf-8` + - Replace argument `sortKey` (Callable) with `sortKeyName` (`str`\) - - Change `glos.write` + - Add argument `sortEncoding` (str) defaulting to `utf-8` - - Replace argument `sortKey` (Callable) with `namedSortKey` (`sort_keys.NamedSortKey`) - - Add argument `sortEncoding` (`str`) defaulting to `utf-8` + - Change `glos.write` - - Change `glos.sortWords` + - Replace argument `sortKey` (Callable) with `namedSortKey` (`sort_keys.NamedSortKey`\) - - Replace argument `key` (Callable) with `sortKeyName` (`str`) - - Add argument `sortEncoding` (`str`) defaulting to `utf-8` + - Add argument `sortEncoding` (`str`) defaulting to `utf-8` - - Change API of plugins that use `sortOnWrite = ALWAYS` + - Change `glos.sortWords` - - Replace `writer.sortKey` and `Writer.sqliteSortKey` with `sortKeyName` in plugin module. - - See the [stardict.py](https://github.com/ilius/pyglossary/blob/86eb03d/pyglossary/plugins/stardict.py#L30) for example. + - Replace argument `key` (Callable) with `sortKeyName` (`str`\) - **Note 1**: All `sortKey` and `sortEncoding` arguments are optional. 
+ - Add argument `sortEncoding` (`str`) defaulting to `utf-8` - **Note 2**: Values of `sortKeyName` are documented in [doc/sort-key.md](../sort-key.md) + - Change API of plugins that use `sortOnWrite = ALWAYS` -- Rename 2 files in `doc/`: + - Replace `writer.sortKey` and `Writer.sqliteSortKey` with `sortKeyName` in plugin module. - - Rename `doc/entry_filters.md` to `doc/entry-filters.md` - - Rename `doc/term_colors.md` to `doc/term-colors.md` + - See the [stardict.py](https://github.com/ilius/pyglossary/blob/86eb03d/pyglossary/plugins/stardict.py#L30) for example. -## Features +**Note 1**: All `sortKey` and `sortEncoding` arguments are optional. -- `--sort-key` and `--sort-encoding` command line flags (as part of above re-design) +**Note 2**: Values of `sortKeyName` are documented in [doc/sort-key.md](../sort-key.md) - - See [README.md](../../README.md#sorting) and [doc/sort-key.md](../sort-key.md). +- Rename 2 files in `doc/`: -- **Now SQLite mode works for all output formats.** + - Rename `doc/entry_filters.md` to `doc/entry-filters.md` + - Rename `doc/term_colors.md` to `doc/term-colors.md` -## Bug fixes +Features +-------- -- Fix lack of Progress Bar while writing in indirect or SQLite mode -- Fix misleading message log about SQLite mode -- Fix unclosed files in XDXF and FreeDict plugins +- `--sort-key` and `--sort-encoding` command line flags (as part of above re-design) -## Improvements + - See [README.md](../../README.md#sorting) and [doc/sort-key.md](../sort-key.md). 
-- Show a 1-line log instead of `FileNotFoundError` traceback in `glos.read` and `glos.write` -- Close readers in `glos.convert` if `write` failed -- Fix some type annotations and comments -- (For library users) Change `Glossary.__str__` -- (For library users) `glos.setInfo`: convert non-str value to str, and add tests +- **Now SQLite mode works for all output formats.** -## Unit testing +Bug fixes +--------- + +- Fix lack of Progress Bar while writing in indirect or SQLite mode +- Fix misleading message log about SQLite mode +- Fix unclosed files in XDXF and FreeDict plugins + +Improvements +------------ + +- Show a 1-line log instead of `FileNotFoundError` traceback in `glos.read` and `glos.write` +- Close readers in `glos.convert` if `write` failed +- Fix some type annotations and comments +- (For library users) Change `Glossary.__str__` +- (For library users) `glos.setInfo`: convert non-str value to str, and add tests + +Unit testing +------------ Add new tests and improve existing tests. 
-- Coverage of `glossary.py`: %89 -- Overall coverage of codebase + plugins: %58 - -## Refactoring and design improvements - -- Simplify by passing `glos` object to `EntryList()` -- Replace `SqList` with `SqEntryList` -- Change `__iter__` of `SqEntryList` and `EntryList` to give entry objects -- Simplify `Glossary` by moving `gc.collect` to `EntryList` and `SqEntryList` -- Remove unused function `xml_unescape` -- Remove unused import from FreeDict and JMDict plugins -- Use `operator.itemgetter` in `stardict.py`, `dict_cc.py`, `ebook_kobo.py`, `reverse.py` -- `glossary.py`: cleanup, simplify and optimize generators logic - - Also remove `index` argument from `entryFilter.run` method and add some comments -- Remove redundant check in `glos.progress` -- Remove redundant check in `_getLangByStr` -- Remove redundant check in `Glossary.detectOutputFormat` +- Coverage of `glossary.py`: %89 +- Overall coverage of codebase + plugins: %58 + +Refactoring and design improvements +----------------------------------- + +- Simplify by passing `glos` object to `EntryList()` +- Replace `SqList` with `SqEntryList` +- Change `__iter__` of `SqEntryList` and `EntryList` to give entry objects +- Simplify `Glossary` by moving `gc.collect` to `EntryList` and `SqEntryList` +- Remove unused function `xml_unescape` +- Remove unused import from FreeDict and JMDict plugins +- Use `operator.itemgetter` in `stardict.py`, `dict_cc.py`, `ebook_kobo.py`, `reverse.py` +- `glossary.py`: cleanup, simplify and optimize generators logic + - Also remove `index` argument from `entryFilter.run` method and add some comments +- Remove redundant check in `glos.progress` +- Remove redundant check in `_getLangByStr` +- Remove redundant check in `Glossary.detectOutputFormat` diff --git a/doc/releases/4.4.1.md b/doc/releases/4.4.1.md index 83fb97781..c94977556 100644 --- a/doc/releases/4.4.1.md +++ b/doc/releases/4.4.1.md @@ -1,17 +1,21 @@ -# Changes since [4.4.0](./4.4.0.md) +Changes since [4.4.0](./4.4.0.md) 
+================================= -## Bug fixes +Bug fixes +--------- -- Automatically create `cacheDir` on `Glossary.init()` - - Fixes exception in SQLite mode +- Automatically create `cacheDir` on `Glossary.init()` + - Fixes exception in SQLite mode -## Features +Features +-------- -- `ui_cmd_interactive`: support setting `sortKey` +- `ui_cmd_interactive`: support setting `sortKey` -## Improvements and documentation +Improvements and documentation +------------------------------ -- Wiktionary Dump: remove detect-by-extension -- `glossary.py`: update docstrings for `sortKeyName` -- `sort_keys.py`: add `desc` to `NamedSortKey` -- Update `doc/sort-key.md` +- Wiktionary Dump: remove detect-by-extension +- `glossary.py`: update docstrings for `sortKeyName` +- `sort_keys.py`: add `desc` to `NamedSortKey` +- Update `doc/sort-key.md` diff --git a/doc/releases/4.5.0.md b/doc/releases/4.5.0.md index 41ba73401..2f84a3a03 100644 --- a/doc/releases/4.5.0.md +++ b/doc/releases/4.5.0.md @@ -1,96 +1,103 @@ -# Changes since [4.4.1](./4.4.1.md) +Changes since [4.4.1](./4.4.1.md) +================================= -## Bug fixes +Bug fixes +--------- -- Fix 2 log messages in `glos._resolveConvertSortParams` +- Fix 2 log messages in `glos._resolveConvertSortParams` -- Fixes and improvements in Dictfile (.df) reader +- Fixes and improvements in Dictfile (.df) reader - - Fix exception: disable loading info (Dicfile does not support info) - - TextGlossaryReader: prevent producing duplicate data entries - - This fixes: `error in DataEntry.save: [Errno 2] No such file or directory: ...` because `entry.save()` moves the temp file to output path - - This bug only existed for Dictfile (.df) format. 
- - Remove extra colon, #358 - - Remove some extra newline - - And add test for Dictfile to/from Tabfile + - Fix exception: disable loading info (Dicfile does not support info) + - TextGlossaryReader: prevent producing duplicate data entries + - This fixes: `error in DataEntry.save: [Errno 2] No such file or directory: ...` because `entry.save()` moves the temp file to output path + - This bug only existed for Dictfile (.df) format. + - Remove extra colon, #358 + - Remove some extra newline + - And add test for Dictfile to/from Tabfile -- Fix not cleaning up temp directory on return with error from `glos.convert` +- Fix not cleaning up temp directory on return with error from `glos.convert` -## Features +Features +-------- -- ui_gtk: add a "General Options" button that opens a dialog for: +- ui_gtk: add a "General Options" button that opens a dialog for: - - Settings for `sort` and `sortKey` - - Checkbox for SQLite mode - - Check boxes for config params: `save_info_json`, `lower`, `skip_resources`, `rtl`, `enable_alts`, `cleanup`, `remove_html_all` + - Settings for `sort` and `sortKey` + - Checkbox for SQLite mode + - Check boxes for config params: `save_info_json`, `lower`, `skip_resources`, `rtl`, `enable_alts`, `cleanup`, `remove_html_all` -- Add support for `--sort-key random` to shuffle entries +- Add support for `--sort-key random` to shuffle entries -## Performance improvements +Performance improvements +------------------------ -- Performance improvement: remove `gc.collect()` calls in `Glossary` and `*EntryList` +- Performance improvement: remove `gc.collect()` calls in `Glossary` and `*EntryList` - - Not needed since Python 3.8 - - Change minimum python requirement to 3.8 in `README.md` + - Not needed since Python 3.8 + - Change minimum python requirement to 3.8 in `README.md` -- Do not import all plugin modules (only import two plugins that are used) +- Do not import all plugin modules (only import two plugins that are used) - - Load json file 
`plugins-meta/index.json` instead - - In debug mode, all plugin modules are still imported and validated - - User plugins are still imported + - Load json file `plugins-meta/index.json` instead + - In debug mode, all plugin modules are still imported and validated + - User plugins are still imported -## Other improvements +Other improvements +------------------ -- Improve detection of languages from glossary name, and add tests -- Update `langs.json`: add new 3-letter codes for 25 languages -- `glos.preventDuplicateWords` and `glos.removeHtmlTagsAll`: prevent adding filter twice -- `glos.cleanup`: reset path list to avoid (non-critical) error if called again -- Minor improvements in `Glossary.init()` -- `DataEntry.save`: on `FileNotFoundError` show a 1-line error instead of `log.exception` -- ui_gtk: create a new `Glossary` object every time Convert button is clicked -- Add docstring for `Glossary.init` +- Improve detection of languages from glossary name, and add tests +- Update `langs.json`: add new 3-letter codes for 25 languages +- `glos.preventDuplicateWords` and `glos.removeHtmlTagsAll`: prevent adding filter twice +- `glos.cleanup`: reset path list to avoid (non-critical) error if called again +- Minor improvements in `Glossary.init()` +- `DataEntry.save`: on `FileNotFoundError` show a 1-line error instead of `log.exception` +- ui_gtk: create a new `Glossary` object every time Convert button is clicked +- Add docstring for `Glossary.init` -## Unit testing +Unit testing +------------ -- Update `tests/glossary_errors_test.py` -- Add missing cleanup for some temp file -- add test for LDF to/from Tabfile +- Update `tests/glossary_errors_test.py` +- Add missing cleanup for some temp file +- add test for LDF to/from Tabfile -# Refactoring +Refactoring +=========== -- Plugins: replace import of `formats_common` from current directory with `pyglossary.plugins.formats_common` +- Plugins: replace import of `formats_common` from current directory with 
`pyglossary.plugins.formats_common` -- Fix `logging.warn` method is deprecated, use `warning` instead, PR #360 by @BoboTiG +- Fix `logging.warn` method is deprecated, use `warning` instead, PR #360 by @BoboTiG -- Fix `DeprecationWarning: invalid escape sequence`, PR #361 by @BoboTiG +- Fix `DeprecationWarning: invalid escape sequence`, PR #361 by @BoboTiG -- Move some functions from `glossary_utils.py` to `compression.py` +- Move some functions from `glossary_utils.py` to `compression.py` -- Move some methods from `Glossary` to new parent classes `PluginManager` and `GlossaryInfo` +- Move some methods from `Glossary` to new parent classes `PluginManager` and `GlossaryInfo` -- Some refactoring in `plugin_prop.py` and `plugin_manager.py` +- Some refactoring in `plugin_prop.py` and `plugin_manager.py` - - Rename `plugin.pluginModule` to `plugin.module` - - Minimize direct access to `plugin.module`, `plugin.readerClass` or `plugin.writerClass` - - Add some new properties to `PluginProp` - - Remove a log from `glossary.py` - - Disable validation of plugins unless in debug mode - - `plugin_prop.py`: fix checking debug level + - Rename `plugin.pluginModule` to `plugin.module` + - Minimize direct access to `plugin.module`, `plugin.readerClass` or `plugin.writerClass` + - Add some new properties to `PluginProp` + - Remove a log from `glossary.py` + - Disable validation of plugins unless in debug mode + - `plugin_prop.py`: fix checking debug level -- `sq_entry_list.py`: rename `sortColumns` to `sqliteSortKey` +- `sq_entry_list.py`: rename `sortColumns` to `sqliteSortKey` -- Some refactoring around `setSortKey` between `Glossary`, `EntryList` and `SqEntryList` +- Some refactoring around `setSortKey` between `Glossary`, `EntryList` and `SqEntryList` -- Remove `Entry.sqliteSortKeyFrom` and related classmethods +- Remove `Entry.sqliteSortKeyFrom` and related classmethods -- Some more simplification in `glossary.py` +- Some more simplification in `glossary.py` -- Remove 
`Entry.defaultSortKey` +- Remove `Entry.defaultSortKey` -- Some style fixes +- Some style fixes -- `iter_utils.py`: remove unused `key=` argument from `unique_everseen` +- `iter_utils.py`: remove unused `key=` argument from `unique_everseen` -- Refactor ui_gtk and update config comments +- Refactor ui_gtk and update config comments -- `extractInlineHtmlImages`: avoid writing file within sub func +- `extractInlineHtmlImages`: avoid writing file within sub func diff --git a/doc/releases/4.6.0.md b/doc/releases/4.6.0.md index f49b76b46..3443c920d 100644 --- a/doc/releases/4.6.0.md +++ b/doc/releases/4.6.0.md @@ -1,337 +1,351 @@ -# Changes since [4.5.0](./4.5.0.md) +Changes since [4.5.0](./4.5.0.md) +================================= -## Dependency change +Dependency change +----------------- We now require Python 3.9 or a later version. -## Bug fixes +Bug fixes +--------- -- Fix exception in `scripts/plugin-index.py`: 8a94b8c60cce50a21e229020970f085a0fb55fb0 +- Fix exception in `scripts/plugin-index.py`: 8a94b8c60cce50a21e229020970f085a0fb55fb0 -- StarDict: Fix writing to `.zip` file produced empty zip, and fix bad test +- StarDict: Fix writing to `.zip` file produced empty zip, and fix bad test -- dictunformat: fix #367: add option `headword_separator`, default to `; ` +- dictunformat: fix #367: add option `headword_separator`, default to `;` -- Fixes in ui_gtk, #380 #382 #403 +- Fixes in ui_gtk, #380 #382 #403 -- AppleDict source: fix #407 missing quotes for title, and refactor duplicate codes +- AppleDict source: fix #407 missing quotes for title, and refactor duplicate codes -- DictionaryForMIDs: remove `|` from word when normalizing, fix punctuation regex, use Unix newlines +- DictionaryForMIDs: remove `|` from word when normalizing, fix punctuation regex, use Unix newlines -- StarDict: use Unix newline when reading and writing .ifo file on Windows +- StarDict: use Unix newline when reading and writing .ifo file on Windows -- Fix bug of 
`glos.addEntryObj(dataEntry)` adding empty file because `tmpDataDir` is not set until `glos.read()` +- Fix bug of `glos.addEntryObj(dataEntry)` adding empty file because `tmpDataDir` is not set until `glos.read()` - - Set and create `tmpDataDir` on `glos.tmpDataDir` access, and add test, #424 + - Set and create `tmpDataDir` on `glos.tmpDataDir` access, and add test, #424 -- Fix `scripts/wiki-formats.py`, #428 +- Fix `scripts/wiki-formats.py`, #428 -- Dictd / Dict.org: fix exception on Windows +- Dictd / Dict.org: fix exception on Windows -## Features +Features +-------- -- Support sorting by an ICU locale, see [Sorting section of README](../../README.md#sorting) +- Support sorting by an ICU locale, see [Sorting section of README](../../README.md#sorting) -- Add Gtk4 interface `--ui=gtk4` / `--gtk4` +- Add Gtk4 interface `--ui=gtk4` / `--gtk4` - - still buggy and not as functional as Gtk3 or Tkinter interfaces + - still buggy and not as functional as Gtk3 or Tkinter interfaces -- Add flag `--optimize-memory`, config key `optimize_memory` +- Add flag `--optimize-memory`, config key `optimize_memory` - - To enable entry compression on `--indirect` - - Not enabled by default (it was previously always compressed) + - To enable entry compression on `--indirect` + - Not enabled by default (it was previously always compressed) -- Allow plugin's `reader.open()` to return an `Iterator` for progress bar +- Allow plugin's `reader.open()` to return an `Iterator` for progress bar - - Implement for Tabfile (reading info/metedata) - - Implement for AppleDict Binary (reading `KeyText.data`) + - Implement for Tabfile (reading info/metedata) + - Implement for AppleDict Binary (reading `KeyText.data`\) -- Add read and write support for StarDict Textual File (.xml), #348 +- Add read and write support for StarDict Textual File (.xml), #348 -- Add support for writing Yomichan dictionary files, #395 by @tomtung +- Add support for writing Yomichan dictionary files, #395 by @tomtung -- 
StarDict reader: support `.syn.dz` file, #410 +- StarDict reader: support `.syn.dz` file, #410 -- StarDict writer: add write option `large_file`, #392 #422 +- StarDict writer: add write option `large_file`, #392 #422 -- StarDict reader: support `dxoffsetbits=64` on read, #392 #422 +- StarDict reader: support `dxoffsetbits=64` on read, #392 #422 -- JMDict: support examples, #383 +- JMDict: support examples, #383 -- Add read support for JMnedict, #386 +- Add read support for JMnedict, #386 -- Add flag `--skip-duplicate-headword`, config `skip_duplicate_headword`, #365 +- Add flag `--skip-duplicate-headword`, config `skip_duplicate_headword`, #365 - - Zim reader: remove option `skip_duplicate_words`, #365 + - Zim reader: remove option `skip_duplicate_words`, #365 -- Add flag `--trim-arabic-diacritics`, config `trim_arabic_diacritics`, #366 +- Add flag `--trim-arabic-diacritics`, config `trim_arabic_diacritics`, #366 -- Add read support for IUPAC goldbook (.xml), #355 +- Add read support for IUPAC goldbook (.xml), #355 -- Add write support for [DIKT](https://github.com/maxim-saplin/dikt) JSON +- Add write support for [DIKT](https://github.com/maxim-saplin/dikt) JSON -- StarDict writer: limit memory usage by using SQLite for `idx` and `syn` data, #409 +- StarDict writer: limit memory usage by using SQLite for `idx` and `syn` data, #409 -- CSV: add newline option, defaulting to Unix-style +- CSV: add newline option, defaulting to Unix-style -- Aard2 Slob writer: add option `file_size_approx_check_num_entries` +- Aard2 Slob writer: add option `file_size_approx_check_num_entries` -- Add `scripts/diff-glossary` and `scripts/view-glossary` +- Add `scripts/diff-glossary` and `scripts/view-glossary` -## Improvements +Improvements +------------ -- When remove HTML tags, also replace `
` with `\n`, #394 by @tomtung +- When remove HTML tags, also replace `
` with `\n`, #394 by @tomtung - - Treat `
` the same way `

` is treated. + - Treat `

` the same way `

` is treated. -- Mobi: add `mobi7-forcing` switch to `kindlegen` command, #374 by @holyspiritomb +- Mobi: add `mobi7-forcing` switch to `kindlegen` command, #374 by @holyspiritomb -- Octopus MDict: ignore directories with `same_dir_data_files`, #362 +- Octopus MDict: ignore directories with `same_dir_data_files`, #362 -- StarDict reader: handle definitions with mixed types/formats +- StarDict reader: handle definitions with mixed types/formats -- Dictfile: strip whitespaces from word and defi before going through entry filters +- Dictfile: strip whitespaces from word and defi before going through entry filters -- BGL: strip whitespaces from word and defi before going through entry filters +- BGL: strip whitespaces from word and defi before going through entry filters -- Improvement in `glos.write`: avoid printing exception for invalid encoding +- Improvement in `glos.write`: avoid printing exception for invalid encoding -- Remove empty logs in `glos.convert` +- Remove empty logs in `glos.convert` -- StarDict reader: fix validating `sametypesequence`, and add test +- StarDict reader: fix validating `sametypesequence`, and add test -- `glos.convert`: Allow an existing empty directory as output path +- `glos.convert`: Allow an existing empty directory as output path -- `TextGlossaryReader`: replace `nextPair` method with `nextBlock` which returns resource files as third item +- `TextGlossaryReader`: replace `nextPair` method with `nextBlock` which returns resource files as third item -- ui_cmd_interactive: allow converting several times before exiting +- ui_cmd_interactive: allow converting several times before exiting -- Change title tag for Greek from `` to `` +- Change title tag for Greek from `` to `` -- Update language data set (`langs.json`) +- Update language data set (`langs.json`\) -- `ui/main.py`: print 1-line error instead of full exception on `ImportError` +- `ui/main.py`: print 1-line error instead of full exception on `ImportError` -- `ui/main.py`: 
Windows: try Tkinter before Gtk +- `ui/main.py`: Windows: try Tkinter before Gtk -- `ebook_base.py`: avoid `shutil.move` on Windows, #368 +- `ebook_base.py`: avoid `shutil.move` on Windows, #368 -- `TextGlossaryReader`: fix loading info and some refactoring, #370 36b9cd83d4c79b32e34bf64c3101cb89093b2a4e +- `TextGlossaryReader`: fix loading info and some refactoring, #370 36b9cd83d4c79b32e34bf64c3101cb89093b2a4e -- `Entry`: Allow `word` to be `tuple` in `Entry(word=...)` +- `Entry`: Allow `word` to be `tuple` in `Entry(word=...)` -- `glos.iterInfo()` return `Iterator` rather than `Iterable` +- `glos.iterInfo()` return `Iterator` rather than `Iterable` -- Zim: change dependency to `libzim>=1.0`, and some comments +- Zim: change dependency to `libzim>=1.0`, and some comments -- Mobi: work with kindlegen executable in `PATH` directories, #401 +- Mobi: work with kindlegen executable in `PATH` directories, #401 -- ui: limit the length of option comments in Format Options dialog +- ui: limit the length of option comments in Format Options dialog -- ui_gtk: improvement: show (last) critical error on status bar +- ui_gtk: improvement: show (last) critical error on status bar -- ui_gtk: set intial focus +- ui_gtk: set intial focus -- ui_gtk: improvements in About tab +- ui_gtk: improvements in About tab -- ui_tk: revert most `ttk` widgets to `tk` because the theme doesn't match +- ui_tk: revert most `ttk` widgets to `tk` because the theme doesn't match -- Add SVG icon, #414 by @proletarius101 +- Add SVG icon, #414 by @proletarius101 -- Prevent exception/traceback on Ctrl+C +- Prevent exception/traceback on Ctrl+C -- Optimize progress bar +- Optimize progress bar -- Aard2 slob: show info log before and after `slobWriter.finalize()`, #437 +- Aard2 slob: show info log before and after `slobWriter.finalize()`, #437 -## Removed features +Removed features +---------------- -- Remove read support for Wiktiomary Dump, #48 +- Remove read support for Wiktiomary Dump, #48 -- Remove 
support for Sdictionary Binary and Source +- Remove support for Sdictionary Binary and Source -## Octopus MDict MDX: features and improvements +Octopus MDict MDX: features and improvements +-------------------------------------------- -- Support MDict V3 fomrat by updating `readmdict`, #385 by @xiaoqiangwang +- Support MDict V3 fomrat by updating `readmdict`, #385 by @xiaoqiangwang -- Fix files created without UUID in header, #387 by @xiaoqiangwang +- Fix files created without UUID in header, #387 by @xiaoqiangwang - - MdxBuilder 4.0 RC2 and before creates files without UUID header + - MdxBuilder 4.0 RC2 and before creates files without UUID header -- Decode mdict title & description if they're bytes, #393 by @tomtung +- Decode mdict title & description if they're bytes, #393 by @tomtung -- `readmdict`: Skip zlib decompress exceptions, #384 +- `readmdict`: Skip zlib decompress exceptions, #384 -- `readmdict`: Use `__name__` as logger name, and add 2 debug logs, #384 +- `readmdict`: Use `__name__` as logger name, and add 2 debug logs, #384 -- `readmdict`: improve exception msg for xxhash, #385 +- `readmdict`: improve exception msg for xxhash, #385 -## XDXF: fixes / imrovements, issue #376 +XDXF: fixes / imrovements, issue #376 +------------------------------------- -- Support `` -- Support embedded tags in `` -- Fix ignoring `` -- Fix extra newlines -- Get rid of warning for `` -- Fix/improve newline and space issues -- Fix and improve tests -- Update url for format description -- Support any tag/string in ``, #396 -- Support reading compressed files directly (`.xdxf.gz`, `.xdxf.bz2`, `.xdxf.lzma`) -- Allow using XSL using `--write-options=xsl=True` -- Update XSL -- Other improvements in XDXF to HTML transformation +- Support `` +- Support embedded tags in `` +- Fix ignoring `` +- Fix extra newlines +- Get rid of warning for `` +- Fix/improve newline and space issues +- Fix and improve tests +- Update url for format description +- Support any tag/string in ``, #396 
+- Support reading compressed files directly (`.xdxf.gz`, `.xdxf.bz2`, `.xdxf.lzma`\) +- Allow using XSL using `--write-options=xsl=True` +- Update XSL +- Other improvements in XDXF to HTML transformation -## AppleDict Binary: features, bug fixes, improvements, refactoring +AppleDict Binary: features, bug fixes, improvements, refactoring +---------------------------------------------------------------- -- Fix css name on `html_full=True` +- Fix css name on `html_full=True` -- Fix using `self._encoding` when should use `utf-8` +- Fix using `self._encoding` when should use `utf-8` -- Fix internal links, #343 +- Fix internal links, #343 - - Remove `x-dictionary:d:` prefix from `href` - - First fix for `x-dictionary:r:`: use title if present - - Add `bword://` prefix to `href` (unless it points to http/https) - - Read entry IDs on open and fix links with `x-dictionary:r:` + - Remove `x-dictionary:d:` prefix from `href` + - First fix for `x-dictionary:r:`: use title if present + - Add `bword://` prefix to `href` (unless it points to http/https) + - Read entry IDs on open and fix links with `x-dictionary:r:` -- Add plistlib to dependencies +- Add plistlib to dependencies -- Add tests +- Add tests -- Replace `` with `

` +- Replace `` with `
` -- Fix bad exception formatting +- Fix bad exception formatting -- Fixes from PR #436 +- Fixes from PR #436 -- Support morphology (alternates): #434 by @soshial +- Support morphology (alternates): #434 by @soshial -- Support different AppleDict offsets, #417 by @soshial +- Support different AppleDict offsets, #417 by @soshial -- Extract AppleDict meta-info (langs, title, author), #418 by @soshial +- Extract AppleDict meta-info (langs, title, author), #418 by @soshial -- Progress Bar on `open()` / loading `KeyText.data` +- Progress Bar on `open()` / loading `KeyText.data` -- Improve memory usage of loading `KeyText.data` +- Improve memory usage of loading `KeyText.data` -- Replace `appledict_bin.py` with `appledict_bin` directory and more refactoring +- Replace `appledict_bin.py` with `appledict_bin` directory and more refactoring -## Glossary class (`glossary.py`) +Glossary class (`glossary.py`\) +------------------------------- -- Lots of refactoring in `glossary.py` +- Lots of refactoring in `glossary.py` - - Improve the design and readability - - Reduce complexity of methods - - Move some code into new classes that `Glossary` inherits from - - Improve error messages + - Improve the design and readability + - Reduce complexity of methods + - Move some code into new classes that `Glossary` inherits from + - Improve error messages -- Introduce `glossary_v2.py`, and maintain API backward-compatibility for `glossary.py` (as far as documented) +- Introduce `glossary_v2.py`, and maintain API backward-compatibility for `glossary.py` (as far as documented) - - See [README.md](../../README.md#using-pyglossary-as-a-python-library) for sample code. + - See [README.md](../../README.md#using-pyglossary-as-a-python-library) for sample code. 
-## Refactoring +Refactoring +----------- -- Fix style errors using `ruff` based on [pyproject.toml](../../pyproject.toml) configuration +- Fix style errors using `ruff` based on [pyproject.toml](../../pyproject.toml) configuration -- Remove all usages of pyglossary.plugins.formats_common +- Remove all usages of pyglossary.plugins.formats_common -- Use `str.startswith(tuple)` and `str.endswith(tuple)` +- Use `str.startswith(tuple)` and `str.endswith(tuple)` -- Reduce complexity of `Glossary` methods +- Reduce complexity of `Glossary` methods -- Rename entry filter `strip` to `trim_whitespaces` +- Rename entry filter `strip` to `trim_whitespaces` -- Some refactoring in StarDict reader +- Some refactoring in StarDict reader -- Use [f-string equal syntax](https://github.com/python/cpython/issues/80998) added in Python 3.8 +- Use [f-string equal syntax](https://github.com/python/cpython/issues/80998) added in Python 3.8 -- Use `str.removeprefix` and `str.removesuffix` added in Python 3.9 +- Use `str.removeprefix` and `str.removesuffix` added in Python 3.9 -- `langs/writing_system.py`: +- `langs/writing_system.py`: - - Change `iso` field to list - - Add new scripts - - Add `getAllWritingSystemsFromText` - - More refactoring + - Change `iso` field to list + - Add new scripts + - Add `getAllWritingSystemsFromText` + - More refactoring -- Split up `TextGlossaryReader.loadInfo` method +- Split up `TextGlossaryReader.loadInfo` method -- `plugin_manager.py`: make some methods private +- `plugin_manager.py`: make some methods private -## Documentation +Documentation +------------- -- Update plugins' documentation +- Update plugins' documentation -- Glossary: add comments about `entryFilters` +- Glossary: add comments about `entryFilters` -- Update `config.rst` +- Update `config.rst` -- Update `doc/entry-filters.md` +- Update `doc/entry-filters.md` -- Update `README.md` +- Update `README.md` -- Update `doc/sort-key.md` +- Update `doc/sort-key.md` -- Update `doc/pyicu.md` +- 
Update `doc/pyicu.md` -- Update `plugins/testformat.py` +- Update `plugins/testformat.py` -- Add types for arguments and result of all functions/methods +- Add types for arguments and result of all functions/methods -- Add types for r/w options in reader/writer classes +- Add types for r/w options in reader/writer classes -- Fix a few incorrect type annotations +- Fix a few incorrect type annotations -- `README.md`: Add document for adding data entries, #412 +- `README.md`: Add document for adding data entries, #412 -- `README.md`: Fix -> nixos command, #400 by @srghma +- `README.md`: Fix -> nixos command, #400 by @srghma -- Update [bgl_info.md](../babylon/bgl_info.md) and move it from `pyglossary/plugins/babylon_bgl/` to `doc/babylon/` +- Update [bgl_info.md](../babylon/bgl_info.md) and move it from `pyglossary/plugins/babylon_bgl/` to `doc/babylon/` -## Testing +Testing +------- -- Add test for DSL -> Tabfile conversion +- Add test for DSL -> Tabfile conversion -- `dsl_test.py`: fix method names not starting with `test_` +- `dsl_test.py`: fix method names not starting with `test_` -- StarDict reader: better testing for handling definitions with mixed types +- StarDict reader: better testing for handling definitions with mixed types -- StarDict writer: much better testing, coverage of `stardict.py`: from %62 to %83 +- StarDict writer: much better testing, coverage of `stardict.py`: from %62 to %83 -- Refactoring and improvements in tests of Glossary, along with new tests +- Refactoring and improvements in tests of Glossary, along with new tests -- Add test for dictunformat -> Tabfile +- Add test for dictunformat -> Tabfile -- AppleDict (source) tests: validate plist file contents +- AppleDict (source) tests: validate plist file contents -- Allow forking and branching `pyglossary-test` repo +- Allow forking and branching `pyglossary-test` repo - - See [tests/glossary_v2_test.py](../../tests/glossary_v2_test.py#L28) + - See 
[tests/glossary_v2_test.py](../../tests/glossary_v2_test.py#L28) -- Fix some failing tests on Windows +- Fix some failing tests on Windows -- Slob: test `file_size_approx` +- Slob: test `file_size_approx` -- Test Tabfile -> SQL conversion +- Test Tabfile -> SQL conversion -- Test StarDict error/warning for sortKeyName with and without locale +- Test StarDict error/warning for sortKeyName with and without locale -- Print useful messages for unhandled warnings +- Print useful messages for unhandled warnings -- Improve logs +- Improve logs -- Add `showDiff=False` arg to `compareTextFiles` and `convert` +- Add `showDiff=False` arg to `compareTextFiles` and `convert` -## Packaging +Packaging +--------- -- Update and refactor `Dockerfile` and `run-with-docker.sh` +- Update and refactor `Dockerfile` and `run-with-docker.sh` - - `Dockerfile`: change `WORKDIR` to `/root/home` which is mapped to host's home dir - - `run-with-docker.sh`: create `confDir` before docker build (to check the owner later) - - `run-with-docker.sh`: accept version (image tag) as argument - - Use host's (non-root) user in docker run - - Map host user's `$HOME` to docker's user home - - Re-use existing docker image with same tag + - `Dockerfile`: change `WORKDIR` to `/root/home` which is mapped to host's home dir + - `run-with-docker.sh`: create `confDir` before docker build (to check the owner later) + - `run-with-docker.sh`: accept version (image tag) as argument + - Use host's (non-root) user in docker run + - Map host user's `$HOME` to docker's user home + - Re-use existing docker image with same tag -- Update `setup.py` +- Update `setup.py` diff --git a/doc/releases/4.6.1.md b/doc/releases/4.6.1.md index 8716c5325..8155f3ce9 100644 --- a/doc/releases/4.6.1.md +++ b/doc/releases/4.6.1.md @@ -1,52 +1,59 @@ -# Changes since [4.6.0](./4.6.0.md) +Changes since [4.6.0](./4.6.0.md) +================================= -## Bug fixes +Bug fixes +--------- -- Fix a bug causing broken installation if 
`~/.local/lib` is a symbolic link +- Fix a bug causing broken installation if `~/.local/lib` is a symbolic link - - or `site-packages` or any of its parents are a symbolic link + - or `site-packages` or any of its parents are a symbolic link -- Fix incompatibilty with Python 3.9 (despite documentation) +- Fix incompatibilty with Python 3.9 (despite documentation) -- Fix `scripts/entry-filters-doc.py`, `scripts/plugin-doc.py` and `doc/entry-filters.md` +- Fix `scripts/entry-filters-doc.py`, `scripts/plugin-doc.py` and `doc/entry-filters.md` -- AppleDict: Fix typos in Chinese language module +- AppleDict: Fix typos in Chinese language module -## Features: +Features: +--------- -- Use environment variable `VERBOSITY` as default (a number from 0 to 5) +- Use environment variable `VERBOSITY` as default (a number from 0 to 5) -## Improvements +Improvements +------------ -- AppleDict Binary: set `html_full=True` by default +- AppleDict Binary: set `html_full=True` by default -- Update `wcwidth` to `0.2.6` +- Update `wcwidth` to `0.2.6` -## Refactoring +Refactoring +----------- -- Add `glos.stripFullHtml(errorHandler)` and use it in 3 plugins +- Add `glos.stripFullHtml(errorHandler)` and use it in 3 plugins - - Add entry filter `StripFullHtml` and change `entry.stripFullHtml()` to return error + - Add entry filter `StripFullHtml` and change `entry.stripFullHtml()` to return error -- Refactor `entryFiltersRules` +- Refactor `entryFiltersRules` -- Remove empty plugin gettext_mo.py +- Remove empty plugin gettext_mo.py -- Remove `glos.titleElement` from `glossary_v2.Glossary` +- Remove `glos.titleElement` from `glossary_v2.Glossary` - - Add to `glossary.Glossary` for compatibility - - `glossary.Glossary` is a wrapper (child class) on top on `glossary_v2.Glossary` + - Add to `glossary.Glossary` for compatibility + - `glossary.Glossary` is a wrapper (child class) on top on `glossary_v2.Glossary` -## Documentation +Documentation +------------- -- Update `doc/entry-filters.md` to 
list some entry filters that were enabled conditionally (besides config) +- Update `doc/entry-filters.md` to list some entry filters that were enabled conditionally (besides config) -- Remove `sdict.md` and `sdict_source.md` (removed plugins) +- Remove `sdict.md` and `sdict_source.md` (removed plugins) -## Type checking +Type checking +------------- -- Add missing method in `GlossaryType` class -- Fix `mypy` errors on most of code base and some of plugins -- Use builtin types `list, dict, tuple, set` for type annotations -- Replace `Optional[X]` with `X or None` - - will not effect runtime, but type checking now only works with Python 3.10+ +- Add missing method in `GlossaryType` class +- Fix `mypy` errors on most of code base and some of plugins +- Use builtin types `list, dict, tuple, set` for type annotations +- Replace `Optional[X]` with `X or None` + - will not effect runtime, but type checking now only works with Python 3.10+ diff --git a/doc/sort-key.md b/doc/sort-key.md index 2569af69e..062e0de66 100644 --- a/doc/sort-key.md +++ b/doc/sort-key.md @@ -1,26 +1,29 @@ -# Sort Key +Sort Key +======== -## Supported `sortKey` names / `--sort-key` argument values +Supported `sortKey` names / `--sort-key` argument values +-------------------------------------------------------- | Name/Value | Description | Default for formats | Supports locale | -| ---------------------- | ------------------------- | ------------------------------------------------- | :-------------: | -| `headword` | Headword | | Yes | -| `headword_lower` | Lowercase Headword | All other formats (given `--sort`) | Yes | -| `headword_bytes_lower` | ASCII-Lowercase Headword | | No | -| `stardict` | StarDict | [StarDict](./p/stardict.md) | No | -| `ebook` | E-Book (prefix length: 2) | [EPUB-2](./p/epub2.md), [Mobipocket](./p/mobi.md) | No | -| `ebook_length3` | E-Book (prefix length: 3) | | No | -| `dicformids` | DictionaryForMIDs | [DictionaryForMIDs](./p/dicformids.md) | No | -| `random` | Random | 
| Yes | +|------------------------|---------------------------|---------------------------------------------------|:---------------:| +| `headword` | Headword | | Yes | +| `headword_lower` | Lowercase Headword | All other formats (given `--sort`\) | Yes | +| `headword_bytes_lower` | ASCII-Lowercase Headword | | No | +| `stardict` | StarDict | [StarDict](./p/stardict.md) | No | +| `ebook` | E-Book (prefix length: 2) | [EPUB-2](./p/epub2.md), [Mobipocket](./p/mobi.md) | No | +| `ebook_length3` | E-Book (prefix length: 3) | | No | +| `dicformids` | DictionaryForMIDs | [DictionaryForMIDs](./p/dicformids.md) | No | +| `random` | Random | | Yes | -## Sort Locale +Sort Locale +----------- You can pass an [ICU Locale name/identifier](https://unicode-org.github.io/icu/userguide/locale/) as part of `sortKey` / `--sort-key` value, after a `:` symbol. For example: -- `--sort-key=:fa_IR.UTF-8`: Persian (then case-insensitive Latin) -- `--sort-key=headword:fa_IR.UTF-8`: Persian (then case-sensitive Latin) -- `--sort-key=headword:es`: case-sensitive Spanish -- `--sort-key=headword_lower:es`: case-insensitive Spanish -- `--sort-key=:es`: Spanish (case-insensitive by default) -- `--sort-key=:latn-arab`: first Latin, then Arabic -- `--sort-key=:fa-u-kr-latn-arab`: first Latin, then Persian +- `--sort-key=:fa_IR.UTF-8`: Persian (then case-insensitive Latin) +- `--sort-key=headword:fa_IR.UTF-8`: Persian (then case-sensitive Latin) +- `--sort-key=headword:es`: case-sensitive Spanish +- `--sort-key=headword_lower:es`: case-insensitive Spanish +- `--sort-key=:es`: Spanish (case-insensitive by default) +- `--sort-key=:latn-arab`: first Latin, then Arabic +- `--sort-key=:fa-u-kr-latn-arab`: first Latin, then Persian diff --git a/doc/stardict/stardict_sametypesequence.md b/doc/stardict/stardict_sametypesequence.md index 6817cb2fe..014a65481 100644 --- a/doc/stardict/stardict_sametypesequence.md +++ b/doc/stardict/stardict_sametypesequence.md @@ -1,17 +1,13 @@ -To convert to a StarDict 
dictionary with the `sametypesequence` option, use -`sametypesequence=[type of definitions]` write option. +To convert to a StarDict dictionary with the `sametypesequence` option, use `sametypesequence=[type of definitions]` write option. -If the sametypesequence option is set, it tells StarDict that each -word's data in the .dict file will have the same sequence of datatypes. -Suppose a dictionary contains phonetic information -and a meaning for each word. The sametypesequence option for this -dictionary would be: +If the sametypesequence option is set, it tells StarDict that each word's data in the .dict file will have the same sequence of datatypes. Suppose a dictionary contains phonetic information and a meaning for each word. The sametypesequence option for this dictionary would be: ``` sametypesequence=tm ``` -# Examples: +Examples: +========= Definitions type is plain text: @@ -25,115 +21,96 @@ Definitions type is HTML: pyglossary mydic.txt mydic.ifo --write-options=sametypesequence=h ``` -# Type identifiers +Type identifiers +================ -Here are the single-character type identifiers that may be used with -the "sametypesequence" option in the .idx file, or may appear in the -dict file itself if the "sametypesequence" option is not used. +Here are the single-character type identifiers that may be used with the "sametypesequence" option in the .idx file, or may appear in the dict file itself if the "sametypesequence" option is not used. -Lower-case characters signify that a field's size is determined by a -terminating `\0`, while upper-case characters indicate that the data -begins with a network byte-ordered guint32 that gives the length of -the following data's size (NOT the whole size which is 4 bytes bigger). 
+Lower-case characters signify that a field's size is determined by a terminating `\0`, while upper-case characters indicate that the data begins with a network byte-ordered guint32 that gives the length of the following data's size (NOT the whole size which is 4 bytes bigger). -## `m` +`m` +--- -Word's pure text meaning. -The data should be a utf-8 string ending with `\0`. +Word's pure text meaning. The data should be a utf-8 string ending with `\0`. -## `l` +`l` +--- -Word's pure text meaning.
-The data is NOT a utf-8 string, but is instead a string in locale -encoding, ending with `\0`. Sometimes using this type will save disk -space, but its use is discouraged. This is only a idea. +Word's pure text meaning.
The data is NOT a utf-8 string, but is instead a string in locale encoding, ending with `\0`. Sometimes using this type will save disk space, but its use is discouraged. This is only a idea. -## `g` +`g` +--- -A utf-8 string which is marked up with the Pango text markup language.
-For more information about this markup language, See the -[Pango Reference Manual](http://library.gnome.org/devel/pango/stable/PangoMarkupFormat.html).
-You might have it installed locally \[here\](file:///usr/share/gtk-doc/html/pango/PangoMarkupFormat.html) +A utf-8 string which is marked up with the Pango text markup language.
For more information about this markup language, See the [Pango Reference Manual](http://library.gnome.org/devel/pango/stable/PangoMarkupFormat.html).
You might have it installed locally \[here\](file:///usr/share/gtk-doc/html/pango/PangoMarkupFormat.html) -## `t` +`t` +--- -English phonetic string.
-The data should be a utf-8 string ending with `\0`. +English phonetic string.
The data should be a utf-8 string ending with `\0`. -Here are some utf-8 phonetic characters:
-`θʃŋʧðʒæıʌʊɒɛəɑɜɔˌˈːˑṃṇḷ`
-`æɑɒʌәєŋvθðʃʒɚːɡˏˊˋ` +Here are some utf-8 phonetic characters:
`θʃŋʧðʒæıʌʊɒɛəɑɜɔˌˈːˑṃṇḷ`
`æɑɒʌәєŋvθðʃʒɚːɡˏˊˋ` -## `x` +`x` +--- -A utf-8 string which is marked up with the [xdxf language](https://github.com/soshial/xdxf_makedict).
-StarDict has these extensions: +A utf-8 string which is marked up with the [xdxf language](https://github.com/soshial/xdxf_makedict).
StarDict has these extensions: -- `` can have "type" attribute, it can be "image", "sound", "video" - and "attach". -- `` can have "k" attribute. +- `` can have "type" attribute, it can be "image", "sound", "video" and "attach". +- `` can have "k" attribute. -## `y` +`y` +--- -Chinese YinBiao or Japanese KANA.
-The data should be a utf-8 string ending with `\0`. +Chinese YinBiao or Japanese KANA.
The data should be a utf-8 string ending with `\0`. -## `k` +`k` +--- -[KingSoft](https://en.wikipedia.org/wiki/Kingsoft) [PowerWord](https://en.wikipedia.org/wiki/PowerWord)'s data. -The data is a utf-8 string ending with `\0`. And it's in XML format. +[KingSoft](https://en.wikipedia.org/wiki/Kingsoft) [PowerWord](https://en.wikipedia.org/wiki/PowerWord)'s data. The data is a utf-8 string ending with `\0`. And it's in XML format. -## `w` +`w` +--- [MediaWiki markup language](https://www.mediawiki.org/wiki/Help:Formatting). -## `h` +`h` +--- Html codes. -## `n` +`n` +--- WordNet data. -## `r` +`r` +--- -Resource file list.
-The content can be: +Resource file list.
The content can be: -- `img:pic/example.jpg` Image file -- `snd:apple.wav` Sound file -- `vdo:film.avi` Video file -- `att:file.bin` Attachment file +- `img:pic/example.jpg` Image file +- `snd:apple.wav` Sound file +- `vdo:film.avi` Video file +- `att:file.bin` Attachment file -More than one line is supported as a list of available files.
-StarDict will find the files in the Resource Storage.
-The image will be shown, the sound file will have a play button.
-You can "save as" the attachment file and so on.
-The file list must be a utf-8 string ending with `\0`.
-Use `\n` for separating new lines.
-Use `/` character as directory separator.
+More than one line is supported as a list of available files.
StarDict will find the files in the Resource Storage.
The image will be shown, the sound file will have a play button.
You can "save as" the attachment file and so on.
The file list must be a utf-8 string ending with `\0`.
Use `\n` for separating new lines.
Use `/` character as directory separator.
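
The `r` layout described above (one `prefix:path` entry per line, `\n` between entries, a trailing `\0`) can be illustrated with a small decoder. This is only a sketch under stated assumptions: `parse_resource_list` and `KNOWN_PREFIXES` are hypothetical names made up for this example, not part of StarDict or PyGlossary, and rejecting unknown prefixes is a choice of this sketch rather than documented behaviour.

```python
# Hypothetical helper for illustration; not a StarDict or PyGlossary API.
KNOWN_PREFIXES = {
    "img": "image",
    "snd": "sound",
    "vdo": "video",
    "att": "attachment",
}


def parse_resource_list(data: bytes) -> list[tuple[str, str]]:
    """Decode an `r` field into (kind, path) pairs."""
    # The field is a utf-8 string ending with a single NUL byte.
    if data.endswith(b"\x00"):
        data = data[:-1]
    text = data.decode("utf-8")

    entries = []
    for line in text.split("\n"):
        if not line:
            continue  # tolerate empty lines
        prefix, sep, path = line.partition(":")
        if not sep or prefix not in KNOWN_PREFIXES:
            # Assumption of this sketch: unknown entries are an error.
            raise ValueError(f"unrecognized resource entry: {line!r}")
        entries.append((KNOWN_PREFIXES[prefix], path))
    return entries


sample = b"img:pic/example.jpg\nsnd:apple.wav\x00"
print(parse_resource_list(sample))
# → [('image', 'pic/example.jpg'), ('sound', 'apple.wav')]
```

Paths are returned as written, with `/` as the directory separator; looking them up in the Resource Storage is left out of this sketch.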
-## `W` +`W` +--- -`.wav` audio file.
-The data begins with a network byte-ordered guint32 to identify the wav -file's size, immediately followed by the file's content. -This is only a idea, it is better to use `r` Resource file list in most -case. +`.wav` audio file.
The data begins with a network byte-ordered guint32 to identify the wav file's size, immediately followed by the file's content. This is only an idea; it is better to use the `r` Resource file list in most cases. -## `P` +`P` +--- -Picture file.
-The data begins with a network byte-ordered guint32 to identify the picture -file's size, immediately followed by the file's content.
-This feature is implemented, as stardict-advertisement-plugin needs it. -Anyway, it is better to use `r` Resource file list in most case. +Picture file.
The data begins with a network byte-ordered guint32 to identify the picture file's size, immediately followed by the file's content.
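
The size-prefixed layout used by the `W` and `P` types (a network byte-ordered guint32 followed by the file's content) can be sketched with Python's `struct` module, where `"!I"` means a big-endian unsigned 32-bit integer. The function names `pack_sized_blob` and `unpack_sized_blob` are hypothetical, chosen only for this example, and raising on truncated data is an assumption of the sketch.

```python
import struct

# Hypothetical helpers for illustration; not a StarDict or PyGlossary API.


def pack_sized_blob(content: bytes) -> bytes:
    """Prefix content with its length as a network-order (big-endian) uint32."""
    return struct.pack("!I", len(content)) + content


def unpack_sized_blob(data: bytes) -> bytes:
    """Read the 4-byte size prefix and return exactly that many content bytes."""
    if len(data) < 4:
        raise ValueError("data too short for a 4-byte size prefix")
    (size,) = struct.unpack("!I", data[:4])
    content = data[4:4 + size]
    if len(content) != size:
        # Assumption of this sketch: truncated content is an error.
        raise ValueError("content is shorter than the declared size")
    return content


blob = pack_sized_blob(b"abc")
print(blob)                     # → b'\x00\x00\x00\x03abc'
print(unpack_sized_blob(blob))  # → b'abc'
```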
This feature is implemented, as stardict-advertisement-plugin needs it. Anyway, it is better to use `r` Resource file list in most case. -## `X` +`X` +--- This type identifier is reserved for experimental extensions. -# For more information +For more information +==================== -Refer to StarDict documentations at: -[https://github.com/huzheng001/stardict-3/blob/master/dict/doc/StarDictFileFormat](https://github.com/huzheng001/stardict-3/blob/master/dict/doc/StarDictFileFormat) +Refer to StarDict documentations at: [https://github.com/huzheng001/stardict-3/blob/master/dict/doc/StarDictFileFormat](https://github.com/huzheng001/stardict-3/blob/master/dict/doc/StarDictFileFormat) diff --git a/doc/term-colors.md b/doc/term-colors.md index 1776f7583..cf7e62062 100644 --- a/doc/term-colors.md +++ b/doc/term-colors.md @@ -1,7 +1,8 @@ -## Terminal / ANSI Colors +Terminal / ANSI Colors +---------------------- | Sample | Code | Hex | RGB | HSL | -| ----------------------------------------------------------- | ---- | --------- | ------------- | ----------------- | +|-------------------------------------------------------------|------|-----------|---------------|-------------------| | ![](https://via.placeholder.com/60x30/000000/000000?text=+) | 0 | `#000000` | 0, 0, 0 | 0, 0, 0 | | ![](https://via.placeholder.com/60x30/aa0000/000000?text=+) | 1 | `#aa0000` | 170, 0, 0 | 0, 1, 0.333 | | ![](https://via.placeholder.com/60x30/00aa00/000000?text=+) | 2 | `#00aa00` | 0, 170, 0 | 120, 1, 0.333 | diff --git a/doc/termux.md b/doc/termux.md index bf8aa3a4d..1aa57a277 100644 --- a/doc/termux.md +++ b/doc/termux.md @@ -1,35 +1,36 @@ -## Feature-specific Requirements on [Termux](https://github.com/termux/termux-app) +Feature-specific Requirements on [Termux](https://github.com/termux/termux-app) +------------------------------------------------------------------------------- -- **Using `--remove-html-all` flag** +- **Using `--remove-html-all` flag** - - `apt install libxml2 
libxslt` - - `pip install lxml beautifulsoup4` + - `apt install libxml2 libxslt` + - `pip install lxml beautifulsoup4` -- **Reading from FreeDict, XDXF, JMDict, AppleDict Binary (.dictionary) or CC-CEDICT** +- **Reading from FreeDict, XDXF, JMDict, AppleDict Binary (.dictionary) or CC-CEDICT** - - `apt install libxml2 libxslt` - - `pip install lxml` + - `apt install libxml2 libxslt` + - `pip install lxml` -- **Reading from cc-kedict** +- **Reading from cc-kedict** - - `apt install libxml2 libxslt` - - `pip install lxml PyYAML` + - `apt install libxml2 libxslt` + - `pip install lxml PyYAML` -- **Reading or writing Aard 2 (.slob)** +- **Reading or writing Aard 2 (.slob)** - - `pkg install libicu` - - `pip install PyICU` + - `pkg install libicu` + - `pip install PyICU` -- **Writing to Kobo E-Reader Dictionary** +- **Writing to Kobo E-Reader Dictionary** - - `pip install marisa-trie` + - `pip install marisa-trie` -- **Reading from Zim** +- **Reading from Zim** - - `apt install libzim` - - `pip install libzim` + - `apt install libzim` + - `pip install libzim` -- **Writing to AppleDict** +- **Writing to AppleDict** - - `apt install libxml2 libxslt` - - `pip install lxml beautifulsoup4 html5lib` + - `apt install libxml2 libxslt` + - `pip install lxml beautifulsoup4 html5lib`