Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
* Adjustment in frequencies.json for Chinese: removed Latin-based characters from it.
* Added the possibility to list encoding aliases for a match. Encodings are known by many names; this can help when searching for IBM855 when it is listed as CP855.
* Added submatch in match: a list of submatches that produce the EXACT same output as a match.
* Changes in docs; commented out unused code.
* Documented the giveup_threshold param of ProbeChaos.
* Doc improvement in unicode.py.
* Added static method list_by_range in unicode.py. Sorts letters by Unicode range in a dict.
* ProbeCoherence reliability improved. Can now probe and sort by alphabet used or Unicode range.
* Added coherence_non_latin method in NormalizerMatch. Verifies whether a non-Latin-based language was verified by probe coherence.
* CLI is now more verbose.
* More tests, yay!
* Bump 1.0.0.
* README update.
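To illustrate the alias idea from the commit message (IBM855 also being known as CP855), here is a minimal sketch that uses only Python's standard library. The helper name `encoding_aliases` is hypothetical and is not taken from this commit; the project's own implementation may differ.

```python
import codecs
from encodings.aliases import aliases  # stdlib mapping: alias -> codec name


def encoding_aliases(name):
    """Return the sorted names under which Python knows an encoding.

    Hypothetical helper for illustration only; not the code from this commit.
    """
    # codecs.lookup() accepts any known alias and reports the canonical name.
    # The alias table uses underscores, so normalize hyphens (e.g. "utf-8").
    canonical = codecs.lookup(name).name.replace("-", "_")
    found = {alias for alias, target in aliases.items() if target == canonical}
    found.add(canonical)
    return sorted(found)


if __name__ == "__main__":
    # Both spellings resolve to the same codec, so they share one alias list.
    print(encoding_aliases("IBM855"))
    print(encoding_aliases("cp855"))
```

Resolving through `codecs.lookup` first means the caller can pass any spelling of the encoding and still get the same alias list back.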
Showing 12 changed files with 312 additions and 74 deletions.