Releases · jawah/charset_normalizer

21 Sep 16:17

Ousret

1.1.0

9728ff7

Charset Normalizer

Changes:

Bugfix: Sequences shorter than 10 characters were not checked by ProbeChaos at all. (#14)
Bugfix: The legacy detect method, inspired by chardet, did not return the intended result when no match was found. (#14)


17 Sep 17:17

Ousret

1.0.0

d3996ce

Charset Normalizer

Release 1.0.0 (#11)

* Adjustment in frequencies.json for Chinese

Removed Latin-based characters from it.

* Added the possibility to list encoding aliases for a match

An encoding is often known by several names; listing aliases helps when searching for IBM855 while it is listed as CP855.

* Added submatch in match

A list of submatches that produce the EXACT same output as a match.

* Changes in docs

+ comment unused code.

* Documented the giveup_threshold param of ProbeChaos

* Doc improvement in unicode.py

* Add static method list_by_range in unicode.py

Sorts letters by Unicode range into a dict.

* ProbeCoherence reliability improved

Can now probe and sort by the alphabet used or by Unicode range.

* Added coherence_non_latin method in NormalizerMatch

Checks whether a non-Latin-based language was verified by ProbeCoherence.

* CLI is now more verbose

* More tests, yay!

* bump 1.0.0

* README update
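
The alias and submatch entries above can be illustrated with the Python standard library alone. This is a minimal sketch, not the library's implementation: `aliases_of` and `group_by_output` are hypothetical helper names introduced here, and the sample payload and candidate list are assumptions chosen for the demonstration. The first helper lists the stdlib alias names that resolve to the same codec (IBM855 and CP855 are one codec), the second groups candidate encodings that decode a payload to exactly the same text.

```python
import codecs
from encodings.aliases import aliases
from typing import Dict, List


def aliases_of(encoding: str) -> List[str]:
    """Return stdlib alias names that resolve to the same codec as `encoding`.

    Hypothetical helper for illustration; not part of charset_normalizer.
    """
    canonical = codecs.lookup(encoding).name
    found = []
    for alias, target in aliases.items():
        try:
            if codecs.lookup(target).name == canonical:
                found.append(alias)
        except LookupError:
            # Some targets are platform-specific and may be unavailable.
            continue
    return sorted(found)


def group_by_output(payload: bytes, candidates: List[str]) -> Dict[str, List[str]]:
    """Group candidate encodings by the text they decode `payload` into.

    Encodings that cannot decode the payload are skipped.
    """
    groups: Dict[str, List[str]] = {}
    for name in candidates:
        try:
            text = payload.decode(name)
        except (UnicodeDecodeError, LookupError):
            continue
        groups.setdefault(text, []).append(name)
    return groups


# Assumed sample data: "café" encoded as Latin-1.
print(aliases_of("cp855"))
print(group_by_output(b"caf\xe9", ["utf_8", "latin_1", "cp1252", "cp855"]))
```

Encodings that land in the same group are interchangeable for that particular payload, which is the idea behind reporting them as submatches of a single match rather than as separate results.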
