Releases: jitsi/jiwer
fix SyntaxError in 3.12
What's Changed
- test on 3.12 by @nikvaessen in #96
- Fixes in transforms.py by @benglewis in #94
New Contributors
- @benglewis made their first contribution in #94
Full Changelog: v3.04...v3.0.5
allow --cer and --global in CLI
Full Changelog: v3.0.3...v3.04
v3.0.3 - update rapidfuzz
Full Changelog: v3.0.2...v3.0.3
v3.0.2
What's Changed
- add option to skip correct pairs in visualization by @nikvaessen in #79
Full Changelog: v3.0.1...v3.0.2
v3.0.1
What's Changed
- fix docstring by @nikvaessen in #75
- fix bug in deprecation of truth by @nikvaessen in #77
Minor release for fixing #76.
Full Changelog: v3.0.0...v3.0.1
v3.0.0
What's Changed
This release makes breaking changes to the jiwer API.
First, we introduce 3 new methods:
1. jiwer.compute_measures() is renamed to jiwer.process_words, and returns everything in a dataclass named WordOutput.
2. jiwer.cer(return_dict=True) is deprecated, and is superseded by jiwer.process_characters, which returns everything in a dataclass named CharacterOutput.
3. jiwer.visualize_measures is renamed to jiwer.visualize_alignment. Moreover, the keyword argument visualize_cer: bool = False has been removed, and the output keyword argument is now of expected type Union[WordOutput, CharacterOutput].
I've also decided to rename all mentions of the concept "(ground) truth" to "reference", in light of the Whisper speech-to-text model showing that future ASR models might not be trained on something like a "ground truth". Therefore, in the following methods, the keyword arguments truth and truth_transform have been renamed to reference and reference_transform:
jiwer.cer()
jiwer.mer()
jiwer.wer()
jiwer.wil()
jiwer.wip()
The alignments are now stored as a list of lists containing jiwer.AlignmentChunk dataclass objects instead of hard-to-document tuples.
Lastly, I've added jiwer.transformations.cer_contiguous for optionally calculating the CER with an uneven number of reference and hypothesis sentences. I've also changed wer_standardize and wer_standardize_contiguous so that the last 3 transformations are now:
tr.Strip(),
tr.ReduceToSingleSentence(),
tr.ReduceToListOfListOfWords(),
This release also introduces a documentation website. See https://jitsi.github.io/jiwer.
Full Changelog: v2.6.0...v3.0.0
v2.6.0 - jiwer CLI + alignment and visualisation
What's Changed
The return dictionary of jiwer.cer() and jiwer.compute_measures() now has 3 additional keys: ops, truth, and hypothesis. See the alignment section of the README, and the docstrings of the methods, for more details.
Also adds jiwer.visualize_measures() to visualize the alignment of all ground-truth/hypothesis pairs.
Finally, the jiwer command is automatically installed upon installation of jiwer, which provides a simple CLI for interacting with jiwer.
Commit list:
- Alignments and a CLI interface by @nikvaessen in #72
Full Changelog: v2.5.2...v2.6.0
v2.5.2
Performance improvement for RemovePunctuation
Bug fixes and deprecation removal
What's Changed
- Handle non-ascii punctuation in RemovePunctuation transform by @nikvaessen in #63
- Fix bug in RemoveSpecificWords matching on partials by @nikvaessen in #64
- Remove deprecated keywords standardize and words_to_filter by @nikvaessen in #65
Full Changelog: v2.4.0...v.2.5.0