This minor release includes documentation improvements thanks to @felixdittrich92, bug fixes, rotation support extended to the TensorFlow backend, a switch from PyMuPDF to pypdfium2, and an integration with the Hugging Face Hub thanks to @fg-mindee!

Note: doctr 0.5.1 requires either TensorFlow 2.4.0 or PyTorch 1.8.0.

Highlights

Improvement of the documentation

The documentation has been improved with a new theme and illustrations, and the docstrings have been completed and expanded.

Rotated text detection extended to the TensorFlow backend

We provide weights for the linknet_resnet18_rotation model, which has been deeply modified. We implemented a new loss (based on Dice loss and focal loss). We changed the computation of the targets so that polygons are shrunk the same way they are in DBNet, which greatly improves the precision of the segmenter. We also trained the model while preserving the aspect ratio of the images.
All these improvements led to much better results, and the pretrained model is now very robust.
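To make the idea of a loss "based on Dice loss and focal loss" concrete, here is a minimal NumPy sketch. It is not the docTR implementation: the function names, the `gamma`, `alpha` and `focal_weight` parameters, and the plain sum used to combine the two terms are all assumptions chosen for illustration.

```python
import numpy as np


def dice_loss(probs, target, eps=1e-8):
    """Soft Dice loss: 1 - 2 * |P ∩ T| / (|P| + |T|)."""
    inter = np.sum(probs * target)
    return float(1.0 - (2.0 * inter + eps) / (np.sum(probs) + np.sum(target) + eps))


def focal_loss(probs, target, gamma=2.0, alpha=0.5, eps=1e-8):
    """Binary focal loss: cross-entropy down-weighted on easy pixels."""
    probs = np.clip(probs, eps, 1.0 - eps)
    # Probability assigned to the true class of each pixel
    p_t = np.where(target == 1, probs, 1.0 - probs)
    alpha_t = np.where(target == 1, alpha, 1.0 - alpha)
    return float(np.mean(-alpha_t * (1.0 - p_t) ** gamma * np.log(p_t)))


def combined_loss(probs, target, focal_weight=1.0):
    """Hypothetical combination: a plain weighted sum of both terms."""
    return dice_loss(probs, target) + focal_weight * focal_loss(probs, target)


# Toy segmentation map: a 4x4 text region inside an 8x8 image
target = np.zeros((8, 8))
target[2:6, 2:6] = 1.0
perfect = combined_loss(target.copy(), target)
inverted = combined_loss(1.0 - target, target)
```

The Dice term measures region overlap, so it is less sensitive to the imbalance between text and background pixels, while the focal term concentrates the gradient on hard pixels. A perfect prediction yields a loss close to zero and an inverted one a much larger value.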

Preserving the aspect ratio in the detection task

You can now choose to preserve the aspect ratio in the detection_predictor:

>>> from doctr.models import detection_predictor
>>> predictor = detection_predictor('db_resnet50_rotation', pretrained=True, assume_straight_pages=False, preserve_aspect_ratio=True)

This option can also be enabled in the high-level end-to-end predictor:

>>> from doctr.models import ocr_predictor
>>> model = ocr_predictor('linknet_resnet18_rotation', pretrained=True, assume_straight_pages=False, preserve_aspect_ratio=True)
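As a rough illustration of what preserving the aspect ratio means at the preprocessing stage, the sketch below rescales an image so that it fits inside the target size without distortion, then pads the remainder with zeros. This is a standalone NumPy example rather than docTR's own resize transform. The helper name `resize_preserving_aspect_ratio`, the nearest-neighbour interpolation, and the bottom/right padding placement are assumptions.

```python
import numpy as np


def resize_preserving_aspect_ratio(img, target_hw):
    """Nearest-neighbour resize that keeps the aspect ratio, then zero-pads.

    Returns the padded image and the (height, width) of the actual content.
    """
    h, w = img.shape[:2]
    th, tw = target_hw
    # A single scale for both axes, so the content is never stretched
    scale = min(th / h, tw / w)
    new_h = min(th, max(1, round(h * scale)))
    new_w = min(tw, max(1, round(w * scale)))
    # Map each output row/column back to its nearest source index
    rows = np.minimum((np.arange(new_h) / scale).astype(int), h - 1)
    cols = np.minimum((np.arange(new_w) / scale).astype(int), w - 1)
    resized = img[rows][:, cols]
    # Pad on the bottom/right (assumed placement) to reach the target size
    out = np.zeros((th, tw) + img.shape[2:], dtype=img.dtype)
    out[:new_h, :new_w] = resized
    return out, (new_h, new_w)


page = np.ones((100, 200, 3), dtype=np.uint8)
padded, content_hw = resize_preserving_aspect_ratio(page, (512, 512))
```

With a 100x200 input and a 512x512 target, the content occupies the top 256 rows and the remaining rows are padding, whereas a plain resize would stretch the page vertically by a factor of two relative to its width.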

Integration with the Hugging Face Hub

The artefact detection model is now available on the Hugging Face Hub.

In docTR, you can now use the from_hub() function, so the following two snippets are equivalent:

# Pretrained
from doctr.models.obj_detection import fasterrcnn_mobilenet_v3_large_fpn
model = fasterrcnn_mobilenet_v3_large_fpn(pretrained=True)

and:

# HF Hub
from doctr.models.obj_detection.factory import from_hub
model = from_hub("mindee/fasterrcnn_mobilenet_v3_large_fpn")

Breaking changes

Replacing the PyMuPDF dependency with pypdfium2, which is license compatible

We replaced the PyMuPDF dependency with pypdfium2 because of a license-compatibility issue. As a result, we lose the extraction of words and objects from source PDFs, which relied on PyMuPDF. This feature wasn't used by any model, so the impact is limited, and we plan to re-integrate it in the future.

Full changelog

What's Changed

Breaking Changes 🛠

fix: polygon orientation + line aggregation by @charlesmindee in #801
refactor: Switched from PyMuPDF to pypdfium2 by @fg-mindee in #829

New Features

feat: Added RandomHorizontalFLip in TF by @SiddhantBahuguna in #779
Imgur5k dataset integration by @felixdittrich92 in #785
feat: Added support of GPU for predictors in PyTorch by @fg-mindee in #808
Add SynthWordGenerator to text reco training scripts by @felixdittrich92 in #825
fix: Fixed some ResNet architecture imprecisions by @fg-mindee in #828
feat: Added shadow augmentation for all backends by @fg-mindee in #811
feat: Added loading method for PyTorch artefact detection models from HF Hub by @fg-mindee in #836
feat: add rotated linknet_resnet18 tensorflow ckpts by @charlesmindee in #817

Bug Fixes

fix: Fixed rotation of img + target by @fg-mindee in #784
fix: show sample when batch size is 1 by @charlesmindee in #787
ci: Fixed PR label check job by @fg-mindee in #792
ci: Fixed typo in the script ref by @fg-mindee in #794
[datasets] fix description by @felixdittrich92 in #795
fix: linknet target computation by @charlesmindee in #803
ci: Fixed issue templates by @fg-mindee in #806
fix: Reverted mistake in demo by @fg-mindee in #810
Restore remap boxes by @Rob192 in #812
fix: Fixed SAR model for training and inference in PyTorch by @fg-mindee in #831
fix: Fixed expand_line for horizontal & vertical cases by @fg-mindee in #842
fix: Fixes inplace target modifications for AbstractDatasets by @fg-mindee in #848
fix: Fixed landing page and title underlines by @fg-mindee in #860
docs: Fixed HTML title by @fg-mindee in #864

Improvements

docs: Updated headers of python files by @fg-mindee in #781
[datasets] unify np_dtype and fix comments by @felixdittrich92 in #782
fix: Clip in rotation transform + eval_straight mode for training by @charlesmindee in #786
refactor: Avoids instantiating orientation predictor when unnecessary by @fg-mindee in #809
feat: add straight-eval arg in evaluate script by @charlesmindee in #793
feat: add dice loss in linknet by @charlesmindee in #816
feat: add shrinked target in linknet + dilation in postprocessing by @charlesmindee in #822
feat: replace bce by focal loss in linknet loss by @charlesmindee in #824
docs: add rotation in docs by @charlesmindee in #846
feat: add aspect ratio for ocr predictor by @charlesmindee in #835
feat: add target to resize transform for aspect ratio training (detection task) by @charlesmindee in #823
update bug report ticket with Active backend field by @felixdittrich92 in #853
Theme + css #1 by @felixdittrich92 in #856
docs: Adds illustration in the docstrings of doctr.datasets by @felixdittrich92 in #857
docs: Updated docstrings of io, transforms & utils by @felixdittrich92 in #859
docs: Updated folder hierarchy of doc source and nootbooks to rst file by @felixdittrich92 in #862
Doc models #5 by @felixdittrich92 in #861
fix: linknet hyperparameters postprocessing + demo for rotation model by @charlesmindee in #865

Miscellaneous

chore: Applied post release modifications by @fg-mindee in #780
Switch to new pypdfium2 API by @mara004 in #845

New Contributors

@mara004 made their first contribution in #845

Full Changelog: v0.5.0...v0.5.1

v0.5.1