Releases: VikParuchuri/surya
Fix pytorch 2.6 bug
Fix a bug that caused issues on MPS (Mac) devices when using PyTorch 2.6.
Pin pytorch
PyTorch 2.6.0 doesn't work well with some of the models on MPS (Mac), so this release pins PyTorch to the previous version.
Add LaTeX OCR model
New OCR model and streamlit app
- Release a new LaTeX OCR model
- Add streamlit app to interactively select and OCR equations
What's Changed
- Improve typing for PolygonBox.bbox by @kevinhu in #291
- Add LaTeX OCR by @VikParuchuri in #292
- Texify by @VikParuchuri in #295
- Final texify version by @VikParuchuri in #296
- Integrate new latex OCR model by @VikParuchuri in #293
New Contributors
Full Changelog: v0.9.3...v0.10.0
Fix cli script issue
Fix issue with cli scripts and folders.
Better polygon type checking
Improve how polygons are type checked in the schema.
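As an illustration of what stricter polygon checking can look like, here is a minimal sketch using only the standard library. The `validate_polygon` and `bbox_from_polygon` helpers are hypothetical names for this example and are not surya's actual schema code; the sketch assumes a polygon is a list of four (x, y) corner points.

```python
from numbers import Real
from typing import List, Sequence


def validate_polygon(polygon: Sequence[Sequence[float]]) -> List[List[float]]:
    """Check that `polygon` is four (x, y) points and return it as floats.

    Hypothetical helper: the four-corner assumption is ours, not surya's.
    """
    if len(polygon) != 4:
        raise ValueError(f"expected 4 corner points, got {len(polygon)}")
    cleaned = []
    for point in polygon:
        if len(point) != 2:
            raise ValueError(f"each point needs 2 coordinates, got {point!r}")
        for value in point:
            # bool is a subclass of int, so reject it explicitly.
            if isinstance(value, bool) or not isinstance(value, Real):
                raise TypeError(f"coordinate {value!r} is not a number")
        cleaned.append([float(point[0]), float(point[1])])
    return cleaned


def bbox_from_polygon(polygon: Sequence[Sequence[float]]) -> List[float]:
    """Return the axis-aligned box [x_min, y_min, x_max, y_max]."""
    points = validate_polygon(polygon)
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return [min(xs), min(ys), max(xs), max(ys)]
```

Validating at construction time means a malformed polygon fails early with a clear message rather than surfacing later as an indexing error.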
Fix rowspan bug
Fixes a bug where rowspans weren't included in table model predictions.
Refactor surya; new table recognition model
Refactor
This is a complete refactor of surya: the code is now cleaner and better organized. Models are now imported and used differently. Here is an example for OCR:
from PIL import Image
from surya.recognition import RecognitionPredictor
from surya.detection import DetectionPredictor

image = Image.open(IMAGE_PATH)  # IMAGE_PATH: path to your input image
langs = ["en"]  # Replace with your languages, or pass None (recommended)

# Create the predictors
recognition_predictor = RecognitionPredictor()
detection_predictor = DetectionPredictor()

# Run OCR; the detection predictor is passed in to locate the text
predictions = recognition_predictor([image], [langs], detection_predictor)
See the README for how to use other models.
Table recognition
There is a new table recognition model which detects colspans/rowspans better, along with header cells. It is also simpler to use, since it operates on the images alone rather than on the images plus bboxes.
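To make the colspan/rowspan terminology concrete, the sketch below expands a list of cells into a dense grid. The cell fields (`row`, `col`, `rowspan`, `colspan`, `text`) and the `expand_spans` helper are hypothetical names chosen for this example and do not reflect the table model's actual output schema.

```python
from typing import Dict, List, Optional


def expand_spans(cells: List[Dict], n_rows: int, n_cols: int) -> List[List[Optional[str]]]:
    """Place each cell's text in every grid slot its spans cover."""
    grid: List[List[Optional[str]]] = [[None] * n_cols for _ in range(n_rows)]
    for cell in cells:
        rowspan = cell.get("rowspan", 1)
        colspan = cell.get("colspan", 1)
        for r in range(cell["row"], cell["row"] + rowspan):
            for c in range(cell["col"], cell["col"] + colspan):
                if grid[r][c] is not None:
                    raise ValueError(f"cells overlap at row {r}, col {c}")
                grid[r][c] = cell["text"]
    return grid


# Hypothetical cells for a 3x2 table.
cells = [
    {"row": 0, "col": 0, "colspan": 2, "text": "Header"},  # spans two columns
    {"row": 1, "col": 0, "rowspan": 2, "text": "A"},       # spans two rows
    {"row": 1, "col": 1, "text": "b1"},
    {"row": 2, "col": 1, "text": "b2"},
]
grid = expand_spans(cells, n_rows=3, n_cols=2)
```

A dense grid like this is convenient for export to CSV or HTML, where a missed rowspan would shift every later cell in the affected rows.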
What's Changed
- Layout improvements by @VikParuchuri in #267
- New table model; total refactor by @VikParuchuri in #279
- Add ci workflow by @VikParuchuri in #284
Full Changelog: v0.8.3...v0.9.0
Pin pypdfium2
Pin the pypdfium2 version, since the newest version can cause issues.
New layout model
Layout model is twice as fast and more accurate.
What's Changed
- Update layout model by @VikParuchuri in #270
Full Changelog: v0.8.1...v0.8.2
Add bad OCR detection model
- Add a model to detect bad OCR text
- Add top_k predictions to layout
- Add a test suite
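The `top_k` idea can be sketched as follows: instead of returning only the single most likely label for a region, return the k most likely labels with their confidences. The label names and the `top_k_labels` helper are hypothetical, and this is a generic softmax-based illustration, not the layout model's actual code.

```python
import math
from typing import Dict, List, Tuple


def top_k_labels(logits: Dict[str, float], k: int = 3) -> List[Tuple[str, float]]:
    """Return the k highest-probability labels after a softmax over `logits`."""
    if k < 1:
        raise ValueError("k must be at least 1")
    # Subtract the max logit before exponentiating for numerical stability.
    max_logit = max(logits.values())
    exps = {label: math.exp(value - max_logit) for label, value in logits.items()}
    total = sum(exps.values())
    probs = {label: e / total for label, e in exps.items()}
    # Sort labels from most to least probable and keep the first k.
    ranked = sorted(probs.items(), key=lambda item: item[1], reverse=True)
    return ranked[:k]


# Hypothetical raw scores for one layout region.
scores = {"Text": 2.0, "Table": 1.0, "Figure": 0.1, "Caption": -1.0}
top2 = top_k_labels(scores, k=2)
```

Exposing the runner-up labels lets downstream code treat a close call (for example, two labels with similar confidence) differently from a confident prediction.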
What's Changed
- Add OCR Error Detection Model by @tarun-menta in #261
- Add top_k to Surya Layout and Fix Confidence Value Issue by @iammosespaulr in #263
- Bad OCR detection model by @VikParuchuri in #268
New Contributors
- @tarun-menta made their first contribution in #261
Full Changelog: v0.8.0...v0.8.1