forked from ocrmypdf/OCRmyPDF
Commit
Showing 10 changed files with 154 additions and 52 deletions.
@@ -0,0 +1,32 @@
.. SPDX-FileCopyrightText: 2023 James R. Barlow
.. SPDX-License-Identifier: CC-BY-SA-4.0

============
Design notes
============

Why doesn't OCRmyPDF use PyTesseract?
=====================================

PyTesseract is a Python wrapper around the Tesseract OCR engine. When OCRmyPDF was
first written, PyTesseract used ABI bindings to call the Tesseract library. This
was not a good fit for OCRmyPDF because ABI bindings can be fragile: they may break
when the installed library differs from the version the bindings expect.

PyTesseract has since evolved to call the Tesseract executable, abandoning the ABI
approach in favor of the CLI, just as OCRmyPDF does. If it were written from
scratch today, OCRmyPDF might use PyTesseract.

PyTesseract has more features for uses that don't particularly need PDF output, but
fewer features than OCRmyPDF's API for creating PDFs.
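
To illustrate the CLI approach described above, here is a minimal sketch of
calling the Tesseract executable through ``subprocess``. It is not taken from
either OCRmyPDF or PyTesseract; the function names ``build_tesseract_command``
and ``ocr_image`` are hypothetical, and the sketch assumes a ``tesseract``
executable is available on ``PATH``.

```python
import shutil
import subprocess
from pathlib import Path


def build_tesseract_command(image: Path, lang: str = "eng") -> list[str]:
    """Build the argument list for one Tesseract CLI invocation.

    Passing "stdout" as the output base asks Tesseract to print the
    recognized text instead of writing it to a file.
    """
    return ["tesseract", str(image), "stdout", "-l", lang]


def ocr_image(image: Path, lang: str = "eng") -> str:
    """Run the Tesseract executable and return the recognized text.

    Hypothetical helper: assumes `tesseract` is installed and on PATH.
    """
    if shutil.which("tesseract") is None:
        raise FileNotFoundError("tesseract executable not found on PATH")
    result = subprocess.run(
        build_tesseract_command(image, lang),
        capture_output=True,  # collect stdout/stderr instead of inheriting them
        text=True,            # decode output to str
        check=True,           # raise CalledProcessError on a non-zero exit code
    )
    return result.stdout
```

Because the only contract is the command line and its output, a wrapper like
this keeps working across Tesseract library upgrades as long as the CLI
arguments stay the same.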

What is ``executor()``?
=======================

OCRmyPDF uses a custom concurrent executor that supports either threads or
processes behind the same interface. This lets OCRmyPDF parallelize work with
whichever of the two is more appropriate for the task at hand.

The interface is currently private and subject to change. In particular, if
experiments with asyncio and anyio are successful, the interface will change.
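
The general idea of one interface over threads or processes can be sketched
with the standard library's ``concurrent.futures``. This is not OCRmyPDF's
private executor; ``make_pool``, ``run_tasks``, and ``square`` are hypothetical
names used only for illustration.

```python
from concurrent.futures import Executor, ProcessPoolExecutor, ThreadPoolExecutor
from typing import Callable, Iterable


def make_pool(use_threads: bool, max_workers: int) -> Executor:
    """Return a thread pool or a process pool; both share the Executor API."""
    pool_class = ThreadPoolExecutor if use_threads else ProcessPoolExecutor
    return pool_class(max_workers=max_workers)


def run_tasks(
    func: Callable[[int], int],
    items: Iterable[int],
    use_threads: bool,
    max_workers: int = 2,
) -> list[int]:
    """Apply func to every item in parallel, preserving input order."""
    with make_pool(use_threads, max_workers) as pool:
        # Executor.map yields results in the order of the inputs.
        return list(pool.map(func, items))


def square(n: int) -> int:
    """Example task; defined at module level so a process pool can pickle it."""
    return n * n
```

The caller's code is identical for ``use_threads=True`` and
``use_threads=False``; only the pool class changes. With processes, the task
function must be picklable, and on platforms that spawn new interpreters the
calling code should sit under an ``if __name__ == "__main__":`` guard.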
@@ -44,6 +44,7 @@ image processing and OCR to existing PDFs.
api
plugins
apiref
design_notes
contributing
maintainers