HOWTO
dbrenk edited this page Jul 1, 2015
·
3 revisions
This program can be used in two modes:
- OCR mode: takes 4 parameters: lang (-DEU/-ENG), mode (-OCR/-READ/-AUTO), infile (PDF/BMP/TIF/PNG) and outfile (TXT)
  - OCR: uses tesseract for character recognition
  - READ: uses pdfbox to extract text from "text-based" PDF files
  - AUTO: tries to extract text with pdfbox and falls back to tesseract if the file is not a text-based PDF
- WhiteOnWhite Overlay mode: takes 5 parameters: mode (-OVERLAY), infile (PDF), outfile (PDF), text (String) and color (Color.WHITE/Color.BLACK). The overlay text is written on every page of the document in the output file; the original file is not changed.
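
The AUTO fallback described for OCR mode can be pictured roughly as follows. This is only a sketch, not the program's actual code: the class `AutoModeSketch`, the method `extractAuto`, and the two extractor functions are hypothetical names, and treating "empty extracted text" as the sign of a non-text-based PDF is an assumption. The extractors are passed in as plain functions so the sketch runs without pdfbox or tesseract installed.

```java
import java.util.function.Function;

public class AutoModeSketch {

    /**
     * Hypothetical AUTO logic: try the READ-style extractor first and
     * fall back to the OCR-style extractor when no usable text comes back.
     * Assumption: a PDF that yields no text (or an error) is not text-based.
     */
    static String extractAuto(String inFile,
                              Function<String, String> readExtractor,
                              Function<String, String> ocrExtractor) {
        String text;
        try {
            text = readExtractor.apply(inFile);
        } catch (RuntimeException e) {
            // e.g. the input is an image (BMP/TIF/PNG) and cannot be read as a PDF
            text = null;
        }
        if (text != null && !text.trim().isEmpty()) {
            return text; // text-based PDF: READ result is good enough
        }
        return ocrExtractor.apply(inFile); // fallback: character recognition
    }

    public static void main(String[] args) {
        // Stand-in extractors; real ones would wrap pdfbox and tesseract.
        Function<String, String> read = f -> f.endsWith(".pdf") ? "text layer" : "";
        Function<String, String> ocr = f -> "recognized text";

        System.out.println(extractAuto("letter.pdf", read, ocr));
        System.out.println(extractAuto("scan.png", read, ocr));
    }
}
```

With the stand-in extractors above, the PDF input is answered by the READ path and the PNG input by the OCR path.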