Recognize with Timeout not stopping #290

Ibariu · 2021-12-10T08:30:03Z

Hi everyone,

I have an image from which I am trying to extract the text.

The expected output would be a few '\n' characters or blank spaces. The problem comes when I process the image with the Recognize function and a timeout, so that the process stops if it takes too long: it does not stop, and keeps running for 3 hours until Recognize finally returns False.

To set the image on the API I am using SetImageFile, which as far as I know avoids some trouble when loading the image into the API (although I have also tried "SetImage(Image.open('image2test.jpg'))"). I should also mention that this page is processed alongside other pages extracted from the same PDF file. Of the 2 pages in that PDF, this is the only one causing problems: Tesseract takes 3 hours to extract its text. Since the image contains no relevant information, the processing should be stopped and the page skipped. The page cannot be removed before the OCR step, because many PDFs may be processed and this isolated case may recur in the future, so I want to be able to catch it.

The code I am using is the following:

Here is more information about the API configuration and the Tesseract, tesserocr, and Pillow versions:

PSM: AUTO_OSD (1)
OEM: DEFAULT (3)
LANG: Spanish ('spa')
Tesseract: 4.1.1
Tesserocr: 2.5.1
Pillow: 8.4.0
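For readers hitting the same behaviour, one way to enforce a hard stop regardless of how Recognize handles its timeout is to run the OCR in a child interpreter and let the parent kill it after a deadline. The following is only a sketch under stated assumptions, not the code from this post: `run_with_hard_timeout` and `OCR_CHILD` are hypothetical names, and the child script assumes tesserocr is installed and simply mirrors the configuration listed above.

```python
import subprocess
import sys

# Hypothetical child script: OCR one image (path in sys.argv[1]) and print the text.
# Assumes tesserocr is installed; language and PSM mirror the configuration above.
OCR_CHILD = """
import sys
from tesserocr import PyTessBaseAPI, PSM

sys.stdout.reconfigure(encoding="utf-8")
with PyTessBaseAPI(lang='spa', psm=PSM.AUTO_OSD) as api:
    api.SetImageFile(sys.argv[1])
    sys.stdout.write(api.GetUTF8Text())
"""


def run_with_hard_timeout(child_code, args, timeout_s):
    """Run child_code in a fresh interpreter.

    Returns the child's stdout, or None if it did not finish within timeout_s.
    A non-zero exit status raises subprocess.CalledProcessError.
    """
    try:
        done = subprocess.run(
            [sys.executable, "-c", child_code, *args],
            capture_output=True,
            encoding="utf-8",
            timeout=timeout_s,
            check=True,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising, so nothing keeps running.
        return None
    return done.stdout
```

A call such as `run_with_hard_timeout(OCR_CHILD, ["image2test.jpg"], timeout_s=60)` would then return the recognized text, or None for a page like the one described here. The cost is one interpreter start-up and one Tesseract initialization per page, which may or may not be acceptable for large batches.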

Thanks for all the help 😉

idromv · 2025-01-31T22:15:03Z

Hi!

I think I had a similar issue (really bad).

It took me a long while, but I finally managed to make it work by forcing the process to use only one OpenMP thread at a time.

It's enough to add the following at the beginning of your script:

import os
os.environ['OMP_THREAD_LIMIT'] = '1'

As far as I can tell, it also seems to work under multithreading, as long as the threads are not applied directly to the function that calls Recognize (e.g. you run the threads on a larger function which, at some point, calls the function using the API).
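The arrangement described above could look roughly like this. It is a sketch, not tested against the image from this issue: `ocr_page`, `process_document`, and `process_many` are hypothetical names, tesserocr is assumed to be installed, and the variable is set before tesserocr is imported so that the OpenMP runtime sees it when it loads.

```python
import os

# Set the limit first, before tesserocr (and the OpenMP runtime behind it) is loaded.
os.environ["OMP_THREAD_LIMIT"] = "1"

from concurrent.futures import ThreadPoolExecutor


def ocr_page(image_path, timeout_ms=5000):
    """Hypothetical helper: OCR one page, or return None if Recognize fails or times out."""
    # Imported here, after OMP_THREAD_LIMIT has been set.
    from tesserocr import PyTessBaseAPI, PSM

    with PyTessBaseAPI(lang="spa", psm=PSM.AUTO_OSD) as api:
        api.SetImageFile(image_path)
        # Recognize takes its timeout in milliseconds and returns a bool.
        if not api.Recognize(timeout_ms):
            return None
        return api.GetUTF8Text()


def process_document(page_paths):
    """The 'bigger function': handles a whole document and calls ocr_page internally."""
    return [ocr_page(path) for path in page_paths]


def process_many(documents, workers=4, process=process_document):
    """Threads are applied to the outer function, not to the Recognize call itself."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(process, documents))
```

Here `process_many([["doc1_p1.jpg", "doc1_p2.jpg"], ["doc2_p1.jpg"]])` would return one list of page texts per document, with None for any page where Recognize gave up. The file names are placeholders.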
