
We are trying to use easyocr instead of (py)tesseract. Its output quality is superb, but we have hit an unusual problem: it does not work when called from a process started with Python's multiprocessing library, but does with torch's multiprocessing. For clarity, here is an example of code that does not work, followed by code that does.

Code that does not work:

import multiprocessing as mp
import easyocr
import cv2

def ocr_test(reader, img):
    result = reader.readtext(img, detail=0)
    print(result)
    
reader = easyocr.Reader(['en'], gpu=False)
img = cv2.imread('./datasets/easyocrtest/sampleocrimage.png')
p = mp.Process(target=ocr_test, args=(reader,img))
p.start()
p.join()

Code that works:

from torch.multiprocessing import Process
import easyocr
import cv2

def ocr_test(reader, img):
    result = reader.readtext(img, detail=0)
    print(result)

reader = easyocr.Reader(['en'], gpu=False)
img = cv2.imread('./datasets/easyocrtest/sampleocrimage.png')
p = Process(target=ocr_test(reader, img))
p.start()
p.join()

Is there something special about easyocr that requires torch's multiprocessing? On the face of it, it does seem so. Or have we missed something obvious? I am unable to attach the image file to make this easy to replicate, but any PNG file that contains text should work. Thanks.
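One detail worth checking: the two snippets differ in more than the import. The second one passes target=ocr_test(reader, img), which calls ocr_test immediately in the parent process and hands its return value (None) to Process, so the child has nothing to run and the printed result comes from the parent. The sketch below illustrates that difference using only the standard library. It does not involve easyocr, and report, build_eager, and build_deferred are hypothetical names introduced purely for the illustration.

```python
import multiprocessing as mp
import os


def report(tag, log):
    """Record which process actually executed the call."""
    log.append((tag, os.getpid()))


def build_eager(log):
    # Same shape as Process(target=ocr_test(reader, img)):
    # report() runs right here, in the calling process, and its
    # return value (None) is what Process receives as the target.
    return mp.Process(target=report("eager", log))


def build_deferred(log):
    # Same shape as Process(target=ocr_test, args=(reader, img)):
    # nothing runs until start(), and then it runs in the child.
    return mp.Process(target=report, args=("deferred", log))


if __name__ == "__main__":
    log = []
    for build in (build_eager, build_deferred):
        p = build(log)
        p.start()
        p.join()
    # Only the eager call appears in the parent's list; the deferred
    # call appended to the child's own copy of `log`.
    print(log, "parent pid:", os.getpid())
```

If this is what is happening, the second snippet is not evidence that torch's multiprocessing behaves differently. A like-for-like comparison would pass target=ocr_test, args=(reader, img) to torch's Process as well.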

user4562262
