Pytesseract image_to_string empty output

Question

I have an image that was cropped from another image, and I want to pass it to the image_to_string method:

import pytesseract
import cv2
num_plate = cv2.imread(r'E:\Images\car_plate222.jpeg', cv2.IMREAD_GRAYSCALE)
pytesseract.pytesseract.tesseract_cmd = r'C:\Program Files\Tesseract-OCR\tesseract.exe'
cv2.dilate(num_plate, (15, 15), num_plate)
pytesseract.image_to_string(num_plate)

Here's the photo: [image: Car Plate]

I used dilation to improve the OCR result, but it doesn't give me the desired output: sometimes I get an empty string and sometimes weird output.

Does anybody know what's wrong?

So there's no actual bug, but rather a poor OCR performance of the function `image_to_string`, is that right? — arnaud, Mar 10 '20 at 10:40
Did you look at a similar question here https://stackoverflow.com/questions/54561913/tesseract-image-to-string-is-empty ? — arnaud, Mar 10 '20 at 10:42
There are multiple parameters you can try, e.g. `--psm`. See https://stackoverflow.com/questions/44619077/pytesseract-ocr-multiple-config-options — arnaud, Mar 10 '20 at 10:45
@Arnaud Thanks for your response. I tried various `--psm` values, but none of them worked, and many times it returned "ili" as the output; I don't know why. And yes, I think this problem is related to OCR quality. — Kasra, Mar 10 '20 at 11:21

score 0 · Answer 1 · edited Apr 24 '20 at 22:08

Threshold the image before passing it to pytesseract. Tesseract works best on clean black-and-white input, so binarizing usually increases the accuracy. Here is a sample:

import cv2
import numpy as np
import pytesseract
from PIL import Image

# Load the image and convert it to grayscale
img = Image.open('E:\\WorkDir\\KAVSEE\\Python\\test.jpg').convert('L')
# Binarize: pixels above 125 become 255 (white), the rest become 0 (black)
_, img = cv2.threshold(np.array(img), 125, 255, cv2.THRESH_BINARY)

# Older versions of pytesseract need a Pillow image,
# so convert back if needed
img = Image.fromarray(img.astype(np.uint8))

print(pytesseract.image_to_string(img))

Hope this helps :)
