I have an image that was cropped from another image, and I want to pass it as input to the image_to_string method:

import cv2
import pytesseract

pytesseract.pytesseract.tesseract_cmd = r'C:\Program Files\Tesseract-OCR\tesseract.exe'

# Raw string so the backslashes in the Windows path are not treated as escapes
num_plate = cv2.imread(r'E:\Images\car_plate222.jpeg', cv2.IMREAD_GRAYSCALE)
cv2.dilate(num_plate, (15, 15), num_plate)
print(pytesseract.image_to_string(num_plate))

Here's the photo: [image: Car Plate]

I used dilation to improve the result, but it doesn't give me the desired output (sometimes I get an empty string and sometimes weird output).

Does anybody know what's wrong?

Talha Rahman
Kasra
  • So there's no actual bug, but rather a poor OCR performance of the function `image_to_string`, is that right? – arnaud Mar 10 '20 at 10:40
  • Did you look at a similar question here https://stackoverflow.com/questions/54561913/tesseract-image-to-string-is-empty ? – arnaud Mar 10 '20 at 10:42
  • There are multiple parameters you can try, e.g. `--psm`. See https://stackoverflow.com/questions/44619077/pytesseract-ocr-multiple-config-options – arnaud Mar 10 '20 at 10:45
  • @Arnaud Thanks for your response. I tried various `--psm` values, but it didn't work out, and many times it returned "ili" as output; I don't know why. And yes, I think this error is related to OCR. – Kasra Mar 10 '20 at 11:21

1 Answer

You should threshold (binarize) the image before passing it to pytesseract; a clean black-and-white image usually improves OCR accuracy. Here is a sample:

import cv2
import numpy as np
import pytesseract
from PIL import Image

# Load the image as grayscale
img = Image.open('E:\\WorkDir\\KAVSEE\\Python\\test.jpg').convert('L')
# Binarize: pixels above 125 become 255 (white), the rest become 0 (black)
_, img = cv2.threshold(np.array(img), 125, 255, cv2.THRESH_BINARY)

# Older versions of pytesseract need a pillow image
# Convert back if needed
img = Image.fromarray(img.astype(np.uint8))

print(pytesseract.image_to_string(img))

Hope this helps :)

amain
mytkavish