Pytesseract with custom font incorrectly classifying numbers

Question

I am trying to detect prices using pytesseract.

However, I am getting very bad results.

I have one large image with several prices in different locations. These locations are constant so I am cropping the image down and saving each area as a new image and then trying to detect the text.

I know the text will only contain 0123456789$¢.

I trained my new font using trainyourtesseract.com.

For example, I take this image.

[image]

Double its size and threshold it to get this.

[image]

Run it through tesseract and get an output of 8.

Any help would be appreciated.

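import os

import cv2
import pytesseract
from PIL import Image

def getnumber(self, img):
    # Convert to grayscale, then apply an inverted binary threshold at 50
    grey = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    thresh, grey = cv2.threshold(grey, 50, 255, cv2.THRESH_BINARY_INV)

    # Write the thresholded image to a temporary file for Tesseract
    filename = "{}.png".format(os.getpid())
    cv2.imwrite(filename, grey)

    text = pytesseract.image_to_string(Image.open(filename), lang='Droid',
                                       config='--psm 13 --oem 3 -c tessedit_char_whitelist=0123456789.$¢')
    os.remove(filename)
    return text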

score 2 · Accepted Answer · answered Feb 11 '20 at 02:29

You're on the right track. When preprocessing an image for OCR, you want the text in black on a white background. The idea is to enlarge the image, apply Otsu's threshold to get a binary image, then perform OCR. We use --psm 6 to tell Tesseract to assume a single uniform block of text; other page segmentation modes and configuration options are available. Here's the processed image:

Result from OCR:

2¢
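
Since the question says the output should only ever contain 0123456789$¢., it can help to validate whatever Tesseract returns before trusting it. The following is only a sketch: the helper name parse_price_cents and the accepted formats ("$1.25", "$3", "2¢") are assumptions for illustration, not something taken from the original post.

```python
import re

# Assumed price formats (hypothetical): "$1.25", "$3", "2¢", "99¢".
_DOLLARS = re.compile(r"\$(\d+)(?:\.(\d{2}))?")
_CENTS = re.compile(r"(\d+)¢")


def parse_price_cents(ocr_text):
    """Return the price in cents, or None if the text does not look like a price."""
    # Tesseract output often ends with whitespace or a newline; drop it.
    text = ocr_text.strip().replace(" ", "")

    m = _DOLLARS.fullmatch(text)
    if m:
        return int(m.group(1)) * 100 + int(m.group(2) or 0)

    m = _CENTS.fullmatch(text)
    if m:
        return int(m.group(1))

    # Anything else (for example a bare "8") is treated as a misread.
    return None
```

A None result is a cue to retry that crop with different preprocessing rather than accept a wrong number.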

Code

import cv2
import pytesseract
import imutils

pytesseract.pytesseract.tesseract_cmd = r"C:\Program Files\Tesseract-OCR\tesseract.exe"

# Resize, grayscale, Otsu's threshold
image = cv2.imread('1.png')
image = imutils.resize(image, width=500)
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]

# Perform text extraction
data = pytesseract.image_to_string(thresh, lang='eng', config='--psm 6')
print(data)

cv2.imshow('thresh', thresh)
cv2.imwrite('thresh.png', thresh)
cv2.waitKey()
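
For intuition about what the cv2.THRESH_OTSU flag in the code above does, here is a rough pure-Python sketch of Otsu's method: it picks the cutoff that best separates the grayscale histogram into two classes by maximizing the between-class variance. This is an illustration, not OpenCV's implementation, and the function names are made up for the example.

```python
def otsu_threshold(pixels):
    """Return the cutoff t (0..255) that maximizes between-class variance.

    `pixels` is a flat iterable of 8-bit grayscale values.
    Pixels <= t form one class, pixels > t the other.
    """
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    total = sum(hist)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0    # number of pixels at or below the candidate cutoff
    sum0 = 0  # intensity sum of those pixels
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue  # no pixels in the lower class yet
        w1 = total - w0
        if w1 == 0:
            break     # every pixel is in the lower class; nothing left to split
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        var_between = w0 * w1 * (mean0 - mean1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


def binarize_inv(pixels, t, maxval=255):
    """Mimic THRESH_BINARY_INV: values above t become 0, the rest maxval."""
    return [0 if p > t else maxval for p in pixels]
```

Because the cutoff is derived from the image itself, there is no need to hand-tune a fixed value such as the 50 used in the question.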

Machine specs:

Windows 10
opencv-python==4.2.0.32
pytesseract==0.2.7
numpy==1.14.5

Perfect thank you, seems to work now. Is 500 pixels wide a good general size or is it just a random value you picked for these images? — Sefton de Pledge, Feb 11 '20 at 05:33
It's a random value; the main point is to enlarge the image to improve detection. In my experience, a width of at least `300` pixels works well. It's just a good general size — nathancy, Feb 11 '20 at 20:41
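
As a sketch of the sizing advice in the comments, the helper below computes a target size that enlarges a crop only when it is narrower than a minimum width, keeping the aspect ratio. The name upscale_size is hypothetical, and the 300 pixel default is just the rule of thumb mentioned above, not a Tesseract requirement.

```python
def upscale_size(width, height, min_width=300):
    """Return (new_width, new_height), enlarging only if width < min_width.

    The aspect ratio is preserved; the height is rounded to the nearest pixel.
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    if width >= min_width:
        return width, height  # already wide enough, leave unchanged
    scale = min_width / width
    return min_width, max(1, round(height * scale))
```

The resulting tuple is in (width, height) order, which is the order cv2.resize expects for its dsize argument, so it could be passed along as cv2.resize(image, upscale_size(w, h)).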
