I am starting to learn OpenCV and Tesseract, and have trouble with what seems to be a very simple example.

Here is an image that I am trying to OCR, that reads "171 m":

original image

I do some preprocessing. Since blue is the dominant color of the text, I extract the blue channel and apply simple thresholding.

img = cv2.imread('171_m.png')[:, :, 0]  # OpenCV loads images as BGR, so channel 0 is blue
_, thresh = cv2.threshold(img, 150, 255, cv2.THRESH_BINARY_INV)

The resulting image looks like this:

blue channel, simple threshold

Then I pass that to Tesseract, with psm 7 for a single line of text:

text = pytesseract.image_to_string(thresh, config='--psm 7')
print(text)
>>> lim

I also tried restricting the possible characters, which gets closer but is still not correct.

text = pytesseract.image_to_string(thresh, config='--psm 7 -c tessedit_char_whitelist=1234567890m')
print(text)
>>> 17m

I am using OpenCV v4.1.1 and Tesseract v5.0.0-alpha.20190708.

Any help appreciated.

nathancy
Anton Babkin

3 Answers

Preprocessing the image before passing it to Pytesseract can help. The desired text should be black on a white background. Here's an approach:

  • Convert image to grayscale and enlarge image
  • Gaussian blur
  • Otsu's threshold
  • Invert image

After converting to grayscale, we enlarge the image using imutils.resize() and apply a Gaussian blur. We then apply Otsu's threshold to get a binary image:

binary image after Otsu's threshold

If you have noisy images, an additional step would be to use morphological operations to smooth the image or remove the noise. Since your image is clean enough, we can simply invert it to get our result:

inverted result

Output from Pytesseract using --psm 6:

171m

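import cv2
import pytesseract
import imutils

# Windows only: point pytesseract at the Tesseract binary (not needed if tesseract is on your PATH)
pytesseract.pytesseract.tesseract_cmd = r"C:\Program Files\Tesseract-OCR\tesseract.exe"

# Load as grayscale, enlarge, and blur
image = cv2.imread('1.png', cv2.IMREAD_GRAYSCALE)
image = imutils.resize(image, width=400)
blur = cv2.GaussianBlur(image, (7, 7), 0)

# Otsu's threshold, then invert the binary image
thresh = cv2.threshold(blur, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)[1]
result = 255 - thresh

# --psm 6 treats the image as a single uniform block of text
data = pytesseract.image_to_string(result, lang='eng', config='--psm 6')
print(data)

cv2.imshow('thresh', thresh)
cv2.imshow('result', result)
cv2.waitKey()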
nathancy

I thought your image was not sharp enough, so I applied the process described in "How do I increase the contrast of an image in Python OpenCV" to first sharpen your image, then extracted the blue channel and ran Tesseract on it.

I hope this helps.

import cv2
import pytesseract 

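img = cv2.imread('test.png')  # test.png is your original image
s = 128
# pass interpolation by keyword; positionally, the third to fifth arguments are dst, fx and fy
img = cv2.resize(img, (s, int(s/2)), interpolation=cv2.INTER_AREA)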

def apply_brightness_contrast(input_img, brightness = 0, contrast = 0):

    if brightness != 0:
        if brightness > 0:
            shadow = brightness
            highlight = 255
        else:
            shadow = 0
            highlight = 255 + brightness
        alpha_b = (highlight - shadow)/255
        gamma_b = shadow

        buf = cv2.addWeighted(input_img, alpha_b, input_img, 0, gamma_b)
    else:
        buf = input_img.copy()

    if contrast != 0:
        f = 131*(contrast + 127)/(127*(131-contrast))
        alpha_c = f
        gamma_c = 127*(1-f)

        buf = cv2.addWeighted(buf, alpha_c, buf, 0, gamma_c)

    return buf

out = apply_brightness_contrast(img,0,64)

b, g, r = cv2.split(out)  # split the channels and use just the blue one

# 255-b is used because the image has a black background and white numbers; 255-b swaps the colors
text = pytesseract.image_to_string(255-b, config='--psm 7 -c tessedit_char_whitelist=1234567890m')
print(text)
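
To see what contrast=64 does numerically, here is a small pure-Python sketch of the same formula. The helper names contrast_params and map_pixel are made up for illustration, and map_pixel only approximates the rounding and saturation that cv2.addWeighted applies to 8-bit images.

```python
def contrast_params(contrast):
    """Return (alpha, gamma) as computed in apply_brightness_contrast."""
    f = 131 * (contrast + 127) / (127 * (131 - contrast))
    return f, 127 * (1 - f)

def map_pixel(value, alpha, gamma):
    """out = alpha * value + gamma, rounded and clipped to 0..255."""
    return int(min(255, max(0, round(alpha * value + gamma))))

alpha, gamma = contrast_params(64)
print(round(alpha, 3), round(gamma, 2))

# Mid-gray (127) stays fixed, darker pixels are pushed toward 0,
# brighter pixels toward 255
for v in (60, 100, 127, 150, 200):
    print(v, "->", map_pixel(v, alpha, gamma))
```

The slope is close to 3, so only a fairly narrow band of intensities around mid-gray survives as intermediate values; everything else saturates to black or white, which is why the text edges look crisper.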
b3rt0
  • With opencv version 4.1 `(s, s/2)` needs to be `(s, int(s/2))`, otherwise gives `TypeError: integer argument expected, got float`. – drec4s Sep 25 '19 at 18:35
  • Thank you for pointing that out, I will edit the answer. Otherwise, does it work on your end? Do you get the correct answer? – b3rt0 Sep 25 '19 at 18:55

Disclaimer: This is not a full solution, just an attempt to partially solve the problem.

This process works only if you know the number of characters in the image beforehand. Here is the trial code:

import cv2
import pytesseract

img0 = cv2.imread('171_m.png', 0)  # 0 loads the image as grayscale
adap_thresh = cv2.adaptiveThreshold(img0, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY, 11, 2)
text_adth = pytesseract.image_to_string(adap_thresh, config='--psm 7')

After adaptive thresholding, the image looks like this:

adaptive_thresholded_image

Pytesseract gives this output:

171 mi.

Now, if you know the number of characters in advance, you can drop the stray space and slice the string read by Pytesseract to get the desired output, '171m'.
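
A minimal sketch of that cleanup step, assuming the only valid characters are digits and "m" and that the expected length (4 here) is known. The function name clean_ocr_text and its parameters are hypothetical, not part of pytesseract.

```python
def clean_ocr_text(raw, allowed="0123456789m", expected_len=None):
    """Keep only the allowed characters, then optionally truncate
    to the number of characters known to be in the image."""
    kept = "".join(ch for ch in raw if ch in allowed)
    return kept if expected_len is None else kept[:expected_len]

print(clean_ocr_text("171 mi.", expected_len=4))
```

Filtering before slicing matters here: slicing the raw string first would keep the space and cut off the "m".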

Arkistarvh Kltzuonstev