
I have the following image

enter image description here

lower = np.array([175, 125, 45], dtype="uint8")
upper = np.array([255, 255, 255], dtype="uint8")

mask = cv2.inRange(image, lower, upper)
img = cv2.bitwise_and(image, image, mask=mask)

plt.figure()
plt.imshow(img)
plt.axis('off')
plt.show()

enter image description here

Now, if I try to convert it to grayscale like this:

gray = cv2.cvtColor(img, cv2.COLOR_RGB2GRAY)

I get this:

enter image description here

And I would like to extract the number on it.

The suggestion:

gray = 255 - gray
emp = np.full_like(gray, 255)
emp -= gray
emp[emp==0] = 255
emp[emp<100] = 0
gauss = cv2.GaussianBlur(emp, (3,3), 1)
gauss[gauss<220] = 0
plt.imshow(gauss)

gives the image:

enter image description here

Then using pytesseract on any of the images:

data = pytesseract.image_to_string(img, config='outputbase digits')

gives:

'\x0c'

Another suggested solution is:

gray = cv2.cvtColor(img, cv2.COLOR_RGB2GRAY)
thr = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV)[1]
txt = pytesseract.image_to_string(thr)
plt.imshow(thr)

enter image description here

And this gives

'\x0c'

Not very satisfying... Does anyone have a better solution, please?

Thanks!

Nicolas Rey
  • Add img = PIL.ImageOps.invert(img) before OCR – Alex Alex Jan 21 '21 at 17:56
  • I tried: img = Image.fromarray(img), img = ImageOps.invert(img), data = pytesseract.image_to_string(img), yields the same result... – Nicolas Rey Jan 21 '21 at 19:22
  • @NicolasRey you are doing the whole process of image processing, then you are passing the **raw** image before processing to the `pytesseract` : `data = pytesseract.image_to_string(img, config='outputbase digits')` replace `img` by `gauss`!!! – Bilal Jan 23 '21 at 12:27

2 Answers


I have a two-step solution: apply a threshold to the image, then read the result with page segmentation mode 7.

When you apply thresholding to the image:

enter image description here

Thresholding is one of the simplest ways to bring out the features of an image.

Now, when we read the text from the thresholded image:

txt = image_to_string(thr, config="--psm 7")
print(txt)

Result will be:

| 1,625 |

Why do we set the page segmentation mode (psm) to 7?

Mode 7 treats the image as a single line of text, which gives an accurate result here.
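
A minimal sketch of how the psm setting reaches Tesseract, assuming pytesseract is installed and the thresholded image is in thr. The helper names build_config and read_single_line are made up for illustration. The character whitelist is an optional extra that is not part of this answer, and whether Tesseract honours it depends on the version and engine in use.

```python
def build_config(psm=7, whitelist=None):
    """Build a Tesseract config string (hypothetical helper, not part of pytesseract)."""
    parts = [f"--psm {psm}"]
    if whitelist:
        # tessedit_char_whitelist restricts the characters Tesseract may output
        parts.append(f"-c tessedit_char_whitelist={whitelist}")
    return " ".join(parts)


def read_single_line(image, whitelist=None):
    """Run OCR on an image assumed to contain a single line of text."""
    # Imported here so the config helper can be used without pytesseract installed
    from pytesseract import image_to_string
    return image_to_string(image, config=build_config(7, whitelist))
```

Here read_single_line(thr) corresponds to the image_to_string call shown earlier, and read_single_line(thr, whitelist="0123456789,") would additionally ask for digits and commas only.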

But we still have to clean up the result, since the current output is | 1,625 |

We should remove the | characters:

print("".join([t for t in txt if t != '|']))

Result:

1,625

Code:


import cv2
from pytesseract import image_to_string

img = cv2.imread("LZ3vi.png")
gry = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
thr = cv2.threshold(gry, 0, 255,
                    cv2.THRESH_BINARY_INV)[1]
txt = image_to_string(thr, config="--psm 7")
print("".join([t for t in txt if t != '|']).strip())
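
The clean-up step can also be written with str.replace, and the cleaned string can be turned into an integer if that is what you need. This is only a sketch: clean_ocr_text and parse_count are hypothetical names, and the sample string is an assumed example of raw Tesseract output (pipes from the box edges plus a trailing form feed).

```python
def clean_ocr_text(txt):
    """Drop the '|' characters left by the box edges and trim surrounding whitespace."""
    # str.strip() also removes the trailing form feed ('\x0c') that Tesseract appends
    return txt.replace("|", "").strip()


def parse_count(txt):
    """Convert cleaned OCR text such as '1,625' to an int (assumes ',' is a thousands separator)."""
    return int(clean_ocr_text(txt).replace(",", ""))


raw = "| 1,625 |\n\x0c"      # assumed sample of raw OCR output
cleaned = clean_ocr_text(raw)
count = parse_count(raw)
```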

Update


how do you get this clean black and white image from my original image?

In three steps:

    1. Read the image using OpenCV's imread function
    • img = cv2.imread("LZ3vi.png")
      
    • The image is now read in BGR channel order (not RGB).

    2. Convert the image to grayscale
    • gry = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
      
    • Result will be:

      • enter image description here
    3. Apply the threshold
    • thr = cv2.threshold(gry, 0, 255, cv2.THRESH_BINARY_INV)[1]
      
    • Result will be:

      • enter image description here
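
To see why a threshold of 0 gives such a clean result, here is a NumPy-only sketch of what THRESH_BINARY_INV computes: every pixel above the threshold becomes 0 and every other pixel becomes the maximum value. The function name binary_inv_threshold and the tiny sample array are made up for illustration; in practice cv2.threshold does this for you.

```python
import numpy as np


def binary_inv_threshold(gray, thresh=0, maxval=255):
    """NumPy equivalent of an inverted binary threshold (illustrative helper)."""
    # Pixels brighter than `thresh` turn black, the rest turn white
    return np.where(gray > thresh, 0, maxval).astype(np.uint8)


# Assumed sample: 0 is the masked-out background, non-zero values are the digits
gray = np.array([[0, 12, 200],
                 [0, 0, 255]], dtype=np.uint8)
thr = binary_inv_threshold(gray)
```

With thresh=0, only pure-black pixels end up white, so the digits come out black on a white background.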

If you are wondering about thresholding, read about simple thresholding.

All my filters, grayscale... get weird colored images

The reason is that, when displaying a grayscale image with pyplot, you need to set the colormap (cmap) to gray:

plt.imshow(img, cmap='gray')
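
A small sketch of the difference, assuming Matplotlib and NumPy are installed. The gradient array is synthetic and the Agg backend is only chosen so that the example runs without a display. For a 2D array, imshow maps values through the default colormap (viridis in current Matplotlib), which is where the unexpected colors come from.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Synthetic single-channel image: a 4x4 gradient from black to white
gray = np.linspace(0, 255, 16, dtype=np.uint8).reshape(4, 4)

fig, (ax_default, ax_gray) = plt.subplots(1, 2)
im_default = ax_default.imshow(gray)                        # default colormap
im_gray = ax_gray.imshow(gray, cmap="gray", vmin=0, vmax=255)  # true grayscale

default_name = im_default.get_cmap().name
gray_name = im_gray.get_cmap().name
plt.close(fig)
```

Passing vmin=0 and vmax=255 is optional; it keeps the gray levels from being stretched to the range of the data.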

You can read the other types here

Ahmet

Two issues prevented pytesseract from detecting your number:

  1. The white rectangle around the number (inverting and filling is the solution).
  2. The noise in the shape of the digits (Gaussian smoothing dealt with that).

The solution that Alex Alex proposed works if it is followed by a Gaussian filter:

output: 1,625

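import numpy as np
import pytesseract
import cv2

# OpenCV loads images as BGR; convert to RGB so the bounds below are in R, G, B order
BGR = cv2.imread('11.png')
RGB = cv2.cvtColor(BGR, cv2.COLOR_BGR2RGB)

# Keep only the pixels inside [lower, upper]; everything else becomes black
lower = np.array([175, 125, 45], dtype="uint8")
upper = np.array([255, 255, 255], dtype="uint8")

mask = cv2.inRange(RGB, lower, upper)
img = cv2.bitwise_and(RGB, RGB, mask=mask)

gray = cv2.cvtColor(img, cv2.COLOR_RGB2GRAY)

# Invert, then subtract from an all-white image (this restores the original gray levels)
gray = 255 - gray
emp = np.full_like(gray, 255)
emp -= gray

# Fill the black, masked-out background with white, then set dark pixels to black
emp[emp==0] = 255
emp[emp<100] = 0

# Smooth the noise, then zero out everything that is not close to white
gauss = cv2.GaussianBlur(emp, (3,3), 1)
gauss[gauss<220] = 0

# Run OCR on the processed image (gauss), not on the original img
text = pytesseract.image_to_string(gauss, config='outputbase digits')

print(text)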
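
To make the invert-and-fill step easier to follow, here is a NumPy-only sketch that runs the same array operations on a tiny made-up image. The function name fill_and_binarize and the sample values are assumptions for illustration, and the Gaussian blur is left out because it needs OpenCV.

```python
import numpy as np


def fill_and_binarize(gray, dark_limit=100):
    """Apply the invert-and-fill steps to a uint8 grayscale array (illustrative helper)."""
    inverted = 255 - gray            # first inversion
    emp = np.full_like(gray, 255)
    emp -= inverted                  # second inversion: back to the original gray levels
    emp[emp == 0] = 255              # masked-out (black) background becomes white
    emp[emp < dark_limit] = 0        # remaining dark pixels become pure black
    return emp


# Assumed sample: 0 = masked-out background, other values = pixels kept by the mask
gray = np.array([[0, 0, 0],
                 [0, 60, 200],
                 [0, 230, 0]], dtype=np.uint8)
result = fill_and_binarize(gray)
```

The two inversions cancel each other, so the net effect is the fill (0 becomes 255) followed by pushing values below 100 down to 0.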

Bilal
  • @NicolasRey because you didn't provide the original image before, just the grayscale image! you can check the update now. – Bilal Jan 21 '21 at 20:30
  • I don't get why but I still do not manage to get your result even with the update. Do you get the same colored images I get after every filter? I have updated my result – Nicolas Rey Jan 23 '21 at 12:06
  • @NicolasRey I'm using this [image](https://i.stack.imgur.com/WOT6f.png) that you have provided as an input with this [code](https://stackoverflow.com/a/65834400), if you are using different image or you are using your own code I'm not sure what the result will be! – Bilal Jan 23 '21 at 12:29