
I am really new to opencv and a beginner to python.

I have this image:

original bmp 24bit image

I want to somehow apply proper thresholding to keep nothing but the 6 digits.

The bigger picture is that I intend to try to perform manual OCR to the image for each digit separately, using the k-nearest neighbours algorithm on a per digit level (kNearest.findNearest)

The problem is that I cannot clean up the digits sufficiently, especially the '7' digit which has this blue-ish watermark passing through it.

The steps I have tried so far are the following:

I am reading the image from disk

# IMREAD_UNCHANGED is -1
image = cv2.imread(sys.argv[1], cv2.IMREAD_UNCHANGED)

Then I'm keeping only the blue channel to get rid of the blue watermark around digit '7', effectively converting it to a single channel image

image = image[:,:,0]
# opened with IMREAD_UNCHANGED (-1), i.e. as-is,
# so the blue channel is the first in BGR

single channel - red only - image

Then I'm multiplying it a bit to increase contrast between the digits and the background:

image = cv2.multiply(image, 1.5)

multiplied image to increase contrast

Finally I perform Binary+Otsu thresholding:

_,thressed1 = cv2.threshold(image,0,255,cv2.THRESH_BINARY+cv2.THRESH_OTSU)

binary Otsu thresholded image

As you can see the end result is pretty good except for the digit '7' which has kept a lot of noise.

How to improve the end result? Please supply the image example result where possible, it is better to understand than just code snippets alone.

James Z
Nikolas Kazepis
  • "so the red channel is the first in RGB".... OpenCV [`imread`](https://docs.opencv.org/3.4.3/d4/da8/group__imgcodecs.html#ga288b8b3da0892bd651fce07b3bbd3a56) uses BGR order by default, so you're actually using the blue channel, not red. – Dan Mašek Sep 22 '18 at 18:52
  • You are right, using the blue channel only seems to get rid of the blue watermark more effectively. I have corrected the comment in the code. Thank you – Nikolas Kazepis Sep 22 '18 at 19:19

4 Answers


You can try to medianBlur the grayscale image with two different kernel sizes (such as 3 and 51), divide the two blurred results, and threshold the quotient. Something like this:



#!/usr/bin/python3
# 2018/09/23 17:29 (CST)
# (Happy Mid-Autumn Festival)

import cv2
import numpy as np

fname = "color.png"
# Read the image and keep only the blue channel (BGR order)
bgray = cv2.imread(fname)[...,0]

# Median-blur with a small and a large kernel
blurred1 = cv2.medianBlur(bgray, 3)
blurred2 = cv2.medianBlur(bgray, 51)

# Divide the lightly blurred image by the heavily blurred one
# to flatten the uneven background, then normalize to 8-bit range
divided = np.ma.divide(blurred1, blurred2).data
normed = np.uint8(255*divided/divided.max())

# Otsu-threshold the normalized quotient
th, threshed = cv2.threshold(normed, 100, 255, cv2.THRESH_OTSU)

# Stack the intermediate stages vertically for inspection
dst = np.vstack((bgray, blurred1, blurred2, normed, threshed))
cv2.imwrite("dst.png", dst)

The result:


Kinght 金
  • Excellent results and simple method. +1 – alkasm Sep 23 '18 at 08:45
  • @Silencer could you please elaborate a bit on what you mean by "divide the blurred results"? Maybe a code snippet? I am pretty new to this and I have yet to get a solid grasp on opencv concepts. Do you divide the blurred result by the original image? Do you divide more than two images at a time? How? Sorry for the dumb questions. Thank you in advance! – Nikolas Kazepis Sep 23 '18 at 09:15
  • Silencer that indeed did the trick even for some harder cases. Thank you very much! I have been trying (for months, on and off) a hell of a lot different approaches such as Bradley Roth thresholding with no success or with worse end results, way worse than yours. So, this is a sincere thank you, you are the MVP! Kudos to @Yves too because from what I can see he is suggesting the same approach but Silencer, sir, you took the time to provide better results and a code snippet. For all these, I am grateful! +1 ! – Nikolas Kazepis Sep 23 '18 at 18:14

Why not just keep values in the image that are above a certain threshold?

Like this:

import cv2
import numpy as np

img = cv2.imread("./a.png")[:,:,0]  # the last readable image

# Zero out everything below the threshold, set the rest to 255
# (vectorized; equivalent to the original per-pixel loop but much faster)
new_img = np.where(img < 100, 0, 255).astype(np.uint8)

cv2.imwrite("./b.png", new_img)

Looks great:

You could probably play with the threshold even more and get better results.

Plutoberth
  • If the solution needs to be automatic (as is often the case), this can't be used. And there are many situations where a global threshold doesn't work. –  Sep 22 '18 at 20:50
  • That's a good point. However, I checked, and even multiplying the image times 2 (again), produces a rather favorable result: https://toast-for.life/iEm9O.png – Plutoberth Sep 22 '18 at 20:56
  • Well, what do you think would be a better idea, then? – Plutoberth Sep 22 '18 at 21:03
  • @Yves Daoust right, yours is definitely the better solution. I should probably only stick to threads I'm 100% confident about. – Plutoberth Sep 22 '18 at 21:26
  • @Plutoberth unfortunately this needs to be an automated procedure, and while playing with the threshold makes things better for some images, it makes things a lot worse for others. Maybe I can post a couple more examples later tonight to show what I mean. Thank you anyway! – Nikolas Kazepis Sep 23 '18 at 09:19

It doesn't seem easy to completely remove the annoying stamp.

What you can do is flattening the background intensity by

  • computing a lowpass image (Gaussian filter, morphological closing); the filter size should be a little larger than the character size;

  • dividing the original image by the lowpass image.

Then you can use Otsu.


As you see, the result isn't perfect.

  • Could you please provide a code snippet on how to generate the lowpass image and then on how to divide? Is this answer related to the above one by @Silencer? – Nikolas Kazepis Sep 23 '18 at 09:17

I tried a slightly different approach than Yves on the blue channel:

Blue channel

  • Apply median filter (r=2):

Filtered image

  • Use Edge detection (e.g. Sobel operator):

Edges detected

  • Automatic thresholding (Otsu)

Thresholded image

  • Closing of the image

Closed image

This approach seems to make the output a little less noisy. However, one has to address the holes in the numbers. This can be done by detecting black contours which are completely surrounded by white pixels and simply filling them with white.

SilverMonkey