How to binarize an image when it has white text on a black background, or vice versa?

Question

I want to binarize an image for OCR. The code below takes image data as input and returns a binary image, and this method works for most images.

For example:

Original:

Original Image Sample

Result:

Binarized Image of Sample

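import cv2
import numpy as np

def preprocessing(image):
    # Grayscale, then a small blur (denoised text) and a large blur (background estimate).
    image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    blurred1 = cv2.medianBlur(image, 3)
    blurred2 = cv2.medianBlur(image, 51)
    # Divide out the background to flatten uneven lighting; np.ma.divide masks zero divisors.
    divided = np.ma.divide(blurred1, blurred2).data
    normed = np.uint8(255 * divided / divided.max())
    # With THRESH_OTSU the threshold is picked automatically, so the 100 is ignored.
    th, image = cv2.threshold(normed, 100, 255, cv2.THRESH_BINARY | cv2.THRESH_OTSU)
    # Erode then dilate (a morphological opening) to remove small bright specks.
    kernel = np.ones((3, 3), np.uint8)
    image = cv2.erode(image, kernel)
    image = cv2.dilate(image, kernel)
    return image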

But when I apply the same method to the images attached below, it does not work as expected. It should produce an image with readable text that can be used as Tesseract input.

Original Image 1:

Original Image 1

Pre processed image:

Pre processed image

Original Image 2:

Original Image 2

Pre processed image:

Pre processed image

You need to visualize the intermediate output images for every step in `preprocessing`. Analyze each result and come up with a general approach — Jeru Luke, Jun 09 '22 at 06:57

score 0 · Accepted Answer · answered Jun 09 '22 at 06:32

You should probably try to analyze the image yourself. I think the Bradley-Roth algorithm (see "Bradley-Roth Adaptive Thresholding Algorithm - How do I get better performance?") could help you with a slight modification: if the neighborhood is brighter than 128, highlight what is darker than it; if the neighborhood is darker than 128, highlight what is lighter.
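A minimal sketch of that modification, not a definitive implementation. It assumes a grayscale `uint8` input and uses a NumPy integral image for the local mean. The names `local_mean` and `polarity_aware_threshold`, the `window` and `t` defaults, and the choice to mirror the Bradley-Roth rule on the inverted image in dark neighborhoods are all assumptions made for illustration. Text comes out black on a white background.

```python
import numpy as np

def local_mean(gray, window):
    """Mean of each pixel's window x window neighborhood, via an integral image."""
    h, w = gray.shape
    r = window // 2
    # Integral image with a zero row/column prepended so box sums need no special cases.
    integral = np.zeros((h + 1, w + 1), dtype=np.float64)
    integral[1:, 1:] = gray.astype(np.float64).cumsum(axis=0).cumsum(axis=1)
    # Window bounds for every row and column, clipped at the image borders.
    y0 = np.clip(np.arange(h) - r, 0, h)
    y1 = np.clip(np.arange(h) + r + 1, 0, h)
    x0 = np.clip(np.arange(w) - r, 0, w)
    x1 = np.clip(np.arange(w) + r + 1, 0, w)
    box_sum = (integral[y1][:, x1] - integral[y0][:, x1]
               - integral[y1][:, x0] + integral[y0][:, x0])
    area = (y1 - y0)[:, None] * (x1 - x0)[None, :]
    return box_sum / area

def polarity_aware_threshold(gray, window=31, t=0.15, mid=128):
    """Bradley-Roth style threshold that flips polarity per neighborhood.

    Assumed parameters: `window` should be larger than the stroke width,
    `t` is the relative margin, `mid` is the bright/dark split (128 in the answer).
    """
    gray = gray.astype(np.float64)
    mean = local_mean(gray, window)
    bright_neighborhood = mean > mid
    # Bright neighborhood: a pixel is text if it is clearly darker than the local mean.
    dark_text = gray < mean * (1.0 - t)
    # Dark neighborhood: apply the same rule to the inverted image (assumption),
    # so a pixel is text if it is clearly lighter than the local mean.
    light_text = (255.0 - gray) < (255.0 - mean) * (1.0 - t)
    text = np.where(bright_neighborhood, dark_text, light_text)
    # Normalize the output: text is black (0), everything else white (255).
    return np.where(text, 0, 255).astype(np.uint8)
```

It could be called as `polarity_aware_threshold(cv2.cvtColor(image, cv2.COLOR_BGR2GRAY))`. Both `window` and `t` will likely need tuning per input, and neighborhoods that straddle a boundary between light and dark regions may still be misclassified.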

Lem0nti


Thanks, it works for my use case. I had to tweak the threshold value according to my input, and it works much better than the previous logic. – Yash Mistry Jun 10 '22 at 05:36
