I want to binarize an image for OCR. The code below takes image data as input and returns a binary image, and it works for most images.

For example:

  • Original:

Original Image Sample

  • Result:

Binarized Image of Sample

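import cv2
import numpy as np


def preprocessing(image):
    # Grayscale, then a small blur (detail) and a large blur (background estimate).
    image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    blured1 = cv2.medianBlur(image, 3)
    blured2 = cv2.medianBlur(image, 51)
    # Divide out the background; masked division avoids errors where blured2 is 0.
    divided = np.ma.divide(blured1, blured2).data
    normed = np.uint8(255 * divided / divided.max())
    # With THRESH_OTSU the threshold is computed automatically; the 100 is ignored.
    th, image = cv2.threshold(normed, 100, 255, cv2.THRESH_BINARY | cv2.THRESH_OTSU)
    # Erode then dilate (morphological opening) to remove small white specks.
    kernel = np.ones((3, 3), np.uint8)
    image = cv2.erode(image, kernel)
    image = cv2.dilate(image, kernel)
    return image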

But when I apply the same method to the images below, it does not work as expected. It should produce an image whose text Tesseract can read.

  • Original Image 1:

Original Image 1

  • Preprocessed image:

Preprocessed image

  • Original Image 2:

Original Image 2

  • Preprocessed image:

Preprocessed image

  • You need to visualize the intermediate output images for every step in `preprocessing`. Analyze each result and come up with a general approach – Jeru Luke Jun 09 '22 at 06:57

1 Answer

You should probably try to break the image down and analyze it yourself. I think the Bradley-Roth algorithm (see "Bradley-Roth Adaptive Thresholding Algorithm - How do I get better performance?") could help you with a slight modification: if the neighborhood is brighter than 128, highlight the pixels that are darker than it; if the neighborhood is darker than 128, highlight the pixels that are lighter than it.
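
A minimal sketch of that modification, using only NumPy and an integral image for the local mean, might look like the following. The function name `bradley_roth_bipolar` and the `window`, `t`, and `pivot` defaults are assumptions chosen for illustration, and the `1 + t` rule for dark neighborhoods is one possible reading of "what is lighter is highlighted", not a standard part of Bradley-Roth.

```python
import numpy as np


def bradley_roth_bipolar(gray, window=31, t=0.15, pivot=128):
    """Bradley-Roth style thresholding with a polarity switch (sketch).

    gray: 2-D uint8 array. Returns uint8 with text as 0, background as 255.
    window, t and pivot are illustrative defaults, not tuned values.
    """
    gray = np.asarray(gray, dtype=np.float64)
    h, w = gray.shape
    half = window // 2

    # Integral image with a leading zero row/column: any box sum is 4 lookups.
    integral = np.zeros((h + 1, w + 1), dtype=np.float64)
    integral[1:, 1:] = gray.cumsum(axis=0).cumsum(axis=1)

    # Window bounds per pixel, clipped at the image borders.
    ys, xs = np.arange(h), np.arange(w)
    y0 = np.clip(ys - half, 0, h)[:, None]
    y1 = np.clip(ys + half + 1, 0, h)[:, None]
    x0 = np.clip(xs - half, 0, w)[None, :]
    x1 = np.clip(xs + half + 1, 0, w)[None, :]

    area = (y1 - y0) * (x1 - x0)
    box_sum = (integral[y1, x1] - integral[y0, x1]
               - integral[y1, x0] + integral[y0, x0])
    local_mean = box_sum / area

    # Bright neighborhood: mark pixels clearly darker than the local mean.
    dark_on_light = (local_mean >= pivot) & (gray < local_mean * (1.0 - t))
    # Dark neighborhood (assumed rule): mark pixels clearly lighter than it.
    light_on_dark = (local_mean < pivot) & (gray > local_mean * (1.0 + t))

    return np.where(dark_on_light | light_on_dark, 0, 255).astype(np.uint8)
```

The function expects a grayscale image, for example the output of `cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)`. Pixels near a sharp boundary between light and dark regions can be marked as text because the local mean there mixes both sides, so `window` and `t` will likely need tuning per input.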

Lem0nti
  • Thanks, it works for my use case. I had to tweak the threshold value for my input, and it works much better than the previous logic. – Yash Mistry Jun 10 '22 at 05:36