I have an image from which I want to extract each character individually.

I want something like THIS OUTPUT, and so on for the remaining characters.

What would be an appropriate approach to do this using OpenCV and Python?

Jeru Luke
Bits

2 Answers

A short addition to Amitay's awesome answer. You should negate the image using

cv2.THRESH_BINARY_INV

to capture black letters on white paper.

Another idea is to use the MSER blob detector, like this:

import cv2

img = cv2.imread('path to image')
(h, w) = img.shape[:2]
image_size = h * w
mser = cv2.MSER_create()
mser.setMaxArea(image_size // 2)  # the area limits must be integers
mser.setMinArea(10)

gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)  # convert to grayscale
_, bw = cv2.threshold(gray, 0.0, 255.0, cv2.THRESH_BINARY | cv2.THRESH_OTSU)

# detectRegions returns the point sets and their bounding boxes
regions, rects = mser.detectRegions(bw)

# With the rects you can e.g. crop the letters
for (x, y, w, h) in rects:
    cv2.rectangle(img, (x, y), (x+w, y+h), color=(255, 0, 255), thickness=1)

This also detects every letter in full.
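To make the "crop the letters" step concrete, here is a minimal sketch. The helper name `crop_regions` and its `pad` parameter are made up for illustration and are not part of OpenCV. It assumes `rects` holds `(x, y, w, h)` boxes like the ones returned by `detectRegions`, drops exact duplicates (MSER tends to report the same box more than once), and sorts the boxes by their left edge, which only approximates reading order for a single line of text.

```python
import numpy as np


def crop_regions(image, rects, pad=0):
    """Crop each (x, y, w, h) box out of image (hypothetical helper).

    Exact duplicate boxes are dropped and the rest are sorted by x, then y.
    """
    img_h, img_w = image.shape[:2]
    # Convert to plain int tuples so the boxes can be deduplicated and sorted.
    unique = sorted({tuple(int(v) for v in r) for r in rects})
    crops = []
    for x, y, w, h in unique:
        # Clamp the padded box to the image borders.
        x0, y0 = max(x - pad, 0), max(y - pad, 0)
        x1, y1 = min(x + w + pad, img_w), min(y + h + pad, img_h)
        crops.append(image[y0:y1, x0:x1].copy())
    return crops


# Stand-in data: a blank image and three boxes, one of them a duplicate.
demo = np.zeros((10, 12), dtype=np.uint8)
demo_rects = np.array([[5, 2, 3, 4], [1, 1, 2, 2], [5, 2, 3, 4]])
demo_crops = crop_regions(demo, demo_rects)
print([c.shape for c in demo_crops])
```

If you draw the rectangles on `img` as in the snippet above, crop from an untouched copy of the image so the magenta outlines do not end up inside the letter crops.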

Zoe
crazzle
  • Your idea is great and it works in most cases, but sometimes it detects two characters as one. Do you know a way to optimize it to get a perfect character segmentation? – t2t Nov 19 '19 at 10:21
  • Besides tweaking the MSER parameters, you can use dilate + erode to increase the gap (use it on a mask and crop from the original image afterwards). Sorry for the late reply though. – crazzle Feb 17 '20 at 08:17
  • In very difficult scenarios (a lot of noise in the image) this does not work very well with cv2. I'm going to build my own model to separate the chars. – t2t Feb 17 '20 at 16:31
You can do the following (OpenCV 3.0 and above):

  1. Run Otsu thresholding on the image (http://docs.opencv.org/3.2.0/d7/d4d/tutorial_py_thresholding.html).
  2. Run connected component labeling with stats on the thresholded image (How to use openCV's connected components with stats in python?).
  3. For each connected component, take the bounding box from the stats you got in step 2, which hold the following information for each component: cv2.CC_STAT_LEFT, cv2.CC_STAT_TOP, cv2.CC_STAT_WIDTH, cv2.CC_STAT_HEIGHT.
  4. Using the bounding box, crop the component from the original image.
Community
Amitay Nachmani