Trying to segment characters and save them to image files in order, but the contours come out in a different order

Question

This is the image input.

I am using OpenCV with Python. After some pre-processing, I found the contours using

contours,hierarchy = cv2.findContours(thresh,cv2.RETR_LIST,cv2.CHAIN_APPROX_SIMPLE)

Then I did the following to save each character:

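import cv2
from PIL import Image

img1 = cv2.imread("test26.png")
nu = 1
fin = "final"
for cnt in contours:
    x, y, w, h = cv2.boundingRect(cnt)
    img2 = img1[y:y+h, x:x+w]
    # OpenCV stores pixels as BGR; convert to RGB before handing the crop to PIL
    img3 = Image.fromarray(cv2.cvtColor(img2, cv2.COLOR_BGR2RGB))
    filename = fin + str(nu) + ".png"
    nu = nu + 1
    img3.save(filename)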

But the characters are saved in a tree-like order, and I don't understand that order.

My intention is to extract the characters one by one, OCR them in order, and save the result as text.

As stated in the answer section, use the centroid of each contour; that way you can maintain the order of your characters. – Jeru Luke, Mar 24 '17 at 07:02

score 2 · Answer 1 · answered Mar 24 '17 at 06:23


You can find the location of each letter by using the center of its contour.

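for cnt in contours:
    M = cv2.moments(cnt)  # moments are computed per contour, not for the whole list
    if M["m00"] == 0:     # zero-area contour: the centroid is undefined, skip it
        continue
    cX = int(M["m10"] / M["m00"])
    cY = int(M["m01"] / M["m00"])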

Then you can determine the order of the characters from cX and cY (if there is only one line of text, cX alone is enough).
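The ordering step for more than one line of text can be sketched as follows. This is only an illustration under stated assumptions: it uses the centers of the bounding boxes (x, y, w, h) as a stand-in for the moment-based centroids, and the function name `sort_reading_order` and the `row_tolerance` parameter are hypothetical, not part of OpenCV.

```python
def sort_reading_order(boxes, row_tolerance=0.5):
    """Order (x, y, w, h) boxes line by line, left to right within a line.

    row_tolerance is a hypothetical tuning knob: a box joins the current
    line if its vertical center lies within row_tolerance * height of the
    first box of that line.
    """
    rows = []  # each entry is the list of boxes on one text line
    # Visit the boxes from top to bottom by vertical center (cY).
    for box in sorted(boxes, key=lambda b: b[1] + b[3] / 2.0):
        cy = box[1] + box[3] / 2.0
        if rows:
            ref = rows[-1][0]  # first box placed on the current line
            ref_cy = ref[1] + ref[3] / 2.0
            if abs(cy - ref_cy) <= row_tolerance * ref[3]:
                rows[-1].append(box)
                continue
        rows.append([box])

    ordered = []
    for row in rows:
        # Within one line, sort by horizontal center (cX).
        ordered.extend(sorted(row, key=lambda b: b[0] + b[2] / 2.0))
    return ordered


if __name__ == "__main__":
    sample = [(50, 10, 10, 20), (10, 12, 10, 20), (30, 60, 10, 20), (5, 58, 10, 20)]
    for position, box in enumerate(sort_reading_order(sample), start=1):
        print(position, box)
```

The boxes would come from cv2.boundingRect(cnt) for each contour. The tolerance is measured against the first box of each line, so a very small first box (a dot or a minus sign, say) may split a line; the value probably needs tuning per image.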

answered Mar 24 '17 at 06:23

Ibrahim


The idea is good, but I am saving the letters under the names final1, final2, ... – Aswani KV Mar 25 '17 at 11:00
As the letters are detected in random order, how can I compare them and find the correct position? – Aswani KV Mar 25 '17 at 11:01
What do you think about using an OCR tool? You can use Tesseract OCR with Python. To set up Tesseract via Homebrew (I tried this on macOS): 1. install Homebrew: ruby -e "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install)" 2. install Tesseract with Homebrew: brew install tesseract 3. install a Python wrapper for Tesseract OCR, for example pytesseract: pip install pytesseract – Ibrahim Mar 27 '17 at 06:14

score 0 · Answer 2 · answered Mar 29 '17 at 16:12

This code sorts the bounding boxes from left to right and achieves what was probably intended, doesn't it?

import cv2

strFormula = "1!((x+1)*(x+2))"  # '!' stands in for a character that is not allowed in a file name
img = cv2.imread("test26.png")
imgGray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)  # cv2.imread returns BGR, not RGB
ret, imgThresh = cv2.threshold(imgGray, 127, 255, cv2.THRESH_BINARY)

# findContours returns (contours, hierarchy) in OpenCV 2.x and 4.x, but
# (image, contours, hierarchy) in 3.x; index [-2] is the contour list in all of them.
contours = cv2.findContours(imgThresh, cv2.RETR_LIST, cv2.CHAIN_APPROX_SIMPLE)[-2]

# The (x, y, w, h) tuples sort by x first, i.e. from left to right.
lstBoundingBoxes = sorted(cv2.boundingRect(cnt) for cnt in contours)

# Skip the first element (its 'bounding box' is the entire image).
for charNo, (x, y, w, h) in enumerate(lstBoundingBoxes[1:], start=1):
    fName = "charAtPosNo-" + str(charNo).zfill(2) + "_is_[ " + strFormula[charNo-1] + " ]" + ".png"
    cv2.imwrite(fName, img[y:y+h, x:x+w])
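Skipping the first sorted element assumes that the whole-image box always sorts first. A minimal sketch of a size-based filter instead, assuming the boxes are the (x, y, w, h) tuples returned by cv2.boundingRect; the names `character_boxes` and `output_name` and the `min_area` parameter are hypothetical helpers, not part of OpenCV.

```python
def character_boxes(boxes, img_w, img_h, min_area=4):
    """Drop the box spanning the whole image and tiny noise boxes,
    then return the rest sorted left to right.

    min_area is a hypothetical noise threshold in pixels.
    """
    kept = [
        (x, y, w, h)
        for (x, y, w, h) in boxes
        if not (w >= img_w and h >= img_h) and w * h >= min_area
    ]
    # Tuples compare element by element, so this sorts by x first.
    return sorted(kept)


def output_name(position, prefix="final"):
    """Zero-padded file name so that alphabetical order matches reading order."""
    return "%s%02d.png" % (prefix, position)


if __name__ == "__main__":
    boxes = [(0, 0, 100, 40), (60, 5, 10, 20), (20, 5, 10, 20), (40, 8, 1, 1)]
    for position, box in enumerate(character_boxes(boxes, 100, 40), start=1):
        print(output_name(position), box)
```

With a real image, img_h and img_w would be taken from img.shape[:2]. The zero padding keeps final02.png ahead of final10.png when the files are listed alphabetically.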
