Python opencv sorting contours

Question

I am following this question:

How can I sort contours from left to right and top to bottom?

to sort contours from left-to-right and top-to-bottom. However, my contours are found using this (OpenCV 3):

im2, contours, hierarchy = cv2.findContours(threshold,cv2.RETR_EXTERNAL,cv2.CHAIN_APPROX_SIMPLE)

and they are formatted like this:

   array([[[ 1,  1]],
          [[ 1, 36]],
          [[63, 36]],
          [[64, 35]],
          [[88, 35]],
          [[89, 34]],
          [[94, 34]],
          [[94,  1]]], dtype=int32)]

When I run the code

max_width = max(contours, key=lambda r: r[0] + r[2])[0]
max_height = max(contours, key=lambda r: r[3])[3]
nearest = max_height * 1.4
contours.sort(key=lambda r: (int(nearest * round(float(r[1])/nearest)) * max_width + r[0]))

I am getting the error

ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

so I changed it to this:

max_width = max(contours, key=lambda r:  np.max(r[0] + r[2]))[0]
max_height = max(contours, key=lambda r:  np.max(r[3]))[3]
nearest = max_height * 1.4
contours.sort(key=lambda r: (int(nearest * round(float(r[1])/nearest)) * max_width + r[0]))

but now I am getting the error:

TypeError: only length-1 arrays can be converted to Python scalars

EDIT:

After reading the answer below I modified my code:

EDIT 2

This is the code that I use to "dilate" the characters and find the contours

kernel = cv2.getStructuringElement(cv2.MORPH_RECT,(35,35))

# dilate the image to get text
# binaryContour is just the black and white image shown below
dilation = cv2.dilate(binaryContour,kernel,iterations = 2)

END OF EDIT 2

im2, contours, hierarchy = cv2.findContours(dilation,cv2.RETR_EXTERNAL,cv2.CHAIN_APPROX_SIMPLE)

myContours = []

# Process the raw contours to get bounding rectangles
for cnt in reversed(contours):

    epsilon = 0.1*cv2.arcLength(cnt,True)
    approx = cv2.approxPolyDP(cnt,epsilon,True)

    if len(approx == 4):

        rectangle = cv2.boundingRect(cnt)
        myContours.append(rectangle)

max_width = max(myContours, key=lambda r: r[0] + r[2])[0]
max_height = max(myContours, key=lambda r: r[3])[3]
nearest = max_height * 1.4
myContours.sort(key=lambda r: (int(nearest * round(float(r[1])/nearest)) * max_width + r[0]))

i=0
for x,y,w,h in myContours:

    letter = binaryContour[y:y+h, x:x+w]
    cv2.rectangle(binaryContour,(x,y),(x+w,y+h),(255,255,255),2)
    cv2.imwrite("pictures/"+str(i)+'.png', letter) # save contour to file
    i+=1

Contours before sorting:

[(1, 1, 94, 36), (460, 223, 914, 427), (888, 722, 739, 239), (35, 723, 522, 228),
(889, 1027, 242, 417), (70, 1028, 693, 423), (1138, 1028, 567, 643),
(781, 1030, 98, 413), (497, 1527, 303, 132), (892, 1527, 168, 130),
(37, 1719, 592, 130), (676, 1721, 413, 129), (1181, 1723, 206, 128),
(30, 1925, 997, 236), (1038, 1929, 170, 129), (140, 2232, 1285, 436)]

Contours after sorting:

(NOTE: This is not the order I want the contours to be sorted in. Refer to image at the bottom)

[(1, 1, 94, 36), (460, 223, 914, 427), (35, 723, 522, 228), (70, 1028, 693, 423),
(781, 1030, 98, 413), (888, 722, 739, 239), (889, 1027, 242, 417),
(1138, 1028, 567, 643), (30, 1925, 997, 236), (37, 1719, 592, 130),
(140, 2232, 1285, 436), (497, 1527, 303, 132), (676, 1721, 413, 129),
(892, 1527, 168, 130), (1038, 1929, 170, 129), (1181, 1723, 206, 128)]

Image I am working with

I want to find the contours in the following order:

Dilation image used for finding contours

Can you explain your goal? What do you need in the final output? On what basis do you want to find the contours: area, origin position, or some other criterion? – ZdaR, Sep 12 '16 at 06:24
I have uploaded another image to show the order I want the contours sorted in. I just need to sort the contours by position and save them to file in that order. – , Sep 12 '16 at 07:39

ZdaR · Answer 1 · 2016-09-12T09:25:55.093

37

What you actually need is a formula that converts each contour into a rank, and then to sort the contours by that rank. Since you need to sort the contours from top to bottom and left to right, the formula must use the origin of each contour (the top-left corner of its bounding rectangle). For example, we can use this simple method:

def get_contour_precedence(contour, cols):
    origin = cv2.boundingRect(contour)
    return origin[1] * cols + origin[0]

It gives each contour a rank based on its origin. The rank changes by a large amount (a multiple of cols) when two contours sit at different heights, but only slightly when they sit side by side in the same row. So the contours are first ordered from top to bottom, and within a row the smaller x term breaks the tie, giving a left-to-right order.
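To make the role of cols concrete, here is a small worked sketch of the rank arithmetic. It assumes cols is the image width in pixels and uses made-up (x, y, w, h) rectangles in place of real cv2.boundingRect results; the helper name precedence is only for illustration.

```python
def precedence(rect, cols):
    # rect is (x, y, w, h); only the top-left corner (x, y) is used.
    x, y, _w, _h = rect
    return y * cols + x

cols = 100  # assumed image width in pixels

# Two boxes on different rows: the y term dominates the rank.
rank_a = precedence((4, 2, 10, 10), cols)   # 2 * 100 + 4 = 204
rank_b = precedence((1, 3, 10, 10), cols)   # 3 * 100 + 1 = 301

# Two boxes that look like one row, but the left one starts 1 px lower.
left = (50, 3, 10, 10)    # rank 3 * 100 + 50 = 350
right = (60, 2, 10, 10)   # rank 2 * 100 + 60 = 260
ordered = sorted([left, right], key=lambda r: precedence(r, cols))
# ordered == [right, left]: the lower box is pushed after its neighbour.
```

A one-pixel vertical offset is enough to change the order here, which is why some rounding of y is needed before multiplying by cols.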

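import cv2

def get_contour_precedence(contour, cols):
    # Snap y down to a multiple of tolerance_factor so contours whose tops
    # differ by a few pixels can share the same row rank.
    tolerance_factor = 10
    origin = cv2.boundingRect(contour)  # (x, y, w, h)
    return ((origin[1] // tolerance_factor) * tolerance_factor) * cols + origin[0]

img = cv2.imread("/Users/anmoluppal/Downloads/9VayB.png", 0)

_, img = cv2.threshold(img, 70, 255, cv2.THRESH_BINARY)

# findContours returns 3 values in OpenCV 3 and 2 values in OpenCV 4;
# the contours are the second-to-last item in both cases.
contours = cv2.findContours(img.copy(), cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)[-2]

# sorted() also works when the contours come back as a tuple.
contours = sorted(contours, key=lambda c: get_contour_precedence(c, img.shape[1]))

# For debugging purposes: draw each contour's rank at its origin.
for i in range(len(contours)):
    img = cv2.putText(img, str(i), cv2.boundingRect(contours[i])[:2], cv2.FONT_HERSHEY_COMPLEX, 1, [125])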

If you look closely at the third row, where contours 3, 4, 5 and 6 are placed, the 6 comes between 3 and 5. The reason is that the 6th contour is slightly below the line of contours 3, 4 and 5.

Tell me if you want the output the other way around; we can tweak get_contour_precedence to get the ranks of contours 3, 4, 5 and 6 corrected.
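One possible way to get the 3, 4, 5, 6 order is to group bounding boxes into rows before sorting, instead of rounding y into fixed buckets (fixed buckets can still split two boxes that fall on either side of a bucket boundary). The sketch below is not the method from this answer; it is an alternative that assumes boxes in the same row have vertical centres within a fraction of a box height of each other. The function name, the row_tolerance parameter, and the sample (x, y, w, h) boxes are all hypothetical.

```python
def sort_boxes_reading_order(boxes, row_tolerance=0.5):
    """Sort (x, y, w, h) boxes top-to-bottom, then left-to-right per row."""
    def center_y(box):
        return box[1] + box[3] / 2.0

    rows = []
    # Walk the boxes from top to bottom by vertical centre.
    for box in sorted(boxes, key=center_y):
        if rows:
            ref = rows[-1][0]  # first (topmost) box of the current row
            # Same row if the centres are close relative to the box height.
            if abs(center_y(box) - center_y(ref)) <= row_tolerance * ref[3]:
                rows[-1].append(box)
                continue
        rows.append([box])

    # Within each row, order by x (left to right), then flatten.
    return [b for row in rows for b in sorted(row, key=lambda b: b[0])]

# Made-up boxes: three on a slightly uneven first row, two on a second row.
boxes = [(50, 14, 20, 20), (10, 60, 20, 20), (90, 9, 20, 20),
         (10, 10, 20, 20), (50, 58, 20, 20)]
ordered = sort_boxes_reading_order(boxes)
```

With real contours the boxes would come from cv2.boundingRect, and row_tolerance would likely need tuning for the image at hand.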

edited Sep 12 '16 at 09:25

answered Sep 12 '16 at 08:10

ZdaR


Yes, I need the contours to be in 3,4,5,6. The sorting has to be able to "ignore" the slightly higher contour and still sort in that order. Would it also help to do some preprocessing so that contours found in the same row are of the same height? – Sep 12 '16 at 08:22
Add a tolerance value to address your needs . – ZdaR Sep 12 '16 at 09:26
could you please explain what 'cols' is in here? – Raksha Dec 19 '18 at 00:10
Is it the number of pixels in the row of an image? I did something like that, where if you have boxes at (2,4) and (3,1) and the image is 100 pixels long, you'd make their positions 2*100 + 4 = 204 and 3*100+1 = 301, so they will always be in order from left to right (supposedly), but what if the coordinates you start with aren't all on the same line? Like you have four boxes that are all in a line but the second one is just a bit lower. Now that box will be placed out of order. I guess that's where the tolerance factor comes in? – Raksha Dec 19 '18 at 00:17
Yes, Yes you get it, I am sorry if my answer is unclear on the use of `tolerance`. I will update it accordingly. I am glad that it helped you @Raksha – ZdaR Dec 19 '18 at 05:49
Amazing stuff! How do you change the order? Currently it sorts top->bottom, left->right. How do you invert it to right->left? – Shlomi Hassid Jan 09 '21 at 23:28
Hi @ShlomiHassid, we can update `get_contour_precedence` to use `cols - origin[0]` instead of `origin[0]`. Does this make sense? – ZdaR Jan 11 '21 at 07:52
Hi, I know I'm a little late to this, but I have a question and this is so far the best answer. I'm trying to develop an app to detect a handwritten local language, and word order is critical in it. When I sort the contours the order is incorrect: some words sit a little below the others in a sentence, and the order for those is wrong. I need to order horizontally. I'm new to Python and all this, so please bear with me. I played with tolerance_factor and haven't had much luck. – wajira000 Jul 02 '21 at 02:45
Hi @wajira000, your problem seems interesting to me. Could you post a new question and add the link here, with some sample images to look at? – ZdaR Jul 02 '21 at 03:28
Hi @ZdaR, thanks for the reply. I posted it here: https://stackoverflow.com/q/68220867/2435046. Please advise. – wajira000 Jul 02 '21 at 07:00
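The right-to-left ordering asked about in the comments can be sketched by flipping only the x term of the rank while leaving the row term alone. This is a sketch on plain (x, y, w, h) tuples rather than contours; the function name, the tolerance argument, and the sample boxes are assumptions for illustration.

```python
def precedence_right_to_left(rect, cols, tolerance=10):
    x, y, _w, _h = rect
    # Same row rounding idea as get_contour_precedence.
    row = (y // tolerance) * tolerance
    # Flip x so that boxes further right get a smaller rank within a row.
    return row * cols + (cols - 1 - x)

cols = 200  # assumed image width in pixels

# Made-up boxes: three in a first row, two in a second row.
boxes = [(10, 12, 20, 20), (150, 15, 20, 20), (80, 11, 20, 20),
         (30, 52, 20, 20), (120, 55, 20, 20)]
ordered = sorted(boxes, key=lambda r: precedence_right_to_left(r, cols))
```

Rows are still visited top to bottom; only the order inside each row is reversed.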

score 5 · Answer 2 · answered Jan 30 '19 at 10:02

This is from Adrian Rosebrock, for sorting contours based on location:

# import the necessary packages
import numpy as np
import argparse
import imutils
import cv2


def sort_contours(cnts, method="left-to-right"):
    # initialize the reverse flag and sort index
    reverse = False
    i = 0

    # handle if we need to sort in reverse
    if method == "right-to-left" or method == "bottom-to-top":
        reverse = True

    # handle if we are sorting against the y-coordinate rather than
    # the x-coordinate of the bounding box
    if method == "top-to-bottom" or method == "bottom-to-top":
        i = 1

    # construct the list of bounding boxes and sort them along the
    # chosen axis (index i), in the chosen direction
    boundingBoxes = [cv2.boundingRect(c) for c in cnts]
    (cnts, boundingBoxes) = zip(*sorted(zip(cnts, boundingBoxes),
        key=lambda b:b[1][i], reverse=reverse))

    # return the list of sorted contours and bounding boxes
    return (cnts, boundingBoxes)

def draw_contour(image, c, i):
    # compute the center of the contour area and draw a circle
    # representing the center
    M = cv2.moments(c)
    cX = int(M["m10"] / M["m00"])
    cY = int(M["m01"] / M["m00"])

    # draw the contour number on the image
    cv2.putText(image, "#{}".format(i + 1), (cX - 20, cY), cv2.FONT_HERSHEY_SIMPLEX,
        1.0, (255, 255, 255), 2)

    # return the image with the contour number drawn on it
    return image

# construct the argument parser and parse the arguments
ap = argparse.ArgumentParser()
ap.add_argument("-i", "--image", required=True, help="Path to the input image")
ap.add_argument("-m", "--method", required=True, help="Sorting method")
args = vars(ap.parse_args())

# load the image and initialize the accumulated edge image
image = cv2.imread(args["image"])
accumEdged = np.zeros(image.shape[:2], dtype="uint8")

# loop over the blue, green, and red channels, respectively
for chan in cv2.split(image):
    # blur the channel, extract edges from it, and accumulate the set
    # of edges for the image
    chan = cv2.medianBlur(chan, 11)
    edged = cv2.Canny(chan, 50, 200)
    accumEdged = cv2.bitwise_or(accumEdged, edged)

# show the accumulated edge map
cv2.imshow("Edge Map", accumEdged)

# find contours in the accumulated image, keeping only the largest
# ones
cnts = cv2.findContours(accumEdged.copy(), cv2.RETR_EXTERNAL,
    cv2.CHAIN_APPROX_SIMPLE)
cnts = imutils.grab_contours(cnts)
cnts = sorted(cnts, key=cv2.contourArea, reverse=True)[:5]
orig = image.copy()

# loop over the (unsorted) contours and draw them
for (i, c) in enumerate(cnts):
    orig = draw_contour(orig, c, i)

# show the original, unsorted contour image
cv2.imshow("Unsorted", orig)

# sort the contours according to the provided method
(cnts, boundingBoxes) = sort_contours(cnts, method=args["method"])

# loop over the (now sorted) contours and draw them
for (i, c) in enumerate(cnts):
    draw_contour(image, c, i)

# show the output image
cv2.imshow("Sorted", image)
cv2.waitKey(0)

score -1 · Answer 3 · edited May 23 '17 at 12:02

It appears the question you linked works not with the raw contours but first obtains a bounding rectangle using cv2.boundingRect. Only then does it make sense to calculate max_width and max_height. The code you posted suggests that you are trying to sort the raw contours, not bounding rectangles. If that is not the case, can you provide a more complete piece of your code, including a list of multiple contours that you are trying to sort?
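The difference between a raw contour and a bounding rectangle can be illustrated without OpenCV. In the sketch below, bounding_rect is a hypothetical pure-Python stand-in for cv2.boundingRect (for integer points it should give the same (x, y, w, h)), and the contour from the question is written as nested lists instead of an int32 array.

```python
def bounding_rect(contour):
    # A contour is a sequence of [[x, y]] points, one extra nesting level
    # per point, mirroring the (N, 1, 2) layout OpenCV uses.
    xs = [pt[0][0] for pt in contour]
    ys = [pt[0][1] for pt in contour]
    x, y = min(xs), min(ys)
    # Width and height count both end pixels, hence the + 1.
    return (x, y, max(xs) - x + 1, max(ys) - y + 1)

# The contour printed in the question.
contour = [[[1, 1]], [[1, 36]], [[63, 36]], [[64, 35]],
           [[88, 35]], [[89, 34]], [[94, 34]], [[94, 1]]]

# Indexing a raw contour gives a point, not an x coordinate, so
# expressions like r[0] + r[2] add points together instead of numbers.
first_point = contour[0]

# The bounding rectangle is the 4-tuple that the sort key expects.
rect = bounding_rect(contour)
```

Here rect matches the first entry, (1, 1, 94, 36), in the question's list of rectangles, which is the form that max_width, max_height and the sort key are written for.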

score -1 · Answer 4 · answered Nov 11 '21 at 18:10

You can simply compute each contour's distance from the image origin and rank the contours by it. Here is an example:

import math
import cv2

def get_distance(x,y):
    return math.sqrt(x*x+y*y)


img = cv2.imread("/image.png", 0)

res, img = cv2.threshold(img, 70, 255, cv2.THRESH_BINARY)

contours, h = cv2.findContours(img.copy(), cv2.RETR_EXTERNAL, cv2.CHAIN_A

for i in range(len(contours)):
    [x, y, w, h] = cv2.boundingRect(contours[i])
    img = cv2.putText(img, str(get_distance(x,y)), 
    cv2.boundingRect(contours[i])[:2], cv2.FONT_HERSHEY_COMPLEX, 1, [125])
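The snippet above only draws each distance on the image. A minimal sketch of actually ranking by that distance might look like the following; it works on made-up (x, y, w, h) tuples rather than real contours, and the function name distance_from_origin is hypothetical.

```python
import math

def distance_from_origin(rect):
    # Euclidean distance of the box's top-left corner from pixel (0, 0).
    x, y, _w, _h = rect
    return math.hypot(x, y)

# Made-up boxes: one far right on the top row, one lower on the left,
# one near the top-left corner.
boxes = [(300, 0, 20, 20), (0, 100, 20, 20), (10, 10, 20, 20)]
ordered = sorted(boxes, key=distance_from_origin)
```

In this example the box at x=300 on the top row is ranked after the box at y=100, so distance ranking does not in general reproduce a row-by-row reading order.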
