
Given a set of N bounding boxes with vertex coordinates:

"vertices": [
    {
      "y": 486, 
      "x": 336
    }, 
    {
      "y": 486, 
      "x": 2235
    }, 
    {
      "y": 3393, 
      "x": 2235
    }, 
    {
      "y": 3393, 
      "x": 336
    }
  ]

I would like to group the bounding boxes into rows. In other words, given the pictorial representation of bounding boxes in this image:

[Image: Bounding Boxes]

I would like an algorithm that returns:

[1,2,3]
[4,5,6]
[7,8]

[Edit: Clarification] The grouping decisions (e.g. [4,5,6] and [7,8]) should be based on some kind of error minimisation such as least squares.

Is there an algorithm or library (preferably in python) that does this?
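For concreteness, each box first needs to be reduced to a single representative y-value before any row grouping can happen. A minimal preprocessing sketch, assuming each box is a dict with a "vertices" list shaped like the JSON above (the `y_centre` name is my own, not from any library):

```python
# Minimal sketch: reduce a box to one y-value (mean of its vertex y-coordinates).
# Assumes each box is a dict with a "vertices" list as in the JSON above.
def y_centre(box):
    ys = [v["y"] for v in box["vertices"]]
    return sum(ys) / len(ys)

box = {"vertices": [{"y": 486, "x": 336}, {"y": 486, "x": 2235},
                    {"y": 3393, "x": 2235}, {"y": 3393, "x": 336}]}
print(y_centre(box))  # midpoint between top edge (486) and bottom edge (3393)
```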

amex
  • You can draw a horizontal line under (1,2,3) that doesn't cut through any boxes, so (1,2,3) are separate. However, you can't split (4,5,6,7,8) in that way. So how do you define the choice of (4,5,6) (7,8) and not e.g. (5,6) (4,8) (7)? – m69's been on strike for years Aug 10 '16 at 03:48
  • Ah, I should have clarified - I framed the problem that way so that the algorithm would have to use some kind of least-squares error optimisation (or an alternative). The bounding boxes won't always be clearly separable. – amex Aug 10 '16 at 04:10

1 Answer


I think this is a clustering problem. In fact, because you can ignore the x co-ordinates, I think this is a 1-dimensional clustering problem. Some standard clustering algorithms such as k-means are good for minimising sums of squares from cluster centres, which amounts to what you are looking for. Unfortunately, they don't guarantee to find the globally best solution. One-dimensional clustering is a special case for which there are exact algorithms - see Cluster one-dimensional data optimally?.
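The exact 1-D approach can be sketched with a short dynamic program over the sorted y-values: for each number of clusters c and each prefix, find the last segment boundary that minimises the total within-cluster sum of squares. This is a minimal self-contained sketch (the `cluster_1d` function and its interface are my own naming, not a library API):

```python
def _ssq(prefix, prefix_sq, i, j):
    # Sum of squared deviations of xs[i..j] from the segment mean,
    # computed in O(1) from prefix sums.
    n = j - i + 1
    s = prefix[j + 1] - prefix[i]
    s2 = prefix_sq[j + 1] - prefix_sq[i]
    return s2 - s * s / n

def cluster_1d(xs, k):
    """Optimal k-clustering of 1-D data by sum of squared deviations.

    Uses the fact that optimal 1-D clusters are contiguous in sorted order,
    so a DP over split points finds the global optimum (O(k * n^2) here).
    """
    xs = sorted(xs)
    n = len(xs)
    prefix, prefix_sq = [0.0], [0.0]
    for x in xs:
        prefix.append(prefix[-1] + x)
        prefix_sq.append(prefix_sq[-1] + x * x)
    INF = float("inf")
    # cost[c][j]: best cost of splitting xs[0..j] into c clusters
    cost = [[INF] * n for _ in range(k + 1)]
    back = [[0] * n for _ in range(k + 1)]
    for j in range(n):
        cost[1][j] = _ssq(prefix, prefix_sq, 0, j)
    for c in range(2, k + 1):
        for j in range(c - 1, n):
            for i in range(c - 1, j + 1):
                cand = cost[c - 1][i - 1] + _ssq(prefix, prefix_sq, i, j)
                if cand < cost[c][j]:
                    cost[c][j] = cand
                    back[c][j] = i
    # Walk the back-pointers to recover the clusters.
    clusters, j = [], n - 1
    for c in range(k, 0, -1):
        i = back[c][j] if c > 1 else 0
        clusters.append(xs[i:j + 1])
        j = i - 1
    return clusters[::-1], cost[k][n - 1]

# Toy example: two clearly separated groups of y-values.
groups, total_ssq = cluster_1d([1, 1, 2, 8, 9, 9], 2)
print(groups)  # [[1, 1, 2], [8, 9, 9]]
```

Because the DP is exact, it avoids the local-optimum problem of plain k-means mentioned above, at the price of quadratic (rather than near-linear) time per cluster count.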

mcdowella
  • Wouldn't 1D clustering lump (4,5,6,7,8) together? – m69's been on strike for years Aug 10 '16 at 04:47
  • With clustering you usually tell it the number of clusters to divide the points into, so whether it groups any set of points together will depend on the number of clusters you ask for. Alternatively, you can create a penalty that rises with the number of clusters created and find the solution that gives the best total of penalty and sum of squared deviations. Then you try to find a penalty system that gets you the sort of divisions you want. – mcdowella Aug 10 '16 at 16:56
  • I think that in this particular problem, ignoring the x-coordinate would create worse results; in the example, the fact that (4,5,6) (7,8) is a better split than (5,6) (4,7,8) only becomes apparent if you know that 4 and 7 overlap horizontally. (Unless the penalty system can re-introduce the x-coordinate.) – m69's been on strike for years Aug 10 '16 at 19:13
  • mmm, ok clustering with a penalty function does seem like the correct classification for this problem. There are some nuances to the dimensionality and how to pick the x/y coordinate that best represents each box, i.e. should it be just an average of the coordinates (regardless of 1-D or 2-D)? but alright, I'll accept this answer, thanks! – amex Aug 13 '16 at 15:09
  • Thanks. I didn't quote a sample penalty function because clustering books typically don't commit themselves, but one option to try comes from https://en.wikipedia.org/wiki/Akaike_information_criterion#Comparison_with_least_squares: AIC = 2k + n ln(RSS). Here k is the number of clusters (since you have one parameter per cluster - its centre's y co-ordinate), n is the number of data points, and RSS is the sum of squared errors, or sum of squared deviations from the nearest cluster centre. So you could take AIC as the resulting penalty function. – mcdowella Aug 14 '16 at 04:21
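The AIC idea in the last comment can be turned into a small model-selection loop: for each candidate number of rows k, find the best k-way split, then pick the k with the lowest AIC = 2k + n ln(RSS). A minimal sketch, using brute-force enumeration of contiguous splits for clarity (the `best_rss` helper and the sample y-values are my own illustration, not from the question):

```python
import itertools
import math

def best_rss(xs, k):
    # Minimum sum of squared deviations over all contiguous
    # k-way partitions of the sorted data (brute force, for illustration).
    xs = sorted(xs)
    n = len(xs)
    def rss(seg):
        m = sum(seg) / len(seg)
        return sum((x - m) ** 2 for x in seg)
    best = float("inf")
    for cuts in itertools.combinations(range(1, n), k - 1):
        bounds = [0, *cuts, n]
        total = sum(rss(xs[a:b]) for a, b in zip(bounds, bounds[1:]))
        best = min(best, total)
    return best

# Hypothetical per-box y-centres forming three visual rows.
ys = [486, 510, 495, 1200, 1215, 2100, 2090]
n = len(ys)

aics = {}
for k in range(1, 4):
    rss = best_rss(ys, k)
    aics[k] = 2 * k + n * math.log(rss)  # AIC from the comment above

best_k = min(aics, key=aics.get)
print(best_k)  # 3: the extra clusters pay for themselves in reduced RSS
```

Note the 2k penalty term is what stops AIC from always preferring more clusters; with k = n the RSS hits zero and ln(RSS) blows up, so in practice you scan k over a sensible range as above.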