5

I have a set of rectangles which I need to cluster together, based on euclidean distance between them.The situation is explained in the attached image. The situation is explained in the attached image.

One possible approach is to take the center of each rectangle and cluster the center points using K means (distance function would be euclidean distance in XY plane). However, I would like to know if there is any other approach to this problem, which does not approximate a rectangle by it's central point, but also takes the actual shape of the rectangle into consideration.

rivu
  • 2,004
  • 2
  • 29
  • 45
  • http://uclue.com/?xq=4737 This thread could be useful for You. It is about to find the shortest distance between two rectangles. –  May 24 '13 at 17:44

3 Answers3

2

Have a look at algorithms such as DBSCAN and OPTICS that can be used with arbitrary data types as long as you can define a distance between them (such as the minimum rectangle-to-rectangle distance).

K-means is probably not so good, as it is designed for point data with squared euclidean distance (= sum of squares, within-cluster-variance).

Bernhard Barker
  • 54,589
  • 14
  • 104
  • 138
Has QUIT--Anony-Mousse
  • 76,138
  • 12
  • 138
  • 194
2

One way to formulate this problem is to look at each rectangle i, and each pair of rectangles (i,j) having distance d(i,j), and then forming a distance matrix from those. This distance measure d could be distance between rectangle centers or something more fancy like distance between closest points on rectangles.

Then, apply a clustering algorithm that takes a distance matrix as input, where you define your distance matrix D as the matrix where element (i,j) is d(i,j).

Related: Clustering with a distance matrix

Anony-Mousse's answer has some nice suggestions for algorithms you could use to cluster given the distance matrix.

Community
  • 1
  • 1
Timothy Shields
  • 75,459
  • 18
  • 120
  • 173
1

We used Spectral Clustering with left_x, right_x, top_y, bottom_y coordinates as features with pretty good results.

Jerry
  • 61
  • 4