1

I'm trying to cluster photos (GPS + timestamp) around known GPS locations.

3d points = 2d + time stamp.

For example: I walk along the road and take photos of lampposts, some are interesting and so I take 10 photos and others are not so I don't take any.

I'd like to cluster my photos around the lampposts, allowing me to see which lamppost was being photographed.

I've been looking at something like k-means clustering and wanted something intelligent than just snapping the photos to nearest lamppost.

(I'm going to write the code in javascript for a client side app handing about (2000,500) points at a time )

Mario Michelli
  • 523
  • 1
  • 5
  • 11

3 Answers3

1

KMeans Clustering is indeed a popular and easy to implement algorithm, but it has a couple of problems.

  1. You need to feed him the number of clusters N as an input variable. Now, since I assume you don't know how many "things" you want to photoigraph, finding the right N. Using Iterative KMeans or similar variations only slides the problem to finding a proper evaluation function for multicluster partitions, that's in no way easier then finding N itself.

  2. It can only detect linearly separable shapes. Let's say you are walking around Versailles, and you take a lot of pictures of the external walls. Then you move inside, and take pictures of the inside garden. The two shapes you obtain are a thorus with a disk inside it, but KMeans can't distinguish them.

Personally, I'd go with some sort of Density Based Clustering : you'll still have to feed the algorithm some variables,but, since we assume that the space will be Euclidian, finding them shouldn't take too much. Plus, it gives you the ability to distinct Noise points from Cluster points, and let you treat them differently.

Furthermore, it can distinguish between most shapes, and you don't need to give the number of cluster beforehand.

Save
  • 11,450
  • 1
  • 18
  • 23
1

Density based clustering, such as DBSCAN, definitely is the way to go.

The two parameters of DBSCAN should be quite obvious to set:

  • epsilon: this is the radius for clustering, so e.g. you could use 10 meters, assuming that there are no lampposts closer than 10 meters. (You should be using Geodetic distance, not Euclidean!)

  • minPts: essentially the minimum size of a cluster. You can use 1 or 2, even.

  • distance: this parameter is implicit, but probably more important. You can use a combination of space and time here. E.g. 10 meters spatially, and 1 year in the time domain. See Generalized DBSCAN for the more flexible version, which makes it obvious how to use multiple domains.

Has QUIT--Anony-Mousse
  • 76,138
  • 12
  • 138
  • 194
0

You can use a delaunay triangulation to look for nearest points. It gives you a nearest neighbor graph where the points is on the delaunay edges. Or you can cluster by color like in photo mosaic. It uses an anti pole tree. Here is a similar answer: Algorithm to find for all points in set A the nearest neighbor in set B

Community
  • 1
  • 1
Micromega
  • 12,486
  • 7
  • 35
  • 72
  • That would be a lot of work to figure out what points are close together. Or would this give you information other than that found by computing pairwise distances? – Teepeemm Aug 23 '13 at 16:33
  • But you would still need to translate the graph into clusters. And I'm not seeing how the graph would make that process enough easier that it would be worth the trouble of building the graph. – Teepeemm Aug 23 '13 at 16:39
  • I've started by using http://harthur.github.io/clusterfck/ I plan on using hierarchical clustering and then snap the clusters to the nearest point of interest(lamppost), I should have enough of a gap between points that they don't merge. Thanks again for all the help. – Mario Michelli Aug 25 '13 at 22:01
  • Don't think I'll need use Delaunay triangulation. – Mario Michelli Aug 26 '13 at 06:14
  • @Mario: It's also subdivide the plane. It can be easier to insert or remove points? Can you accept an answer? – Micromega Aug 26 '13 at 07:00