I have a collection of Points objects, each containing a latitude and longitude (along with a few other irrelevant properties). I want to form clusters, i.e. collections of points that are close together relative to other points.

Alternatively, I would like an algorithm which, if given a list of clusters containing close-by points and a new point, determines which cluster the new point belongs to (and adds it to a new cluster if it doesn't belong to an existing cluster).

I looked at hierarchical clustering algorithms, but those run too slowly. The k-means algorithm requires you to know the number of clusters beforehand, which is not very helpful.

Thanks!

  • possible duplicate of [Clustering Algorithm for Mapping Application](http://stackoverflow.com/questions/73927/clustering-algorithm-for-mapping-application) – Ed Smith May 08 '15 at 16:21
  • I guess you have to know the number of clusters beforehand, or run the algorithm several times and then pick the best `N` for the number of clusters. Also, take a look at Gaussian Mixture Models, which are an alternative to `k-means`. Finally, if you definitely do not want to define the number of clusters beforehand, another option is to use a `community detection` algorithm on graphs (but you would have to represent your data as a graph first). – rafaelc May 08 '15 at 16:31

1 Answer

Try density-based clustering methods. DBSCAN is one of the most popular of those, and it does not require you to specify the number of clusters in advance.

I am assuming you are using Python. Check out these:

http://scikit-learn.org/stable/modules/generated/sklearn.cluster.DBSCAN.html

http://scikit-learn.org/stable/auto_examples/cluster/plot_dbscan.html

When you cluster based on GPS lat/lon, you may want to use a different distance calculation than DBSCAN's default, which is Euclidean. Use its `metric` parameter to supply your own distance function, or pass a precomputed distance matrix. For the distance calculation itself, check out the haversine formula.
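As a rough illustration of the above, here is a minimal sketch, not a definitive implementation. It assumes points are `(lat, lon)` pairs in degrees and uses scikit-learn's built-in `"haversine"` metric, which expects coordinates in radians and measures distance in radians on a unit sphere. The helper names `haversine_km` and `cluster_latlon`, the Earth radius constant, and the default `eps_km` and `min_samples` values are all assumptions made for this example.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius; an assumption of this sketch


def haversine_km(p, q):
    """Great-circle distance in km between two (lat, lon) pairs in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (p[0], p[1], q[0], q[1]))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    # Clamp to 1.0 to guard against floating-point overshoot near antipodes.
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


def cluster_latlon(points_deg, eps_km=1.0, min_samples=3):
    """Return one DBSCAN label per (lat, lon) point; -1 marks noise."""
    # Imported lazily so haversine_km stays usable without these packages.
    import numpy as np
    from sklearn.cluster import DBSCAN

    # The "haversine" metric expects [lat, lon] in radians.
    coords = np.radians(np.asarray(points_deg, dtype=float))
    model = DBSCAN(
        eps=eps_km / EARTH_RADIUS_KM,  # convert the km radius to radians
        min_samples=min_samples,
        metric="haversine",
        algorithm="ball_tree",
    )
    return model.fit_predict(coords)
```

Calling `cluster_latlon(points, eps_km=0.5)` would group points that lie within roughly 500 m of enough neighbours. Points labelled `-1` are treated as noise rather than forced into a cluster. `haversine_km` can be used on its own to check how far a new point is from the members of an existing cluster.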

burhan