I have a distance matrix like the one described in this question:

Clustering with a distance matrix

Now, I would like to perform DBSCAN on this matrix using the DBSCANClusterer class from Apache Commons Math.

The method 'cluster' takes a collection of points as input. What is the format of these points?

Referring to the above matrix, what do I add to the collection parameter?

Can someone please paste a code snippet? I would like to specify the distances as:

A,B : 20
A,C : 20
. . .

And then when I am done with the clustering, similar samples should be clustered together.

Nikhil
  • Then what do I do? Also, can you point me to a program that takes the above matrix as input and performs DBSCAN / hierarchical clustering? I have tried to understand the different programs, and I went through Cross Validated and Stack Overflow, but they all point to approaches. I just want a program to which I can feed the above matrix and do clustering. – Nikhil Nov 25 '13 at 13:28
  • Questions to find programs are off-topic for StackOverflow. This is a *programming* website. – Has QUIT--Anony-Mousse Nov 25 '13 at 16:04
  • So either you google some more (there are clustering toolkits that can read external distance matrices), or you just try implementing DBSCAN yourself; it is NOT very hard... – Has QUIT--Anony-Mousse Nov 25 '13 at 16:10
  • My question here is specifically a *programming* question. Having said that, thank you for your input. – Nikhil Nov 25 '13 at 22:36

1 Answer

Hope this helps.

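import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.commons.math3.ml.clustering.Cluster;
import org.apache.commons.math3.ml.clustering.DBSCANClusterer;
import org.apache.commons.math3.ml.clustering.DoublePoint;

public class App {

public static void main(String[] args) throws IOException {
    File[] files = getFiles("./files2/");

    // eps = 0.05 (neighborhood radius), minPts = 50 (density threshold)
    DBSCANClusterer<DoublePoint> dbscan = new DBSCANClusterer<DoublePoint>(.05, 50);
    List<Cluster<DoublePoint>> clusters = dbscan.cluster(getGPS(files));

    // Print the first point of each cluster that was found
    for (Cluster<DoublePoint> c : clusters) {
        System.out.println(c.getPoints().get(0));
    }
}

private static File[] getFiles(String path) {
    return new File(path).listFiles();
}

// Reads "timestamp, latitude, longitude" lines and turns each into a 2-D point
private static List<DoublePoint> getGPS(File[] files) throws IOException {

    List<DoublePoint> points = new ArrayList<DoublePoint>();
    for (File f : files) {
        // try-with-resources closes the reader even if reading fails
        try (BufferedReader in = new BufferedReader(new FileReader(f))) {
            String line;

            while ((line = in.readLine()) != null) {
                try {
                    String[] fields = line.split(",");
                    double[] d = new double[2];
                    d[0] = Double.parseDouble(fields[1].trim());
                    d[1] = Double.parseDouble(fields[2].trim());
                    points.add(new DoublePoint(d));
                } catch (ArrayIndexOutOfBoundsException e) {
                    // skip lines with fewer than three fields
                } catch (NumberFormatException e) {
                    // skip lines whose coordinates are not numeric
                }
            }
        }
    }
    return points;
}
}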

Sample Data:

12-01-99 11:31:01 AM, -40.010, -70.020
12-01-99 11:32:01 AM, -41.010, -71.020
12-01-99 11:33:01 AM, -42.010, -72.020
12-01-99 11:34:01 AM, -43.010, -73.020
12-01-99 11:35:01 AM, -40.010, -74.020

Put all the data files in a folder called files2; its location is the path passed to the getFiles method.
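
Note that the snippet above clusters coordinate points, not a precomputed distance matrix. If all you have is the matrix, one option is a small hand-rolled DBSCAN that reads distances straight from it, as a comment on the question suggested. The following is only a minimal sketch and is not Commons Math code: the class name MatrixDbscan, the sample labels, the eps/minPts values and every distance other than A,B and A,C are made up for illustration. It also counts a point as its own neighbor when comparing against minPts, which may differ from what a library implementation does.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

/** Illustrative DBSCAN over a precomputed, symmetric distance matrix. */
public class MatrixDbscan {

    public static final int UNVISITED = -2;
    public static final int NOISE = -1;

    /** Returns one label per sample: a cluster id >= 0, or NOISE (-1). */
    public static int[] cluster(double[][] dist, double eps, int minPts) {
        int n = dist.length;
        int[] labels = new int[n];
        Arrays.fill(labels, UNVISITED);
        int clusterId = 0;

        for (int i = 0; i < n; i++) {
            if (labels[i] != UNVISITED) {
                continue;
            }
            List<Integer> neighbors = regionQuery(dist, i, eps);
            if (neighbors.size() < minPts) {
                labels[i] = NOISE; // may later be claimed as a border point
                continue;
            }
            // i is a core point: start a new cluster and grow it
            labels[i] = clusterId;
            Deque<Integer> queue = new ArrayDeque<>(neighbors);
            while (!queue.isEmpty()) {
                int j = queue.poll();
                if (labels[j] == NOISE) {
                    labels[j] = clusterId; // border point, not expanded further
                }
                if (labels[j] != UNVISITED) {
                    continue;
                }
                labels[j] = clusterId;
                List<Integer> jNeighbors = regionQuery(dist, j, eps);
                if (jNeighbors.size() >= minPts) {
                    queue.addAll(jNeighbors); // j is also a core point
                }
            }
            clusterId++;
        }
        return labels;
    }

    /** Indices of all samples within eps of sample i (including i itself). */
    static List<Integer> regionQuery(double[][] dist, int i, double eps) {
        List<Integer> result = new ArrayList<>();
        for (int j = 0; j < dist.length; j++) {
            if (dist[i][j] <= eps) {
                result.add(j);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical data: only A,B = 20 and A,C = 20 come from the question
        String[] names = {"A", "B", "C", "D", "E"};
        double[][] dist = {
            { 0, 20, 20, 80, 85},
            {20,  0, 30, 90, 95},
            {20, 30,  0, 75, 80},
            {80, 90, 75,  0, 15},
            {85, 95, 80, 15,  0}
        };
        int[] labels = cluster(dist, 25.0, 2);
        for (int i = 0; i < names.length; i++) {
            System.out.println(names[i] + " -> " + labels[i]);
        }
    }
}
```

With this hypothetical matrix, eps = 25 and minPts = 2, samples A, B and C end up in one cluster and D and E in another. If you would rather stay with Commons Math, DBSCANClusterer also has a constructor that accepts a DistanceMeasure, so a measure that looks distances up by sample index may be another route; I have not tested that here.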

Dan Ciborowski - MSFT