Building a data model using the k-means algorithm and using it to classify a new data set

Question

I'm new to data mining and Weka. I'm writing a k-means clustering program in Java (Eclipse with Weka) that takes in a data set and builds clusters from it. How can I use these clusters to classify a new data set?

That is, it should take a new instance and determine which cluster it belongs to using the shortest-distance method, without having to rebuild the clusters.

Thanks in advance.

It should be possible with `kmeans.clusterInstance(theInstance);`. (If in doubt, you could manually compute the centroid to which the new instance has the smallest distance, but this should not be necessary.) — Marco13, Mar 25 '14 at 10:48
possible duplicate of [Assign new data point to cluster in kernel k-means (kernlab package in R)?](http://stackoverflow.com/questions/11621642/assign-new-data-point-to-cluster-in-kernel-k-means-kernlab-package-in-r) — Has QUIT--Anony-Mousse, Jul 09 '15 at 09:06
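
For reference, the manual fallback mentioned in the first comment (assigning a new instance to the centroid at the smallest distance) can be sketched in plain Java without Weka. This is only an illustration under assumptions: the class `NearestCentroid`, its methods, and the sample centroids are hypothetical names and values, the instances are assumed to be purely numeric, and plain Euclidean distance is assumed. Weka's own distance function may preprocess attribute values (for example by normalizing them), so results from `clusterInstance` could differ from this sketch.

```java
import java.util.Arrays;

// Hypothetical helper, not part of Weka: assigns a point to its nearest centroid.
final class NearestCentroid {

    // Squared Euclidean distance; the square root is not needed for comparisons.
    static double squaredDistance(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("Dimension mismatch: " + a.length + " vs " + b.length);
        }
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return sum;
    }

    // Returns the index of the centroid closest to the given point.
    // Ties are resolved in favour of the lowest index.
    static int assign(double[][] centroids, double[] point) {
        if (centroids.length == 0) {
            throw new IllegalArgumentException("At least one centroid is required");
        }
        int best = 0;
        double bestDist = squaredDistance(centroids[0], point);
        for (int k = 1; k < centroids.length; k++) {
            double dist = squaredDistance(centroids[k], point);
            if (dist < bestDist) {
                bestDist = dist;
                best = k;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Made-up centroids standing in for the ones found by an earlier k-means run.
        double[][] centroids = { {1.0, 1.0}, {8.0, 9.0} };
        double[] newInstance = {7.0, 8.0};
        int cluster = assign(centroids, newInstance);
        System.out.println(Arrays.toString(newInstance) + " -> cluster " + cluster);
    }
}
```

The centroids are computed once and reused for every new instance, so the clusters never have to be rebuilt; only one distance computation per centroid is needed for each new point.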

score 0 · Answer 1 · answered Mar 25 '14 at 11:25

What I usually do is run Weka and build the model through the UI, analyzing the statistics and so on.

Then, when the model you want to use appears in the result list, right-click it and select Save model. This saves the model to a .model file that you can load in your Java code to classify new instances.

Here is a detailed tutorial: http://weka.wikispaces.com/Use+WEKA+in+your+Java+code
