
Possible Duplicate:
clustering and matlab

What kind of data format does MATLAB's clustering toolbox expect? I downloaded the KDD 1999 data set, which came as a data.protected file. Opening the file with a Microsoft text editor, I was able to see the data, which looks like this:

0,tcp,http,SF,239,486,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,8,8,0.00,0.00,0.00,0.00,1.00,0.00,0.00,19,19,1.00,0.00,0.05,0.00,0.00,0.00,0.00,0.00,normal.

What I did next was open Excel and drag and drop the text file into it. Excel populated fine, but each row was contained in a single cell (in the format above), so I went to Data > Text to Columns and chose the comma delimiter. That gave me 38 columns from the KDD set to play with. I then deleted the columns with text data (tcp, http, SF, normal, etc.), leaving only the numeric data.

I then used the following in MATLAB to convert the kdd.csv file to a .dat file:

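% Read the numeric-only CSV into a matrix
a = csvread('kdd.csv');
% Write the matrix out as a plain-text (ASCII) file
save('kdd.dat', 'a', '-ascii');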

This allowed me to use the KDD data in MATLAB's clustering tool, but the output isn't what I expected.

This is what it gives me:

enter image description here

I see a lot of people talking about changing the data to numeric values (the values it contains are numeric, but maybe not in the sense I'm thinking of). I've also seen a lot of talk about floating point, etc., but I'm totally stuck on how to move forward. How on earth do you make any intelligible sense of the data with MATLAB's clustering toolbox? http://www.mathworks.co.uk/help/techdoc/ref/textscan.html
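Going by the textscan documentation linked above, something like the following might replace the Excel step. It is only a sketch, not something I have confirmed on the full data set: it assumes every row has 42 comma-separated fields like the sample line (fields 2 to 4 and field 42 are text, the other 38 are numeric), and kdd_sample.txt is a made-up scratch file holding just that one sample row.

```matlab
% Write the sample row shown above to a scratch file (hypothetical name)
sampleRow = ['0,tcp,http,SF,239,486,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,8,8,' ...
             '0.00,0.00,0.00,0.00,1.00,0.00,0.00,19,19,1.00,0.00,0.05,' ...
             '0.00,0.00,0.00,0.00,0.00,normal.'];
fname = 'kdd_sample.txt';
fid = fopen(fname, 'w');
fprintf(fid, '%s\n', sampleRow);
fclose(fid);

% Format string: keep field 1, skip fields 2-4 (text), keep fields 5-41,
% skip field 42 (the label). '%*s' tells textscan to read and discard a field.
fmt = ['%f' repmat('%*s', 1, 3) repmat('%f', 1, 37) '%*s'];

fid = fopen(fname, 'r');
C = textscan(fid, fmt, 'Delimiter', ',', 'CollectOutput', true);
fclose(fid);

% With CollectOutput, the 38 numeric columns come back as one matrix
a = C{1};

% Same plain-text export as before
save('kdd.dat', 'a', '-ascii');
```

For the real file, fname would point at the downloaded data instead of the scratch file, and the fprintf step would be dropped.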

G Gr
    Please refrain from posting multiple questions that all ask similar things (you previously posted about how to read the data, and also asked about the clustering results). – Amro Oct 11 '11 at 05:07
