I am trying to run a k-means cluster analysis in R and have run into a problem. I converted one column from factor to numeric, but now when I run this:

for (i in 2:15) wss[i] <- sum(kmeans(mydata, centers = i)$withinss)

I get the following error:

Error in sample.int(m, k) : invalid first argument

I checked sapply(mydata, class) and all columns are numeric. What seems to be the problem?

I am using this webpage as a reference

Here is a very small sample of my dataset. I am working with 400 cases, so I'm not sure whether the size of the data set has anything to do with it:

zz <- "  C      D       E      F       G
C001   177.5   22.5   268.1   27.1    37.5
C002   262.5   71.9   278.2   22.7    87.5
C003   191.3   12.5   257.3   16.2    87.5
C004   518.9   83.1   277.5   39.3    75.0
X001   217.5   52.3   274.2   29.1    87.5
X002   407.8  147.8   335.5  112.4    87.5
X003   602.2   87.9   658.3  152.0   100.0
X004   187.8   36.7   252.5   28.6    62.5"
  • Please read [How to make a great R reproducible example?](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example). Without looking at some portion of your data, there's no way we can reproduce the error. – catastrophic-failure Aug 03 '16 at 15:30
  • Hopefully the edit helps a bit – Buskea22 Aug 03 '16 at 17:22
  • This error in `kmeans` occurs when the function cannot determine the dimensions of your data correctly, specifically when the input dataset has 0 rows. Please check your data and post the output of `str(mydata)` so that we can advise you better. – TWL Aug 05 '16 at 19:01
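
The zero-row case described in the last comment can be checked with a minimal sketch like the one below. The data frame here is a hypothetical stand-in for `mydata`, not the asker's actual data, and the question does not show whether an earlier step (for example filtering or `na.omit()`) emptied the real data set. The sketch only illustrates that `kmeans()` raises an error when it is given no rows, and which checks could confirm that situation.

```r
# Hypothetical stand-in for mydata: two numeric columns and zero rows.
mydata <- data.frame(C = numeric(0), D = numeric(0))

# Checks worth running before calling kmeans().
str(mydata)                      # column types and number of observations
n_rows <- nrow(mydata)           # number of rows available for clustering
n_na   <- colSums(is.na(mydata)) # NA count per column

# With zero rows, kmeans() cannot sample initial centers and stops with an error.
result <- tryCatch(
  kmeans(mydata, centers = 2),
  error = function(e) e
)
is_error <- inherits(result, "error")
if (is_error) message("kmeans failed: ", conditionMessage(result))
```

If `nrow(mydata)` is 0 on the real data, the step that emptied it (such as the factor-to-numeric conversion followed by row filtering) is the place to look.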
