1

When I use R to perform k-means clustering on my data, I don't know how to find an appropriate value of k. I have seen the Elbow method, but I don't know how to use it with kclus <- kmeans(data, centers = k). Can anyone help me apply the Elbow method to find k? Thanks.

2 Answers

2

You can use the Elbow method, as in the snippet below:

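library(GMD)  # provides css.hclust() and elbow.batch()

elbow.k <- function(mydata) {
  ## determine a "good" k using the elbow criterion
  dist.obj <- dist(mydata)                     # pairwise distances
  hclust.obj <- hclust(dist.obj)               # hierarchical clustering
  css.obj <- css.hclust(dist.obj, hclust.obj)  # clustering sum of squares per k
  elbow.obj <- elbow.batch(css.obj)            # locate the elbow
  # print(elbow.obj)
  k <- elbow.obj$k
  return(k)
}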

It finds an appropriate k value, but it is time-consuming on large data sets. You can use the parallel package to reduce the run time.
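Since the question asks about kmeans() specifically, here is a minimal sketch of the classic elbow plot built directly on kmeans(). It is not the GMD-based method above: it simply records the total within-cluster sum of squares for each k and plots it. The iris data, k.max = 10 and nstart = 25 are assumptions chosen for illustration; substitute your own data and range.

```r
set.seed(42)                        # k-means uses random starting centers
data <- scale(iris[, 1:4])          # example data (assumption): built-in iris, standardized

k.max <- 10                         # largest k to try (assumption)

# Total within-cluster sum of squares for k = 1..k.max
wss <- sapply(1:k.max, function(k) {
  kclus <- kmeans(data, centers = k, nstart = 25)
  kclus$tot.withinss
})

# Plot WSS against k; the "elbow" is where the curve stops dropping sharply
plot(1:k.max, wss, type = "b",
     xlab = "Number of clusters k",
     ylab = "Total within-cluster sum of squares")
```

Reading the elbow off the plot is a visual judgment, so it is worth checking the choice against a second criterion such as the average silhouette width.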

VanThaoNguyen
2

I have used this particular example; here f is mydata:

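library("cluster")  # provides pam()
library("fpc")
findClusters <- function(f) {
    # average silhouette width for each candidate k (index 1 stays 0)
    asw <- numeric(20)
    for (k in 2:20)
      asw[[k]] <- pam(f, k)$silinfo$avg.width

    # k with the largest average silhouette width
    k.best <- which.max(asw)
    cl <- kmeans(f, k.best)
    # cluster centers, rounded to 3 decimals
    return(round(cl$centers, 3))
}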

I took this from this particular link.
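The function above picks k from pam() and then clusters with kmeans(). If you would rather score the kmeans() partitions themselves, a rough sketch using silhouette() from the cluster package could look like this. The iris data, the range 2:10 and nstart = 25 are assumptions for illustration only.

```r
library(cluster)                    # provides silhouette()

set.seed(1)                         # k-means uses random starting centers
f <- scale(iris[, 1:4])             # example data (assumption): built-in iris, standardized
d <- dist(f)                        # pairwise distances, reused for every k

ks <- 2:10                          # candidate values of k (assumption)

# Average silhouette width of the k-means partition for each k
asw <- sapply(ks, function(k) {
  cl <- kmeans(f, centers = k, nstart = 25)
  mean(silhouette(cl$cluster, d)[, "sil_width"])
})

k.best <- ks[which.max(asw)]        # k with the widest average silhouette
```

Note that dist() stores all pairwise distances, so this approach becomes memory-hungry as the number of rows grows.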

ArunK