I want to use clusGap to estimate the number of clusters needed for a given data set. The problem is that I cannot get the k value from the clusGap result programmatically, although this function is recommended for the gap statistic.

Below is how I'm using clusGap:

hcluster = clusGap(dataMatrix,  FUN = hcut, nstart = 25, K.max = 100, B = 50)
kcluster = clusGap(dataMatrix, kmeans, K.max=100, B=50)

The following is clusGap's output. I can see that the recommended number of clusters is 11, but I cannot access this number dynamically.

Clustering Gap statistic ["clusGap"] from call:
clusGap(x = dataMatrix, FUNcluster = hcut, K.max = 100, B = 50,     nstart = 25)
B=50 simulated reference sets, k = 1..100; spaceH0="scaledPCA"
 --> Number of clusters (method 'firstSEmax', SE.factor=1): 11
           logW    E.logW      gap      SE.sim
  [1,] 8.995981 10.000102 1.004121 0.004184801
  [2,] 8.694404  9.716407 1.022003 0.017857009
  [3,] 8.538334  9.616808 1.078473 0.008792356
  [4,] 8.466726  9.574631 1.107905 0.005905742
  [5,] 8.363253  9.550745 1.187492 0.004978537
  [6,] 8.303085  9.531952 1.228867 0.004084501
  [7,] 8.270890  9.516404 1.245514 0.004118244
  [8,] 8.241259  9.502743 1.261484 0.004018474
  [9,] 8.220926  9.490543 1.269617 0.003874152

Any help would be much appreciated.

blobbymatt
  • Please review how to provide a [minimal reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example); in its current form your question is not reproducible. Please edit your question to include sample data for `dataMatrix` (e.g. using `dput`). – Maurits Evers Apr 14 '19 at 08:45
  • Thank you for your input, but the link you provided relates to fixing an error. Since I don't have one and was looking for more of an explanation of the process, I didn't feel a reproducible example was required. – blobbymatt Apr 14 '19 at 10:12
  • It is very difficult for anyone to provide help (regardless of whether this is in relation to an error, a warning or explanation of a particular non base R function) if you don't provide sample data. An [MCVE](https://stackoverflow.com/help/mcve) is *always* useful! Either way, you've answered your own question which is perfectly fine and the question can be closed. – Maurits Evers Apr 14 '19 at 10:31

1 Answer

In case anyone comes across this, here is how I did it:

library(cluster)     # clusGap(), maxSE()
library(factoextra)  # hcut()
hcluster = clusGap(dataMatrix,  FUN = hcut, nstart = 25, K.max = 100, B = 50)
k <- maxSE(hcluster$Tab[, "gap"], hcluster$Tab[, "SE.sim"], method="Tibs2001SEmax")
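One caveat: the printed output in the question reports its number under method 'firstSEmax' with SE.factor=1, whereas the call above uses "Tibs2001SEmax", so the two can disagree on the same table. Below is a minimal sketch of reproducing the printed value. The simulated matrix `x` is a hypothetical stand-in for `dataMatrix`, and `kmeans` is used instead of `hcut` only to keep the example free of extra packages.

```r
library(cluster)  # clusGap(), maxSE()

set.seed(42)
# Hypothetical stand-in for dataMatrix: three separated 2-D blobs, 50 points each
x <- rbind(
  matrix(rnorm(100, mean = 0, sd = 0.3), ncol = 2),
  matrix(rnorm(100, mean = 3, sd = 0.3), ncol = 2),
  matrix(rnorm(100, mean = 6, sd = 0.3), ncol = 2)
)

gap <- clusGap(x, FUNcluster = kmeans, nstart = 25, K.max = 8, B = 50,
               verbose = FALSE)

# Same rule the print method reports in the question's output:
# method 'firstSEmax', SE.factor = 1
k <- maxSE(f = gap$Tab[, "gap"], SE.f = gap$Tab[, "SE.sim"],
           method = "firstSEmax", SE.factor = 1)
k
```

Whichever rule is chosen, passing the same `method` to both `print()` and `maxSE()` keeps the printed and extracted values in agreement.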
blobbymatt