
I'm new to R and still learning, and I can't seem to wrap my head around this.

I ran k-means clustering and then wanted to confirm the optimal number of clusters using the NbClust library. However, my kmeans analysis suggests that 4 is the optimal number of clusters according to between SS / total SS.

I also ran fviz_nbclust(ClusterTiktok, kmeans, method = "wss") and got an elbow plot (image not available).

    kmm1 = kmeans(ClusterTiktok, 3, nstart = 25, iter.max = 15)
    kmm1

The between SS / total SS value increases rapidly from 2 to 3 and from 3 to 4 clusters, and starts decreasing at 5.
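For reference, here is a minimal sketch of how the between SS / total SS ratio and the total within-cluster SS (the quantity behind the "wss" elbow plot) could be tabulated for several values of k in base R. ClusterTiktok is not shown in this post, so the scaled iris measurements are used as a stand-in; the names dat, ks, fits, ratio, wss and summary_tbl are illustrative only.

```r
# Stand-in data (assumption): ClusterTiktok is not available here,
# so the four numeric iris columns, scaled, are used instead.
set.seed(42)
dat <- scale(iris[, 1:4])

# Fit k-means for a range of k with the same settings as above
ks <- 2:8
fits <- lapply(ks, function(k) {
  kmeans(dat, centers = k, nstart = 25, iter.max = 15)
})

# between SS / total SS for each k
ratio <- sapply(fits, function(f) f$betweenss / f$totss)

# Total within-cluster SS for each k (what the "wss" method plots)
wss <- sapply(fits, function(f) f$tot.withinss)

# Side-by-side table, including the gain in the ratio from k - 1 to k
summary_tbl <- data.frame(
  k     = ks,
  ratio = round(ratio, 3),
  wss   = round(wss, 1),
  gain  = c(NA, round(diff(ratio), 3))
)
print(summary_tbl)
```

Since between SS + total within SS = total SS, the ratio tends to keep rising as k grows, so the gain column (how much each extra cluster adds) is probably more informative than the ratio itself.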

But when I ran NbClust, it reported that the optimal number of clusters is 2.

    library(NbClust)
    nc <- NbClust(ClusterTiktok, min.nc = 2, max.nc = 10, method = "kmeans")

    # Histogram of the number of clusters proposed by each index
    hist(nc$Best.nc[1, ], breaks = max(na.omit(nc$Best.nc[1, ])))

    barplot(table(nc$Best.nc[1, ]),
            xlab = "Number of Clusters", ylab = "Number of Criteria",
            main = "Number of Clusters Chosen by 26 Criteria")
    table(nc$Best.nc[1, ])

Would somebody be generous enough to explain to me which method is more credible or what I'm doing wrong?

Stef2nn
  • If you want advice on improving clustering performance or interpreting results from statistical models, you should ask for help at [stats.se]. This really isn't a specific programming question that's appropriate for Stack Overflow. This is a statistical matter. – MrFlick May 26 '22 at 14:43
  • Is there a way I can bypass the 40-minute wait I got after posting this question? – Stef2nn May 26 '22 at 14:53

0 Answers