
I am using the factoextra library in R for K-means clustering. I can create the PCA plot showing the cluster membership of the data points, but I also want to set the shape of each point according to a time variable. My dummy code is below; fviz_cluster does not seem to recognize the 'Time' variable.

I'd appreciate all help and comments.

k2 <- kmeans(Scaled_data, centers = 2, nstart = 25)
k2$Time <- as.factor(time)
print(names(k2))
print(length(k2$Time))
print(length(k2$cluster))

plot_Obj <- fviz_cluster(k2, data = Scaled_data,
         stand = FALSE,
         ellipse.type = "norm",
         geom = "point",
         alpha=0.5,
         ggtheme = theme_minimal(),
         repel = FALSE,
         shape=Time)
print(plot_Obj)

Output:
 [1] "cluster"      "centers"      "totss"        "withinss"     "tot.withinss"
 [6] "betweenss"    "size"         "iter"         "ifault"       "Time"
 [1] 783
 [1] 783

Error:
 Error in fviz_cluster(k2, data = Scaled_data, stand = FALSE, ellipse.type = "norm",  :
   object 'Time' not found
 Execution halted
Mdhale
  • Please, make a reproducible example. https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example – bbiasi May 19 '19 at 00:46

1 Answer


You can change the point shapes with scale_shape_manual().

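library(factoextra)
library(dplyr)  # provides %>% and mutate()
set.seed(123)
data("iris")

# Scale the four numeric columns and cluster them
iris.scaled <- scale(iris[, -5])
km.res <- kmeans(iris.scaled, 3, nstart = 10)

km.res$cluster
# Lookup of cluster -> shape code; NA_real_ keeps the column numeric
# (the original "ERROR" string would have coerced it to character)
shapex <- data.frame(clust = km.res$cluster) %>%
  dplyr::mutate(shape = ifelse(clust == 1, 21,
                               ifelse(clust == 2, 22,
                                      ifelse(clust == 3, 23, NA_real_))))

# Default plot: fviz_cluster gives each cluster its own shape
p <- fviz_cluster(km.res, iris[, -5], ellipse.type = "norm")
p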


p + scale_shape_manual(values = 10:12)


Note that the number of shape values supplied should match the number of clusters (three here, hence 10:12). The available shapes are the standard R point symbols, numbered 0 to 25 (see ?points).
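To shape the points by a separate variable such as Time, as the question asks, one possible route is to build the final plot directly with ggplot2 instead of fviz_cluster. The following is only a minimal sketch, not part of the factoextra API: it uses iris as stand-in data, and the `time` vector is a made-up placeholder for the asker's variable. It projects the scaled data onto the first two principal components with prcomp(), which should be close to what fviz_cluster draws when the data has more than two columns.

```r
library(ggplot2)

set.seed(123)
scaled <- scale(iris[, -5])
km <- kmeans(scaled, centers = 2, nstart = 25)

# Hypothetical stand-in for the asker's 'time' vector: one label per row
time <- rep(c("t1", "t2", "t3"), length.out = nrow(scaled))

# Project the scaled data onto its principal components
pca <- prcomp(scaled)

# One row per observation: PCA coordinates, cluster membership, time label
plot_df <- data.frame(
  PC1     = pca$x[, 1],
  PC2     = pca$x[, 2],
  cluster = factor(km$cluster),
  Time    = factor(time)
)

# Colour (and ellipses) by cluster; shape is mapped only in the point layer
p <- ggplot(plot_df, aes(x = PC1, y = PC2, colour = cluster)) +
  geom_point(aes(shape = Time), alpha = 0.5) +
  stat_ellipse(type = "norm") +
  theme_minimal()
print(p)
```

Mapping shape inside geom_point() rather than in the top-level aes() keeps the ellipses grouped by cluster only. With your own data, replace `scaled` and `time` with Scaled_data and your time vector, which must have one entry per row.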

bbiasi