
I want to color the branches of a dendrogram by specific groups that are defined in a data frame.

library(reshape2)
library(factoextra) # clustering visualization 
library(dendextend)
#iris dataset
#defining colors
colori = rep(NA, length=length(iris$Species))
colori[which(iris$Species=="setosa")] = "red"
colori[which(iris$Species=="versicolor")] = "blue"
colori[which(iris$Species=="virginica")] = "yellow"

iris_dist <- dist(iris[, 1:4])
hc1_iris <- hclust(iris_dist, method = "average")
col_dendro_iris <- color_branches(as.dendrogram(hc1_iris), groupLabels = TRUE, clusters = iris$Species, col = colori)

col_dendro_iris_plot <- plot(col_dendro_iris,main = "Dendrogram of normalized BLS\ncolored by manmade groups",labels = NULL,xlab = NULL)

That only colors the branches red. Why? How can I solve that?

EDIT: It works when I do this

library(FactoMineR) # provides PCA()
pca_iris <- PCA(iris[, 1:4])
colori <- rep(NA, length = length(iris$Species))
colori[which(iris$Species == "versicolor")] <- "red"
colori[which(iris$Species == "virginica")] <- "yellow"
colori[which(iris$Species == "setosa")] <- "blue"
# species <- iris$Species
iris_gr <- cbind(iris, colori)
#
pca_iris <- fviz_pca_ind(pca_iris,
             pointshape = 21, habillage = iris$Species,
             geom.ind = c("point"), geom = c("point"),
             # one color per Species level, in level order: setosa, versicolor, virginica
             palette = c("blue", "red", "yellow"),
             title = "PCA of normalized BLS\ncolored by manmade groups")
# "upper.right" is not a valid legend position; "right" is used here instead
pca_iris <- pca_iris + theme(legend.position = "right")

Just for future readers. But I can't color the dendrogram in an analogous way. I do not have a k or h value for defining clusters. As in iris, I have predefined groups that I want to color by.

StupidWolf
takeITeasy
  • Does this answer your question? [How to create a dendrogram with colored branches?](https://stackoverflow.com/questions/18036094/how-to-create-a-dendrogram-with-colored-branches) – UseR10085 May 20 '20 at 06:56
  • As I wrote in my edit, I do not have a `k` or `h` value for defining clusters, but predefined groups. In the iris example it is the species – takeITeasy May 20 '20 at 10:17
  • For predefined groups, I suggest using color_labels to color the text with this information. Also take a look at: http://talgalili.github.io/dendextend/reference/colored_bars.html – Tal Galili May 28 '20 at 08:53
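
The suggestion in the last comment could be sketched roughly as follows for the iris example. This is only a sketch under a few assumptions: the names `group_cols` and `leaf_groups` are made up for illustration, and the label colors are set with `labels_colors<-` from dendextend rather than with `color_labels` itself. The key point is that the group vector has to be reordered into the leaf order of the dendrogram before it is used to color the labels.

```r
library(dendextend)

hc <- hclust(dist(iris[, 1:4]), method = "average")
dend <- as.dendrogram(hc)

# One color per predefined group (hypothetical helper vector)
group_cols <- c(setosa = "red", versicolor = "blue", virginica = "yellow")

# Species of each leaf, in the left-to-right order of the dendrogram
leaf_groups <- iris$Species[order.dendrogram(dend)]

# Color the leaf labels by their predefined group
labels_colors(dend) <- unname(group_cols[as.character(leaf_groups)])

plot(dend, main = "Leaves colored by predefined groups")

# Add a color bar under the tree; by default colored_bars() takes the
# colors in the original data order and reorders them itself
colored_bars(colors = unname(group_cols[as.character(iris$Species)]),
             dend = dend, rowLabels = "Species")
```

This colors the labels and adds a bar below the tree; it does not color the branches themselves.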

1 Answer


You should use the dendextend package. It provides functions for extending dendrogram objects.

Below is a simple example.

library(dendextend)
# hc_var is assumed to be an existing hclust result
dend_var <- as.dendrogram(hc_var)
dend_colored <- color_branches(dend_var, h = 10000, k = 7)
plot(dend_colored)

dend_var is a dendrogram or hclust tree object.

k is used to choose the number of groups.

h is used to choose the height at which to cut the tree.
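
To make the two arguments described above concrete, here is a minimal sketch on the iris data. The cut height of 2 is an arbitrary value picked for illustration, and the object names are assumptions. Note that this colors branches by clusters found from the tree, not by predefined groups.

```r
library(dendextend)

hc <- hclust(dist(iris[, 1:4]), method = "average")
dend <- as.dendrogram(hc)

# Cut the tree into a fixed number of groups ...
dend_k <- color_branches(dend, k = 3)

# ... or cut it at a fixed height instead (arbitrary value)
dend_h <- color_branches(dend, h = 2)

plot(dend_k, main = "color_branches with k = 3")
plot(dend_h, main = "color_branches with h = 2")

# Compare the k = 3 clusters with the predefined species
cluster_vs_species <- table(cutree(hc, k = 3), iris$Species)
print(cluster_vs_species)
```

The final table shows how far the clusters obtained by cutting the tree agree with the predefined species.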


Earl Mascetti
  • Thank you @SlowLearning. Actually I do not have an `h` or `k` value. My groups are predefined in a data.frame, like in iris – takeITeasy May 20 '20 at 11:59
  • @takeITeasy A dendrogram represents a hierarchical clustering. In this case you have just 4 groups, because you have just four variables. The dendrogram is built from the bottom up (I say this because at the bottom you will find all the variables that you use in your analysis). I use a dendrogram when I look for groups and need to see how my variables relate to each other (every step is a level). To choose the groups, for example, you should cut the dendrogram. – Earl Mascetti May 20 '20 at 13:55