I am plotting a dendrogram of data points that come with class labels, and I want to check whether agglomerative clustering groups points with the same label together. Color-coding the labels would make such a dendrogram much easier to read. Is there a way to achieve this with ggdendro in R?

chet

1 Answer

Stealing most of the setup from this post ...

library(ggplot2)
library(ggdendro)
data(mtcars)
x <- as.matrix(scale(mtcars))
dd.row <- as.dendrogram(hclust(dist(t(x))))
ddata_x <- dendro_data(dd.row)

p2 <- ggplot(segment(ddata_x)) +
  geom_segment(aes(x=x, y=y, xend=xend, yend=yend))

... and adding a grouping factor ...

labs <- label(ddata_x)
labs$group <- c(rep("Clust1", 5), rep("Clust2", 2), rep("Clust3", 4))
labs
#     x y label  group
# 1   1 0  carb Clust1
# 2   2 0    wt Clust1
# 3   3 0    hp Clust1
# 4   4 0   cyl Clust1
# 5   5 0  disp Clust1
# 6   6 0  qsec Clust2
# 7   7 0    vs Clust2
# 8   8 0   mpg Clust3
# 9   9 0  drat Clust3
# 10 10 0    am Clust3
# 11 11 0  gear Clust3

... you can use the aes(colour=) argument to geom_text() to color your labels:

p2 + geom_text(data=labs,
               aes(label=label, x=x, y=0, colour=group))

If you want to supply your own colors, you can use scale_colour_manual(), like this:

p2 + geom_text(data=labs,
               aes(label=label, x=x, y=0, colour=group)) +
     scale_colour_manual(values=c("blue", "orange", "darkgreen"))
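The group vector above is typed in by hand and relies on the leaf order. As a rough sketch, assuming you would rather look the groups up by label text, the same column can be built from a named vector. Here `cutree()` with `k = 3` is only a stand-in for real class labels, and `hc`, `grp`, and `p3` are names introduced for illustration.

```r
library(ggplot2)
library(ggdendro)

x  <- as.matrix(scale(mtcars))
hc <- hclust(dist(t(x)))
ddata_x <- dendro_data(as.dendrogram(hc))

# cutree() returns a named integer vector, one cluster id per variable.
# k = 3 is an arbitrary choice; a named vector of your own class
# labels could be substituted here.
grp <- cutree(hc, k = 3)

# Match on the label text so the result does not depend on leaf order.
labs <- label(ddata_x)
labs$group <- factor(grp[as.character(labs$label)])

p3 <- ggplot(segment(ddata_x)) +
  geom_segment(aes(x = x, y = y, xend = xend, yend = yend)) +
  geom_text(data = labs, aes(x = x, y = 0, label = label, colour = group))
p3
```

Because the lookup is by name, reordering the dendrogram (for example with a different linkage method) should not require editing the group assignment.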

Josh O'Brien

  • Thank you! I am relatively new to R, this helps me a lot. – chet Nov 09 '11 at 16:32
  • Running your code, I get 2 errors: 1) in `geom_segment`, it can't find `x0` but this is simply fixed by changing the arguments in `x=x, y=y, xend=xend, yend=yend`; 2) in `geom_text`, it says: `Don't know how to automatically pick scale for object of type function. Defaulting to continuous Error in data.frame(x = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11), y = 0, label = function (x, : arguments imply differing number of rows: 11, 1, 0`. How can I fix it? Because I have a similar case to deal with (i.e. coloring labels) and provides the same error. – Davide Passaretti Nov 10 '14 at 10:41
  • @DavidePassaretti -- With help from Andrie de Vries (**ggdendro**'s author) and Roland (another SO regular), I've edited the code so that it works with the current version of **ggdendro**. Thanks for the heads up that this answer was no longer functional! – Josh O'Brien Nov 11 '14 at 17:52
  • Is there a way to color the leaves instead of coloring the labels? Thanks – Mdhale May 24 '17 at 20:25