I am creating dendrograms using ggdendro and coloring them according to cutpoints in the branches. I'm using the approach provided by @jlhoward in this question (Colorize Clusters in Dendogram with ggplot2) but I run into problems when my leaf labels are very long.

Here is some example code:

df   <- USArrests 
labs <- paste("veryverylongtitlename",1:50,sep="")
rownames(df) <- labs

library(ggplot2)
library(ggdendro)
hc       <- hclust(dist(df), "ave")           # hierarchical clustering
dendr    <- dendro_data(hc, type="rectangle") # convert for ggplot
clust    <- cutree(hc,k=2)                    # find 2 clusters
clust.df <- data.frame(label=names(clust), cluster=factor(clust))
# dendr[["labels"]] has the labels, merge with clust.df based on label column
dendr[["labels"]] <- merge(dendr[["labels"]],clust.df, by="label")
# plot the dendrogram; note use of color=cluster in geom_text(...)
ggplot() + 
  geom_segment(data=segment(dendr), aes(x=x, y=y, xend=xend, yend=yend)) + 
  geom_text(data=label(dendr), aes(x, y, label=label, hjust=0, color=cluster), 
       size=3) +
  coord_flip() + scale_y_reverse(expand=c(0.2, 0)) + 
  theme(axis.line.y=element_blank(),
    axis.ticks.y=element_blank(),
    axis.text.y=element_blank(),
    axis.title.y=element_blank(),
    panel.background=element_rect(fill="white"),
    panel.grid=element_blank())

[Image: dendrogram produced by the code above]

As you can see, the labels here get cut off. I found this answer (decrease size of dendogram (or y-axis) ggplot), but I don't want to use it because I very much like the ability to use cutree to define my clusters. How can I manipulate the above code to fit the long labels? Many thanks!

viralrna
  • What if you did `scale_y_reverse(limits=c(160, -50))` to control the range of the `y-axis` rather than using `expand=` and just include as much space as you need. – MrFlick Jan 23 '19 at 21:03
  • Thanks for your answer. That does work, but is there any method that avoids having the expanded y axis tick labels? Having the -50 on the graph isn't ideal. – viralrna Jan 23 '19 at 21:08
  • You can place labels wherever you like and make them say whatever you like using the `breaks=` and `labels=` parameters of `scale_y_reverse`. Try just adding `breaks=c(150,100,50,0)` – MrFlick Jan 23 '19 at 21:09
  • Ah yes, that does the trick. I appreciate the help! – viralrna Jan 23 '19 at 21:12
  • Can you please add your solution as an answer so that the question no longer appears unanswered? – erc Jan 24 '19 at 08:00
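
Pulling the two suggestions from the comments together, a minimal sketch of the adjusted plot might look like the following. The limits (160 and -50) and the breaks are the values from MrFlick's comments and fit this USArrests example; they are assumptions for any other data and would likely need adjusting to the tree height and label length. The name `p` is only used here for illustration.

```r
library(ggplot2)
library(ggdendro)

# Same example data as in the question, with artificially long labels
df <- USArrests
rownames(df) <- paste("veryverylongtitlename", 1:50, sep = "")

hc       <- hclust(dist(df), "ave")             # hierarchical clustering
dendr    <- dendro_data(hc, type = "rectangle") # convert for ggplot
clust    <- cutree(hc, k = 2)                   # find 2 clusters
clust.df <- data.frame(label = names(clust), cluster = factor(clust))
dendr[["labels"]] <- merge(dendr[["labels"]], clust.df, by = "label")

p <- ggplot() +
  geom_segment(data = segment(dendr),
               aes(x = x, y = y, xend = xend, yend = yend)) +
  geom_text(data = label(dendr),
            aes(x, y, label = label, hjust = 0, color = cluster),
            size = 3) +
  coord_flip() +
  # limits: extend the axis past 0 (down to -50) to leave room for the labels
  # breaks: only draw ticks for the real height range, so no -50 tick appears
  scale_y_reverse(limits = c(160, -50), breaks = c(150, 100, 50, 0)) +
  theme(axis.line.y = element_blank(),
        axis.ticks.y = element_blank(),
        axis.text.y = element_blank(),
        axis.title.y = element_blank(),
        panel.background = element_rect(fill = "white"),
        panel.grid = element_blank())

p
```

Here `limits=` replaces the `expand=` argument from the original code, and the coloring by `cutree` clusters is unchanged. If the labels are still clipped, making the second limit more negative should give them more room.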
