
I have a dendrogram in R, built from hierarchical clustering with hclust. I colour different labels in different colours, but when I try to change the labels of my dendrogram (to values from the data frame the clustering is based on) using dendrogram = dendrogram %>% set("labels", dataframe$column), the labels are replaced but end up in the wrong positions. For example:

My dendrogram looks like this:

 ___|___
|      _|_
|     |   | 
|     1   0
2

When I change the labels as described above, they are replaced, but they are applied from left to right in the order they appear in the data frame. Suppose my original data frame looks like this:

df:
   Column1  Column2
0     1        A
1     2        B
2     3        C

what I want to have is this:

    ___|___
   |      _|_
   |     |   | 
   |     B   A
   C

But what I actually get is:

    ___|___
   |      _|_
   |     |   | 
   |     B   C
   A   

The clustering of the data and the conversion to a dendrogram were done as follows:

library(stringdist)  # provides stringdistmatrix()
d <- stringdistmatrix(df$Column1, df$Column1)
cl <- hclust(as.dist(d))
dend <- as.dendrogram(cl)

Can anybody tell me how I can label my dendrogram with the values of another column based on the index?

Tal Galili
sequence_hard

2 Answers


The dendextend package lets you update the labels of a dendrogram (or of an hclust object) directly, like this:

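x <- 1:5
dend <- as.dendrogram(hclust(dist(x)))

# Install dendextend if it is missing, then load it
if (!require(dendextend)) install.packages("dendextend")
library("dendextend")

labels(dend)            # current labels, in leaf order (left to right)
labels(dend) <- 21:25   # new labels are assigned in that same leaf order
labels(dend)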
Tal Galili
  • I've been struggling with this. When I try to assign string labels (from a column in a df), it doesn't seem to work. ```Warning message: In `labels<-.dendrogram`(`*tmp*`, value = list(name = c(4L, 9L, : The lengths of the new labels is shorter than the number of leaves in the dendrogram - labels are recycled.``` – Evan Zamir May 19 '18 at 22:33
  • Can you provide an example of data and code that will reproduce this issue? – Tal Galili May 20 '18 at 03:19

The hclust object you created, cl, has an element named "order" that gives the order in which the observations appear in the dendrogram, from left to right.

To change the labels, supply the new labels in that same order (cl$order), so that each one lands on the correct leaf:

df$Column2[cl$order]
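
A minimal base-R sketch of this reordering, under some assumptions: the data frame is toy data modelled on the one in the question, `dist()` stands in for `stringdistmatrix()` so that no extra package is needed, and the names `leaf_rows` and `labels_in_leaf_order` are purely illustrative.

```r
# Toy data modelled on the question's data frame (values are illustrative).
df <- data.frame(Column1 = c(1, 2, 3, 10),
                 Column2 = c("A", "B", "C", "D"),
                 stringsAsFactors = FALSE)

# Assumption: dist() replaces stringdistmatrix() to keep the sketch base-R only.
cl <- hclust(dist(df$Column1))

# cl$order[i] is the row index of the i-th leaf, counted from left to right.
leaf_rows <- cl$order

# The new labels, rearranged into leaf order.
labels_in_leaf_order <- df$Column2[leaf_rows]

# Alternative: hclust labels are indexed by row, so they can be set directly
# before converting to a dendrogram, with no reordering needed.
cl$labels <- df$Column2
dend <- as.dendrogram(cl)

# Both routes should give the same left-to-right labels.
print(labels(dend))
print(labels_in_leaf_order)
```

Setting `cl$labels` before calling `as.dendrogram()` sidesteps the ordering problem, because the labels of an hclust object are matched to rows rather than to leaf positions.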
Cath
  • Seems to work. I can't say for sure as now some of my labels are cut off from the dendrogram (as they are 'longer' strings) when I use plot(dend). Any idea what I could do to correct that? Thanks for your answer :-) – sequence_hard Nov 09 '15 at 14:43
  • @sequence_hard you can try to reduce `cex` or enlarge the margin (or a bit of both ;-) ) – Cath Nov 09 '15 at 14:45
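
One way the two suggestions in the comment above might look in base R. This is only a sketch: the data and the name `long_labels` are made up for illustration, and the margin and `lab.cex` values are arbitrary starting points that would need tuning for real labels.

```r
# Hypothetical long labels, standing in for the strings that get cut off.
long_labels <- c("a fairly long label", "another long label",
                 "short", "the longest label of them all")

cl <- hclust(dist(c(1, 2, 3, 10)))
cl$labels <- long_labels
dend <- as.dendrogram(cl)

# Enlarge the bottom margin (mar is bottom, left, top, right, in lines of text)
# and remember the previous settings so they can be restored afterwards.
old_par <- par(mar = c(12, 4, 2, 2))

# Shrink the leaf labels via nodePar; pch = NA suppresses the node symbols.
plot(dend, nodePar = list(lab.cex = 0.8, pch = NA))

par(old_par)
```

For a horizontal plot (`plot(dend, horiz = TRUE)`), the labels sit on the right, so it would be the fourth element of `mar` that needs enlarging.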