From a data frame data.main, I can generate an hclust dendrogram as follows:

aa1<- c(2,4,6,8)
bb1<- c(1,3,7,11)
aa2<-c(3,6,9,12)
bb2<-c(3,5,7,9)
data.main<- data.frame(aa1,bb1,aa2,bb2)
d1<-dist(t(data.main))
hcl1<- hclust(d1)
plot(hcl1)

I know there are ways to use a tree cutoff to color the branches or leaves. However, is it possible to color them based on partial column names or column numbers (e.g. I want the branches corresponding to aa1 and aa2 to be red and those for bb1 and bb2 to be blue)?

I have checked the R package dendextend but am still not able to find a direct/easy way to get the desired result.

Dendrogram with aa2 and bb2 clustered most closely. Then bb1 is next closest, followed by aa1. The labels and branches are colored based on the label. Those starting with "aa" are red and those starting with "bb" are blue.

Tal Galili
Polar.Ice
  • Please include a [reproducible example](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) with sample input data and describe what you would like the output to look like for that specific data. This will make it much easier to help you. – MrFlick May 05 '15 at 20:36
  • I have edited the question and hope that it is clearer now. – Polar.Ice May 06 '15 at 06:42
  • @MrFlick, sorry for the confusion. In the earlier edit, although I mentioned that "I want that branch corresponding to aa1, aa2 be red and bb1 and bb2 be blue", I didn't provide the right figure. – Polar.Ice May 06 '15 at 18:15

3 Answers

It's easier to change colors for a dendrogram than an hclust object, but it's pretty straightforward to convert. You can do

drg1 <- dendrapply(as.dendrogram(hcl1, hang=.1), function(n){
  if(is.leaf(n)){
    # pick the color from the first letter of the leaf label
    labelCol <- c(a="red", b="blue")[substr(attr(n,"label"),1,1)]
    attr(n, "nodePar") <- list(pch = NA, lab.col = labelCol) # color the label, draw no point symbol
    attr(n, "edgePar") <- list(col = labelCol) # to color branch as well
  }
  n
})
plot(drg1)

which will draw the dendrogram with the "aa" labels and their leaf branches in red and the "bb" labels and their leaf branches in blue.
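If the goal is to color by column number rather than by name, and to hide the labels entirely (as the asker mentions wanting for about 500 leaves), the same dendrapply approach can be adapted. The following is a minimal sketch under those assumptions; the names col_by_column, color_leaf_edges and drg_branches are made up for illustration and are not part of the answer above.

```r
# Rebuild the example data from the question.
data.main <- data.frame(aa1 = c(2, 4, 6, 8), bb1 = c(1, 3, 7, 11),
                        aa2 = c(3, 6, 9, 12), bb2 = c(3, 5, 7, 9))
hcl1 <- hclust(dist(t(data.main)))

# Hypothetical lookup: one color per column, given in column order.
# Columns 1 and 3 (aa1, aa2) are red; columns 2 and 4 (bb1, bb2) are blue.
col_by_column <- setNames(c("red", "blue", "red", "blue"), colnames(data.main))

# Illustrative helper: color only the branch that ends in each leaf.
color_leaf_edges <- function(n) {
  if (is.leaf(n)) {
    attr(n, "edgePar") <- list(col = col_by_column[[attr(n, "label")]])
  }
  n
}
drg_branches <- dendrapply(as.dendrogram(hcl1), color_leaf_edges)

# leaflab = "none" suppresses the leaf labels, so only the colored branches show.
plot(drg_branches, leaflab = "none")
```

Keying the lookup by column name means the colors stay attached to the right leaves even though the dendrogram reorders them.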

MrFlick

UPDATE

I'm only leaving my answer because it is valid and someone might find OOMPA useful. However, after seeing the solution of using dendrapply as suggested by MrFlick, I recommend it instead. You might find other features of the OOMPA package useful, but I wouldn't install it just for functionality that already exists in core R.


Original Answer

Install OOMPA (Object-Oriented Microarray and Proteomics Analysis package):

source("http://silicovore.com/OOMPA/oompaLite.R")
oompaLite()

Then use the plotColoredClusters function from the library ClassDiscovery:

library(ClassDiscovery)
aa1<- c(2,4,6,8)
bb1<- c(1,3,7,11)
aa2<-c(3,6,9,12)
bb2<-c(3,5,7,9)
data.main<- data.frame(aa1,bb1,aa2,bb2)
d1<-dist(t(data.main))
hcl1<- hclust(d1)

# identify the labels
labels <- hcl1$labels

# Choose which ones are in the "aa" group
aa_present <- grepl("aa", labels)

colors <- ifelse(aa_present, "red", "blue")

plotColoredClusters(hcl1,labs=labels,cols=colors)

Result:

Cluster diagram with aa2 and aa1 both colored red while bb1 and bb2 are colored blue

Community
Christopher Bottoms
  • Thanks for your reply. However, when I tried installing OOMPA I ran into a few problems. 1. Possibly the OOMPA server is down: _Warning in install.packages : cannot open: HTTP status was '404 Not Found' Warning in install.packages : unable to access index for repository http://www.rforge.net/bin/macosx/mavericks/contrib/3.1 package_ 2. ‘oompaBase’ is not available (for R version 3.1.2). Further, my priority is to get colored branches (not just labels). I have about 500 labels, which would overlap, so I would like to skip the labels and just see the branches. – Polar.Ice May 06 '15 at 17:56
  • If you are interested in using OOMPA, you might want to try later. I also have R version 3.1.2. I just installed OOMPA on Linux (CentOS 6) after I saw your question. – Christopher Bottoms May 06 '15 at 18:01
  • Well, I am using R on macOS. I tried again but still get those warnings. _Warning: unable to access index for repository http://silicovore.com/OOMPA/bin/macosx/mavericks/contrib/3.1_ _Warning: unable to access index for repository http://R-Forge.R-project.org/bin/macosx/mavericks/contrib/3.1_ _Warning: unable to access index for repository http://www.rforge.net/bin/macosx/mavericks/contrib/3.1_ _Warning message: packages ‘oompaBase’, ‘oompaData’, ‘PreProcess’, ‘ClassDiscovery’, ‘ClassComparison’ are not available (for R version 3.1.2)_ P.S. I am sure that my internet speed is great! – Polar.Ice May 06 '15 at 18:11
  • For the present question, I recommend doing as MrFlick suggested in his answer and use the `dendrapply` function (which comes with R already). – Christopher Bottoms May 06 '15 at 18:18
  • If you really want to install OOMPA and cannot do it, then I would recommend posting that problem as a separate question. – Christopher Bottoms May 06 '15 at 18:18

Ice, the dendextend package lets you do this using the assign_values_to_leaves_edgePar function.

Here is how to use it:

aa1 <- c(2,4,6,8)
bb1 <- c(1,3,7,11)
aa2 <- c(3,6,9,12)
bb2 <- c(3,5,7,9)
data.main <- data.frame(aa1,bb1,aa2,bb2)
d1 <- dist(t(data.main))
hcl1 <- hclust(d1)
# plot(hcl1)

library(dendextend) # provides assign_values_to_leaves_edgePar
dend <- as.dendrogram(hcl1)
col_aa_red <- ifelse(grepl("aa", labels(dend)), "red", "blue")
dend2 <- assign_values_to_leaves_edgePar(dend=dend, value = col_aa_red, edgePar = "col")
plot(dend2)

Result: a dendrogram in which the leaf branches for aa1 and aa2 are red and those for bb1 and bb2 are blue.
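The question's target figure colors the label text as well as the branches. As a hedged sketch, assuming dendextend is installed from CRAN, the package's labels_colors<- assignment can be combined with the call above; the names leaf_cols and dend_colored are illustrative only.

```r
library(dendextend)  # assumed to be installed from CRAN

data.main <- data.frame(aa1 = c(2, 4, 6, 8), bb1 = c(1, 3, 7, 11),
                        aa2 = c(3, 6, 9, 12), bb2 = c(3, 5, 7, 9))
dend <- as.dendrogram(hclust(dist(t(data.main))))

# labels(dend) returns the labels in left-to-right leaf order,
# so this color vector lines up with the leaves.
leaf_cols <- ifelse(grepl("^aa", labels(dend)), "red", "blue")

# Color the leaf branches, then the label text, with the same vector.
dend_colored <- assign_values_to_leaves_edgePar(dend, value = leaf_cols,
                                                edgePar = "col")
labels_colors(dend_colored) <- leaf_cols
plot(dend_colored)
```

Anchoring the pattern with "^aa" restricts the match to labels that start with "aa", which is slightly stricter than matching "aa" anywhere in the name.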

Tal Galili