I am trying to cluster a protein-DNA interaction dataset and draw a heatmap using heatmap.2 from the R package gplots. My matrix is symmetric.
Here is a copy of the dataset I am using after it has been run through Pearson correlation: DataSet

Here is the complete process I follow to generate these graphs: I generate a distance matrix using a correlation measure (Pearson in my case), then pass that matrix to R and run the following code on it:

library(RColorBrewer)
library(gplots)
library(MASS)
args <- commandArgs(TRUE)
# Read the tab-separated matrix; the first column holds the row names
matrix_a <- read.table(args[1], sep='\t', header=TRUE, row.names=1)
# Center and scale each column
mtscaled <- as.matrix(scale(matrix_a))
# location <- args[2];
# setwd(args[2]);
pdf("result.pdf", pointsize = 15, width = 18, height = 18)
mycol <- c("blue","white","red")
# 6 + 4 + 7 = 17 break points, one more than the 16 colors in bluered(16)
my.breaks <- c(seq(-5, -.6, length.out=6),seq(-.5999999, .1, length.out=4),seq(.100009,5, length.out=7))
#colors <- colorpanel(75,"midnightblue","mediumseagreen","yellow") 
result <- heatmap.2(mtscaled, Rowv=TRUE, scale='none', dendrogram="row", symm = TRUE, col=bluered(16), breaks=my.breaks)
dev.off()

The issue I am having is that once I use breaks to control the color separation, the heatmap no longer looks symmetrical.

Here is the heatmap before I use breaks; as you can see, it looks symmetrical: Without Breaks

Here is the heatmap when breaks are used: With breaks

I have played with the cutoffs for the sequences, for instance to make sure one sequence does not end exactly where the next begins, but I have not been able to solve this problem. I would like to use the breaks to help bring out the clusters more.
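One way to avoid hand-tuned cutoffs such as -.5999999 is to build the segments from shared boundaries and drop the duplicated first point of each later segment. The sketch below reuses the boundaries from the code above (-5, -0.6, 0.1, 5); the palette from colorRampPalette and the resulting number of colors (14 rather than 16) are illustrative assumptions. It only tidies how the breaks are built and is not claimed to change the symmetry of the plot.

```r
# Segment boundaries taken from the question's my.breaks
seg1 <- seq(-5,   -0.6, length.out = 6)
seg2 <- seq(-0.6,  0.1, length.out = 4)[-1]  # drop the shared point -0.6
seg3 <- seq( 0.1,  5,   length.out = 7)[-1]  # drop the shared point 0.1
my.breaks <- c(seg1, seg2, seg3)

# heatmap.2 expects one more break than colors
# (illustrative palette, not the one from the question)
my.colors <- colorRampPalette(c("blue", "white", "red"))(length(my.breaks) - 1)

# Sanity checks: breaks strictly increasing, lengths consistent
stopifnot(all(diff(my.breaks) > 0))
stopifnot(length(my.breaks) == length(my.colors) + 1)
```

These vectors could then be passed as breaks=my.breaks and col=my.colors in the heatmap.2 call.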

Here is an example of what it should look like; this image was made using cluster maker: enter image description here

I don't expect it to look identical to that, but I would like my heatmap to be more symmetrical and to have better-defined clusters. The image was created using the same data.

Alos

2 Answers

After some investigating, I noticed that the values were changing after my matrix was run through heatmap or heatmap.2. For example, the interaction from the provided data set between

Pacdh-2 and pegg-2

gave a value of 0.0250313 before the matrix was sent to heatmap.
After that, I looked at the matrix values using result$carpet, and the values were then

-0.224333135 -1.09805379

for the two interactions

So I decided to reorder the original matrix based on the dendrogram from the clustered matrix, so that I could be sure the values stayed the same. I used the following Stack Overflow question for help: Order of rows in heatmap?

Here is the code used for that:

rowInd <- rev(order.dendrogram(result$rowDendrogram))
colInd <- rowInd
data_ordered <- matrix_a[rowInd, colInd]
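As a self-contained illustration of this reordering, here is a sketch on a made-up 5 x 5 correlation matrix. The dendrogram is built directly with hclust(dist(...)) as a stand-in for result$rowDendrogram; the toy data and the names m, dend and ordered are assumptions for the example. Because the same index is applied to rows and columns, the reordered matrix stays symmetric.

```r
set.seed(1)
x <- matrix(rnorm(40), nrow = 5)          # made-up data, 5 items
m <- cor(t(x))                            # 5 x 5 symmetric correlation matrix
dimnames(m) <- list(paste0("p", 1:5), paste0("p", 1:5))

# Stand-in for result$rowDendrogram from heatmap.2
dend <- as.dendrogram(hclust(dist(m)))

# Same permutation for rows and columns, as in the snippet above
rowInd <- rev(order.dendrogram(dend))
colInd <- rowInd
ordered <- m[rowInd, colInd]

isSymmetric(ordered)   # the permuted matrix is still symmetric
```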

I then used another program "matrix2png" to draw the heatmap: enter image description here

I still have to play around with the colors, but at least now the heatmap is symmetrical and clustered.

Looking into it further, the issue seems to be that I was running scale(matrix_a). scale() centers and scales each column separately, so a symmetric input does not stay symmetric. When I change my code to just mtscaled <- as.matrix(matrix_a), the result looks symmetrical.
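The effect of scale() on a symmetric matrix can be checked on a small example. The 3 x 3 matrix below is made up for illustration; the point is only that entry [i, j] and entry [j, i] are transformed with different column means and standard deviations.

```r
# Toy symmetric matrix (made-up values standing in for correlations)
m <- matrix(c(1.0, 0.8, 0.1,
              0.8, 1.0, 0.3,
              0.1, 0.3, 1.0),
            nrow = 3, byrow = TRUE,
            dimnames = list(c("a", "b", "c"), c("a", "b", "c")))

sym_before <- isSymmetric(m)          # TRUE for the raw matrix

# scale() works column by column: each column gets its own center and sd
scaled <- scale(m)

# Largest difference between the scaled matrix and its transpose;
# anything clearly above zero means symmetry was lost
max_diff <- max(abs(scaled - t(scaled)))

sym_before
max_diff
```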

Alos

I'm certainly not the person to attempt reproducing and testing this from that strange data object without code that would read it properly, but here's an idea:

  ..., col=bluered(20)[4:20], ...

Here's another thought, which should return the full range of red that the above strategy would not:

 shift.BR<- colorRamp(c("blue","white", "red"), bias=0.5 )((1:16)/16)
 heatmap.2( ...., col=rgb(shift.BR, maxColorValue=255), .... )

Or you can use this vector:

> rgb(shift.BR, maxColorValue=255)
 [1] "#1616FF" "#2D2DFF" "#4343FF" "#5A5AFF" "#7070FF" "#8787FF" "#9D9DFF" "#B4B4FF" "#CACAFF" "#E1E1FF" "#F7F7FF"
[12] "#FFD9D9" "#FFA3A3" "#FF6C6C" "#FF3636" "#FF0000"

There was a somewhat similar question (also today) that asked for a blue-to-red solution for a set of values from -1 to 3 with white at the center. This is the code and output for that question:

test <- seq(-1,3, len=20)
shift.BR <- colorRamp(c("blue","white", "red"), bias=2)((1:20)/20)
tpal <- rgb(shift.BR, maxColorValue=255)
barplot(test,col = tpal)

enter image description here

(But that would seem to be the wrong direction for the bias in your situation.)

IRTFM
  • Thanks for the suggestion. When I try your solution, however, I get a result similar to what I was getting before I used the breaks. The issue there is that I lose all resolution of the clusters. I need to use breaks, or something like them, so that I can control the color boundaries, but I also need the generated heatmap to look symmetrical; instead, the top right and bottom left corners of the heatmap have different color patterns. – Alos Sep 29 '12 at 02:29
  • Actually I gave two suggestions. – IRTFM Sep 29 '12 at 02:32
  • True enough, my mistake; when I said suggestion I meant your answer as a whole. I played with your other suggestions and I think it is close. Here is what it looks like: https://plus.google.com/photos/108505223022973601424/albums/5794041426425288545?authkey=CJbK_dTPx6OpnAE So it looks better, but the top left and top right still are not there. The other issue is that I am having trouble calibrating "bias=2)((1:20)/20)" so that the color range has more definition in the red. Thanks – Alos Sep 30 '12 at 18:43
  • Looks to me like you want a non-monotonic color map, where it reaches its densest red around 1 and stays dense red between 1 and 3. That should not be too difficult with the methods illustrated. – IRTFM Sep 30 '12 at 19:27
  • Hi DWin, when I stopped doing mtscaled <- as.matrix(scale(matrix_a)) and instead did just mtscaled <- as.matrix(matrix_a), the result is symmetrical. Now I can play around with the code more. Thank you for the advice +1. – Alos Oct 02 '12 at 15:01
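The color map described in the comments (full red reached around 1 and held up to 3) could be sketched with explicit breaks rather than a bias setting. This is one possible reading of that comment, not code from the thread; the bin counts n_blue and n_red and the use of colorRampPalette and cut are assumptions made for the illustration.

```r
n_blue <- 5   # bins between -1 and 0 (assumed count)
n_red  <- 5   # bins between 0 and 1 (assumed count)

# Fine bins from -1 to 1, then one wide bin covering 1 to 3
breaks <- c(seq(-1, 0, length.out = n_blue + 1),
            seq(0, 1, length.out = n_red + 1)[-1],
            3)

# Blue to white below 0, white to red up to 1, solid red above 1
cols <- c(colorRampPalette(c("blue", "white"))(n_blue),
          colorRampPalette(c("white", "red"))(n_red),
          "red")

# Same test values as in the answer; assign each value the color of its bin
test <- seq(-1, 3, length.out = 20)
tpal <- cols[as.integer(cut(test, breaks, include.lowest = TRUE))]

# barplot(test, col = tpal)   # uncomment to view the result
```

The same breaks and cols vectors have the one-more-break-than-colors shape that heatmap.2 expects for its breaks and col arguments.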