Overlaying values as heatmap on matrix scatter plot in R

Question

I have imported 9 columns of data using read.csv function calling the object myData and plotted it using the pairs function as below.

  pairs(myData[,c(1:9)], 
  lower.panel = panel.smooth,
  diag.panel = NULL,
  pch=1, cex= 1,  
  cex.labels = 1,
  cex.axis = 1,
  gap = 0.35, 
  font.labels = NULL,
  col="Black")

What I was hoping is to put the persons correlation of the different plots as a heat map on the matrix scatter plot as the individual scatter plot background colour. The function required for calculating the pearson correlation is below

cor(myData, method = "pearson")

This function gives the numbers I need (to construct a heat map) but I have no idea how to colour individual plots in the lower.panel argument based on persons value generated.

Please provide some sample data, e.g. using `dput`. You can also try the `corrgram` function in the `library(corrgram)` package. You can display both, a scatter plot as well as the p.values as a heatmap in one plot. Alternativly you can try the ggpairs function from the packages `ggplot2` and `GGally`: `ggpairs(iris, columns = 1:4, aes(color=Species))` — Roman, Oct 13 '16 at 14:22

score 1 · Accepted Answer · edited May 23 '17 at 10:30

I think I have an answer that should work moderately well, using base graphics. Whether it's better than the corrgram alternative mentioned in the comments I'm not sure about but... It leans heavily on a couple other posts such as this question on adding colour to panel plot background and the answer to this question on obtaining a colour gradient.

# Sample data to work with
data(iris)

# create a custom panel function that colours panels based on bg (taken from the first 
# linked question. I've just added a line to add a loess smoother, as per your code

mypanel <- function(x, y, ...){
  count <<- count+1
  bg <- color[count]
  ll <- par("usr")
  rect(ll[1], ll[3], ll[2], ll[4], col=bg)
  lines(lowess(x, y), col = "red")
  points(x, y, cex=0.5)
}

# get the values for the pearson correlations
corres <- cor(iris[1:4], method = "pearson")[lower.tri(cor(iris[1:4])) == T]

# create a colour ramp between two colours, for as many values as you have panels.
colfunc <- colorRampPalette(c("gray90", "gray20"))
color <- colfunc(length(corres))

# reorder that colour vector based on the rank of the correlation values
# (so the "highest" colour matches the highest value etc.)
color <- color[order(corres)]

# counter used in panel function
count <- 0

# plot the pairs plot using "mypanel" on lower.panel rather than panel.smooth
pairs(iris[,c(1:4)], 
      lower.panel = mypanel,
      diag.panel = NULL,
      pch=1, cex= 1,  
      cex.labels = 1,
      cex.axis = 1,
      gap = 0.35, 
      font.labels = NULL,
      col="Black")

That results in this plot. Fiddling around with the colours in colourRampPalette should hopefully be enough to give you what you want.

Hope that's useful.

May I ask how I could add each of the pearson values as a text/label to each of the plots? Lets say top left corner of each plot (in the lower panel). — Behzad Rowshanravan, Oct 13 '16 at 21:19
I would normally just add them to the top right diagonal panel using the panel.cor function described in the pairs function helpfile. Otherwise you could perhaps add the text to each plot in the mypanel function, although you would have to specify x and y locations. This might be easier with Jimbou's ggpairs solution though (which I personally prefer). — Adam Kimberley, Oct 14 '16 at 09:24
I noticed if you add this line panel.pearson.correlation.coefficient(x, y, digits = 3, prefix = "", cex.cor, ...) to the mypanel function and execute the script then you will notice that the pearson numbers do not match the colours of the panel so panels with low pearson coefficient have a darker background than panels with higher pearson coefficient. Is there a fix for this? Many thanks in advance. — Behzad Rowshanravan, Jun 23 '17 at 16:58

score 1 · Answer 2 · answered Oct 13 '16 at 15:37

You can try ggpairs. There it is relatively easy to change the backgroud color. The idea is to plot the data as the pairs() function does. Then create a heatmap color code accordingly to the pearson coefficients and finally change the background.

library(ggplot2)
library(GGally)
# iris as testdata

# The plot with smooth lines and points in the upper panel. 
p <- ggpairs(iris[-5], upper=list(continuous="points"), lower=list(continuous="smooth_loess"), diag=list(continuous="barDiag"))

# Create a heatmap color map
# correlations
pr <- cor(iris[-5])
# set breaks
breaks <-  seq(-1,1.0,0.01)
# binning
pr_b <- .bincode(pr, breaks, include.lowest = T)
# transform the pearsons in colors using redblue() palette
pr_b <- matrix(factor(pr_b, levels = 1:length(breaks), labels = rev(redblue(length(breaks)))), p$nrow)
pr
             Sepal.Length Sepal.Width Petal.Length Petal.Width
Sepal.Length    1.0000000  -0.1175698    0.8717538   0.8179411
Sepal.Width    -0.1175698   1.0000000   -0.4284401  -0.3661259
Petal.Length    0.8717538  -0.4284401    1.0000000   0.9628654
Petal.Width     0.8179411  -0.3661259    0.9628654   1.0000000
pr_b
     [,1]      [,2]      [,3]      [,4]     
[1,] "#FF0303" "#E0E0FF" "#FF2121" "#FF3030"
[2,] "#E0E0FF" "#FF0303" "#9191FF" "#A1A1FF"
[3,] "#FF2121" "#9191FF" "#FF0303" "#FF0A0A"
[4,] "#FF3030" "#A1A1FF" "#FF0A0A" "#FF0303"

# Update the background color using a for loop. The diagonal slots are overwritten by an empty plot
for(i in 1:p$nrow) {
  for(j in 1:p$ncol){
    p[i,j] <- p[i,j] + 
      theme(panel.background= element_rect(fill=pr_b[i,j]))
    if(i == j){
      p[i,j] <-ggplot()+ annotate("text",5,5,label=colnames(iris)[i]) + theme_void()
  }
}}

# The plot
p

score 1 · Answer 3 · answered Mar 28 '17 at 17:18

This is easy to do with the 'corrgram' package, which comes with 'panel.pts' and 'panel.shade' functions. I merged these two functions together into a function called 'panel.shadepoints' and defined a color ramp with lighter colors so that the points could still be seen.

panel.shadepoints <- function(x, y, corr=NULL, col.regions, cor.method, ...){

  # If corr not given, try to calculate it
  if(is.null(corr)) {
    if(sum(complete.cases(x,y)) < 2) {
      warning("Need at least 2 complete cases for cor()")
      return()
    } else {
      corr <- cor(x, y, use='pair', method=cor.method)
    }
  }

  ncol <- 14
  pal <- col.regions(ncol)
  col.ind <- as.numeric(cut(corr, breaks=seq(from=-1, to=1, length=ncol+1),
                            include.lowest=TRUE))
  usr <- par("usr")
  # Solid fill
  rect(usr[1], usr[3], usr[2], usr[4], col=pal[col.ind], border=NA)

  # Overlay points
  plot.xy(xy.coords(x, y), type="p", ...)

  # Boounding box needs to plot on top of the shading, so do it last.
  box(col='lightgray')
}

data(iris)
redblue<-colorRampPalette(c("pink","gray90","skyblue"))
corrgram(iris, panel=panel.shadepoints, col=redblue)

Overlaying values as heatmap on matrix scatter plot in R

3 Answers3