My question is twofold:

I have a ggpairs plot with the default upper = list(continuous = "cor") and I would like to colour the tiles by correlation value (exactly like ggcorr does).

I have this: [image: ggpairs plot of daily flows]
I would like the correlation values of the plot above to be coloured like this: [image: ggcorr heatmap of correlation values]

library(GGally)

sample_df <- data.frame(replicate(7,sample(0:5000,100)))
colnames(sample_df) <- c("KUM", "MHP", "WEB", "OSH", "JAC", "WSW", "gaugings")

ggpairs(sample_df, lower = list(continuous = "smooth"))  
ggcorr(sample_df, label = TRUE, label_round = 2)

I had a brief go at using upper = list(continuous = wrap(ggcorr)) but didn't have any luck and, given that both functions return plot calls, I don't think that's the right path?

I am aware that I could build this in ggplot (e.g. Sandy Muspratt's solution) but given that the GGally package already has the functionality I am looking for I thought I might be overlooking something.


More broadly, I would like to know whether, and how, the correlation values can be extracted. A simpler option may be to colour the labels rather than the tiles (i.e. this question, using colour rather than size), but I need a variable to map to colour...

Being able to call the correlation values to use in other plots would be handy although I suppose I could just recalculate them myself.
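
For reference, recalculating the values is short in base R. The following is a minimal sketch, assuming Pearson correlations on the sample data above; the object names corr_mat, corr_long and corr_upper are only illustrative.

```r
# Recompute the pairwise correlations directly (base R only).
set.seed(1)
sample_df <- data.frame(replicate(7, sample(0:5000, 100)))
colnames(sample_df) <- c("KUM", "MHP", "WEB", "OSH", "JAC", "WSW", "gaugings")

# Full correlation matrix; Pearson is the default method of cor()
corr_mat <- cor(sample_df, use = "pairwise.complete.obs")

# Long format: one row per ordered pair of variables
corr_long <- as.data.frame(as.table(corr_mat), stringsAsFactors = FALSE)
names(corr_long) <- c("var1", "var2", "corr")

# Keep each unordered pair once (upper triangle, no diagonal)
corr_upper <- corr_long[as.vector(upper.tri(corr_mat)), ]
head(corr_upper)
```

The corr column of corr_upper could then be mapped to colour or fill in any other plot.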

Thank you!

eipi10
MadiN

2 Answers

A possible solution is to get the list of colors from the ggcorr correlation matrix plot and to set these colors as background in the upper tiles of the ggpairs matrix of plots.

library(GGally)   
library(mvtnorm)
# Generate data
set.seed(1)
n <- 100
p <- 7
A <- matrix(runif(p^2)*2-1, ncol=p) 
Sigma <- cov2cor(t(A) %*% A)
sample_df <- data.frame(rmvnorm(n, mean=rep(0,p), sigma=Sigma))
colnames(sample_df) <- c("KUM", "MHP", "WEB", "OSH", "JAC", "WSW", "gaugings")

# Matrix of plots
p1 <- ggpairs(sample_df, lower = list(continuous = "smooth"))  
# Correlation matrix plot
p2 <- ggcorr(sample_df, label = TRUE, label_round = 2)

The correlation matrix plot is:

[image: correlation matrix plot produced by ggcorr]

# Get list of colors from the correlation matrix plot
library(ggplot2)
g2 <- ggplotGrob(p2)
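# NOTE: these grob indices depend on the layout ggplot2 builds and may differ
# between versions; inspect g2 if this returns NULL
colors <- g2$grobs[[6]]$children[[3]]$gp$fill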

# Set the background colour of each tile in the upper triangle of the plot matrix
idx <- 1
for (k1 in 1:(p-1)) {
  for (k2 in (k1+1):p) {
    plt <- getPlot(p1, k1, k2) +
      theme(panel.background = element_rect(fill = colors[idx], color = "white"),
            panel.grid.major = element_line(color = colors[idx]))
    p1 <- putPlot(p1, plt, k1, k2)
    idx <- idx + 1
  }
}
print(p1)

[image: ggpairs plot with the upper tiles coloured by correlation]

Marco Sandri

You can map a background colour to the cell by writing a quick custom function that can be passed directly to ggpairs. This involves calculating the correlation between the pairs of variables, and then matching to some user specified colour range.

my_fn <- function(data, mapping, method = "p", use = "pairwise", ...) {

  # grab data
  x <- eval_data_col(data, mapping$x)
  y <- eval_data_col(data, mapping$y)

  # calculate correlation
  corr <- cor(x, y, method = method, use = use)

  # calculate colour based on correlation value
  # Here I have set a correlation of minus one to blue,
  # zero to white, and one to red
  # Change this to suit: possibly extend to add as an argument of `my_fn`
  colFn <- colorRampPalette(c("blue", "white", "red"), interpolate = "spline")
  fill <- colFn(100)[findInterval(corr, seq(-1, 1, length.out = 100))]

  ggally_cor(data = data, mapping = mapping, ...) +
    theme_void() +
    theme(panel.background = element_rect(fill = fill))
}
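
To see how the colour lookup behaves on its own, here is a minimal base-R sketch of the same idea. The helper name cor_to_fill is hypothetical, and it uses the default linear interpolation rather than the spline option in my_fn, so the intermediate shades may differ slightly.

```r
# Hypothetical helper: map correlations in [-1, 1] to a blue-white-red palette.
# Uses colorRampPalette's default (linear) interpolation, unlike my_fn.
cor_to_fill <- function(corr, n = 100) {
  palette <- colorRampPalette(c("blue", "white", "red"))(n)
  # n break points from -1 to 1; findInterval() returns the bin index (1..n)
  breaks <- seq(-1, 1, length.out = n)
  palette[findInterval(corr, breaks)]
}

# One hex colour per correlation value
cor_to_fill(c(-1, 0, 1))
```

A correlation of -1 falls in the first bin (pure blue) and a correlation of 1 in the last (pure red); values in between pick the corresponding shade.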

Using the data in Marco's answer:

library(GGally)    # version: ‘1.4.0’

p1 <- ggpairs(sample_df,
              upper = list(continuous = my_fn),
              lower = list(continuous = "smooth"))
print(p1)

Which gives:

[image: ggpairs plot with correlation tiles shaded from blue to red]


A follow-up question, "Change axis labels of a modified ggpairs plot (heatmap of correlation)", noted that updating the theme after the plot is built removes the panel.background colours. This can be fixed by dropping theme_void() and removing the grid lines within the theme instead, i.e. by changing the relevant part to the following (note that this fix is not required for ggplot2 v3.3.0):

ggally_cor(data = data, mapping = mapping, ...) +
  theme(panel.background = element_rect(fill = fill, colour = NA),
        panel.grid.major = element_blank())

user20650
  • is there any possibility to change the x-axis labels with your solution? https://stackoverflow.com/questions/60930523/change-axis-labels-of-a-modified-ggpairs-plot-heatmap-of-correlation – ava Mar 30 '20 at 12:43
  • @user20650: if I add a theme with `+ theme_minimal()` to `p1`, the `panel.background` colors are again removed. Any advice where to add the theme such that it is not eventually removed? Many thanks! – mavericks Jun 04 '20 at 08:24
  • Hi @mavericks; can you explain your expected outcome please as i'd expect theme_minimal to strip the panel colour (i.e. `p=ggplot()+theme(panel.background = element_rect(fill="red")); p ; p +theme_minimal()`) . But yes it can be / is a bit of a pain setting theme elements across ggpairs. – user20650 Jun 04 '20 at 09:05
  • @mavericks; there will be a way to do this more simply but you can piece the parts together to get the final effect (which I think you want). Add a `theme_minimal` to the lower & diagonal plots and remove the `strip.background` . So try ; `ggpairs(sample_df, upper = list(continuous = my_fn), lower = list(continuous = function(...) ggally_points(...)+theme_minimal()), diag = list(continuous = function(...) ggally_densityDiag(...)+theme_minimal()))+ theme(strip.background = element_blank())` – user20650 Jun 04 '20 at 09:35
  • @user20650 thanks a lot for your input! Just added [this SO question](https://stackoverflow.com/questions/62196950/ggpairs-plot-with-heatmap-of-correlation-values-with-significance-stars-and-cust/62196951#62196951) summarizing your previous advice which greatly helped me with my plots! Thanks again – mavericks Jun 04 '20 at 14:20