
I have some code in a Shiny app that produces the first plot below. As you can see, the font size scales with the magnitude of the correlation coefficient. I would like to produce something similar with ggpairs (GGally) or ggplot2. The second image below was produced with the following code:

library(GGally)
ggpairs(df, 
  upper = list(params = c(size = 10)),
  lower = list(continuous = "smooth", params = c(method = "loess", fill = "blue"))
)

As you can see, the size of the correlation font is adjustable using size, but when I pass a vector of sizes only the first value is used. I would also like to remove 'Corr:' and add an indicator of significance. Using colors for the sign of the correlation coefficient would also be nice. In lower, the method and fill parameters are not picked up by smooth. Any suggestions on how to get the 2nd plot to capture more features of the 1st would be great.

Anscombe's data:

df <- structure(list(y1 = c(8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 
4.26, 10.84, 4.82, 5.68), x1 = c(10L, 8L, 13L, 9L, 11L, 14L, 
6L, 4L, 12L, 7L, 5L), y2 = c(9.14, 8.14, 8.74, 8.77, 9.26, 8.1, 
6.13, 3.1, 9.13, 7.26, 4.74), x2 = c(10L, 8L, 13L, 9L, 11L, 14L, 
6L, 4L, 12L, 7L, 5L), y3 = c(7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 
6.08, 5.39, 8.15, 6.42, 5.73), x3 = c(10L, 8L, 13L, 9L, 11L, 
14L, 6L, 4L, 12L, 7L, 5L)), .Names = c("y1", "x1", "y2", "x2", 
"y3", "x3"), class = "data.frame", row.names = c(NA, -11L))

correlation plot using pairs

# based mostly on http://gallery.r-enthusiasts.com/RGraphGallery.php?graph=137
panel.plot <- function(x, y) {
    usr <- par("usr"); on.exit(par(usr))
    par(usr = c(0, 1, 0, 1))
    ct <- cor.test(x,y)
    sig <- symnum(ct$p.value, corr = FALSE, na = FALSE,
                  cutpoints = c(0, 0.001, 0.01, 0.05, 0.1, 1),
                  symbols = c("***", "**", "*", ".", " "))
    r <- ct$estimate
    rt <- format(r, digits=2)[1]
    cex <- 0.5/strwidth(rt)

    text(.5, .5, rt, cex=cex * abs(r))
    text(.8, .8, sig, cex=cex, col='blue')
}
panel.smooth <- function (x, y) {
      points(x, y)
      abline(lm(y~x), col="red")
      lines(stats::lowess(y~x), col="blue")
}
pairs(df, lower.panel=panel.smooth, upper.panel=panel.plot)

correlation plot using ggpairs

zx8754
Vincent

1 Answer


Edit for GGally 1.0.1

Since params is now deprecated, use wrap like so:

ggpairs(df[, 1:2], 
        upper = list(continuous = wrap("cor", size = 10)), 
        lower = list(continuous = "smooth"))
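The same wrap mechanism can carry other arguments through to the panel functions. The following is a minimal sketch, assuming that the installed GGally version exposes a method argument on both ggally_cor (the correlation type) and ggally_smooth (the smoother); check ?ggally_cor and ?ggally_smooth before relying on it. The built-in anscombe dataset stands in for the question's df.

```r
library(GGally)

# Two Anscombe columns, matching df[, 1:2] from the question.
df <- anscombe[, c("y1", "x1")]

p <- ggpairs(
  df,
  # Assumption: ggally_cor accepts `method` ("pearson", "spearman", "kendall").
  upper = list(continuous = wrap("cor", size = 10, method = "spearman")),
  # Assumption: ggally_smooth accepts `method` and forwards it to the smoother.
  lower = list(continuous = wrap("smooth", method = "loess"))
)
p
```

Arguments given to wrap are bound to that one panel type, which is why method no longer gets lost the way it did in the params list.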


Original answer

Complicated plots cannot always be customized through a parameter list. That's natural: there are far too many parameters to keep track of. In such cases the most reliable option is to modify the source, which is especially convenient when the project is hosted on GitHub.

Here's a simple modification to start with, made in a forked repo. The easiest way to update the code and produce the plot below is to copy and paste the function ggally_cor to your global environment, then override the same function in the GGally namespace:

# ggally_cor <- <...>
assignInNamespace("ggally_cor", ggally_cor, "GGally")
ggpairs(df[, 1:2], 
        upper = list(params = c(size = 10)), 
        lower = list(continuous = "smooth"))


I removed the text label and added significance indicators. Modifying colour and size is not that easy, though, since these are mapped earlier. I'm still thinking on it, but you get the idea and may move on with your further customizations.

Edit: I've updated the code, see my latest commit. It now maps the size of the label to the absolute value of the correlation. You can do something similar if you want a different colour, though I think that is probably not a very good idea.
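For readers who would rather not patch the namespace, here is a rough sketch of the same idea as a user-supplied panel, assuming that ggpairs accepts a function with the signature function(data, mapping, ...) and that the installed GGally provides eval_data_col (recent versions do, as far as I know). The names cor_label, cor_panel and max_size are hypothetical and not part of GGally; this is not the code from the commit mentioned above.

```r
library(ggplot2)
library(GGally)

# Anscombe columns matching the question's data (built-in dataset).
df <- anscombe[, c("y1", "x1", "y2", "x2", "y3", "x3")]

# Hypothetical helper: correlation estimate plus significance stars,
# using the same cutpoints as the pairs() panel in the question.
cor_label <- function(x, y) {
  ct <- cor.test(x, y)
  r <- unname(ct$estimate)
  stars <- as.character(symnum(ct$p.value, corr = FALSE, na = FALSE,
                               cutpoints = c(0, 0.001, 0.01, 0.05, 0.1, 1),
                               symbols = c("***", "**", "*", ".", " ")))
  list(r = r,
       label = paste0(formatC(r, format = "f", digits = 2), trimws(stars)))
}

# Hypothetical panel function with the (data, mapping, ...) signature
# assumed for user-supplied ggpairs panels.
cor_panel <- function(data, mapping, ..., max_size = 10) {
  x <- eval_data_col(data, mapping$x)
  y <- eval_data_col(data, mapping$y)
  cl <- cor_label(x, y)
  ggplot() +
    annotate("text", x = 0.5, y = 0.5, label = cl$label,
             size = max_size * max(abs(cl$r), 0.2),  # floor keeps small r legible
             colour = if (cl$r < 0) "red" else "blue") +
    xlim(0, 1) + ylim(0, 1) +
    theme_void()
}

p <- ggpairs(df,
             upper = list(continuous = cor_panel),
             lower = list(continuous = "smooth"))
p
```

Because the label size and colour are computed inside the panel rather than mapped through aes, there is no existing mapping to override. The maximum size can be changed with wrap(cor_panel, max_size = 14).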


tonytonov
  • Very interesting tonytonov. I will take a closer look tomorrow. If I understand correctly the plot I am looking for is not (currently) possible with ggpairs as-is? I also want to take a closer look at @Ista's [answer](http://stackoverflow.com/questions/21691302/how-to-produce-a-meaningful-draftsman-correlation-plot-for-discrete-values/21691950#21691950) to a related question to see if a facet-based approach might be less complicated. What do you think tonytonov? – Vincent Feb 13 '14 at 07:46
  • I'm pretty sure that variable size and colour is possible. More than that, it will take no more than 10 lines of code, probably less. The only thing that needs to be done is to override the existing mapping. I'm not yet sure how to accomplish this, but I'm interested to find out. Meanwhile, a faceted approach is surely worth a try: it may indeed be a more suitable path. – tonytonov Feb 13 '14 at 07:54
  • I asked a question [here](http://stackoverflow.com/questions/21748598/add-or-override-aes-in-the-existing-mapping-object), this is what we need to do the job. – tonytonov Feb 13 '14 at 08:18
  • you are a few steps ahead of me. Are you suggesting doing this in 'regular' ggplot or adapting mappings for ggpairs. A simple example I can build on would be great. – Vincent Feb 13 '14 at 16:28
  • I will be experimenting a bit, I'll let you know how it goes. – tonytonov Feb 13 '14 at 16:30
  • Check out the answer to the question I posted; it shows how to override aes. – tonytonov Feb 17 '14 at 08:32
  • How would you apply this mapping to ggpairs? – Vincent Feb 18 '14 at 17:53
  • I understand that in **GGally 1.0.1** the `wrap` function should be used? – Konrad Feb 16 '16 at 11:52
  • @Konrad Thanks for the suggestion. I'll take a look when I can, maybe the recent update will require some changes to the solution I proposed. – tonytonov Feb 16 '16 at 14:07
  • @tonytonov Works fine with wrap and I've tested that. I'm not clear here, however, how to change the exact correlation used. I am using in the ggpairs context like you show above. Any ideas? – boshek Feb 16 '16 at 22:11
  • @boshek Thanks, I edited my answer. Not sure I understand your question though. – tonytonov Feb 17 '16 at 08:18
  • @tonytonov Sorry for being obtuse. I meant what type of correlation. Spearman? Pearson? So how is it possible to do that when specifying the call like this `ggpairs(df[, 1:2], upper = list(params = c(size = 10)), lower = list(continuous = "smooth"))` – boshek Feb 17 '16 at 17:45