How to create faceted linear regression plot using GGPLOT

Question

I have a data frame created the following way.

library(ggplot2)

x <- data.frame(letters[1:10],abs(rnorm(10)),abs(rnorm(10)),type="x")
y <- data.frame(letters[1:10],abs(rnorm(10)),abs(rnorm(10)),type="y")
 # in reality the number of rows could be larger than 10 for each of x and y

all <- rbind(x,y)
colnames(all) <- c("name","val1","val2","type")

What I want to do is to create a faceted ggplot that looks roughly like this:

[image: hand-drawn sketch of the desired 2x2 grid of facets]

Each facet above is the correlation plot of the following pair of vectors:

# Top left facet
subset(all,type=="x")$val1 
subset(all,type=="y")$val1

# Top right facet
subset(all,type=="x")$val1 
subset(all,type=="y")$val2

# ...etc..

But I'm stuck with the following code:

p <- ggplot(all, aes(val1, val2))+ geom_smooth(method = "lm")  + geom_point() +
facet_grid(type ~ ) 
# Calculate correlation for each group
cors <- ddply(all, c(type ~ ), summarise, cor = round(cor(val1, val2), 2))
p + geom_text(data=cors, aes(label=paste("r=", cor, sep="")), x=0.5, y=0.5)

What's the right way to do it?

What does type have to do with your desired plot image? There is a GGally package with a ggpairs function that may be useful. As it stands, I am struggling to see the connection between your example data and the desired plot. – mnel, Mar 18 '13 at 06:17
It is particularly confusing that you refer to mpg and wt, which are not in your data. – alexwhan, Mar 18 '13 at 06:31
@neversaint just out of curiosity: how did you draw http://i.stack.imgur.com/r3wBT.jpg ? Did you use a software application? Is it https://www.fiftythree.com/paper ? – Alessandro Jacopson, Nov 19 '14 at 15:18

score 8 · Accepted Answer · edited Mar 18 '13 at 07:50

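Some of your code was incorrect: the facet_grid() formula and the ddply() grouping argument were both left incomplete (type ~ ), and ddply() needs the plyr package loaded. This works for me:

library(plyr)  # provides ddply() and the .() quoting helper

p <- ggplot(all, aes(val1, val2))+ geom_smooth(method = "lm")  + geom_point() +
  facet_grid(~type) 
# Calculate correlation for each group
cors <- ddply(all, .(type), summarise, cor = round(cor(val1, val2), 2))
# x and y are fixed label positions in data units, not aesthetics
p + geom_text(data=cors, aes(label=paste("r=", cor, sep="")), x=1, y=-0.25)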

[image: plot output]
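The fixed label position above (x=1, y=-0.25) depends on the range of the random data. A minimal sketch of one alternative, assuming you are happy to pin the label to the top-left corner of each panel: ggplot2 treats -Inf and Inf as the panel edges, and hjust/vjust nudge the text inward. The seed and the base R correlation step are additions for this sketch, not part of the answer.

```r
library(ggplot2)

# Same shape as the question's data; the seed is an addition for reproducibility
set.seed(1)
all <- data.frame(name = rep(letters[1:10], 2),
                  val1 = abs(rnorm(20)),
                  val2 = abs(rnorm(20)),
                  type = rep(c("x", "y"), each = 10))

# Per-type correlation, computed with base R instead of plyr
cors <- do.call(rbind, lapply(split(all, all$type), function(d) {
  data.frame(type = d$type[1], cor = round(cor(d$val1, d$val2), 2))
}))

p <- ggplot(all, aes(val1, val2)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ x) +
  facet_grid(~type) +
  # -Inf / Inf anchor the label at the panel's top-left corner,
  # so it stays visible whatever the data range is
  geom_text(data = cors, aes(label = paste0("r=", cor)),
            x = -Inf, y = Inf, hjust = -0.2, vjust = 1.5,
            inherit.aes = FALSE)
p
```

The hjust and vjust values are only a starting point; adjust them to taste.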

Edit, following the OP's comment and edit: the idea is to re-create the data with all four combinations of type for val1 and val2, and then facet on both grouping columns.

# grp1 is the type that val1 is taken from; grp2 is the type that val2 is taken from
n <- sum(all$type == "x")  # rows per type; assumes x and y have the same number of rows
dat <- data.frame(val1 = c(rep(all$val1[all$type == "x"], 2), 
                           rep(all$val1[all$type == "y"], 2)), 
                  val2 = rep(all$val2, 2), 
                  grp1 = rep(c("x", "x", "y", "y"), each=n), 
                  grp2 = rep(c("x", "y", "x", "y"), each=n))

p <- ggplot(dat, aes(val1, val2)) + geom_point() + geom_smooth(method = "lm") + 
     facet_grid(grp1 ~ grp2)
cors <- ddply(dat, .(grp1, grp2), summarise, cor = round(cor(val1, val2), 2))
p + geom_text(data=cors, aes(label=paste("r=", cor, sep="")), x=1, y=-0.25)

[image: plot output]
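The construction of dat above relies on the rows of all being ordered x first, then y. A rough sketch of the same idea that looks rows up by type instead, assuming both types have the same number of rows (pair_types is a hypothetical helper introduced here, and the seed is added only for reproducibility):

```r
set.seed(1)
all <- data.frame(name = rep(letters[1:10], 2),
                  val1 = abs(rnorm(20)),
                  val2 = abs(rnorm(20)),
                  type = rep(c("x", "y"), each = 10))

# Hypothetical helper: pair val1 from type g1 with val2 from type g2.
# Assumes both types have the same number of rows.
pair_types <- function(d, g1, g2) {
  data.frame(val1 = d$val1[d$type == g1],
             val2 = d$val2[d$type == g2],
             grp1 = g1, grp2 = g2)
}

# All four (grp1, grp2) combinations
types  <- unique(all$type)
combos <- expand.grid(grp1 = types, grp2 = types, stringsAsFactors = FALSE)

dat <- do.call(rbind, lapply(seq_len(nrow(combos)), function(i) {
  pair_types(all, combos$grp1[i], combos$grp2[i])
}))

# One correlation per facet, in the same layout ddply() would return
cors <- do.call(rbind, lapply(split(dat, list(dat$grp1, dat$grp2)), function(d) {
  data.frame(grp1 = d$grp1[1], grp2 = d$grp2[1],
             cor = round(cor(d$val1, d$val2), 2))
}))
```

Both dat and cors can then be passed to the ggplot() and geom_text() calls above unchanged.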

answered Mar 18 '13 at 05:53 by alexwhan · edited Mar 18 '13 at 07:50 by Arun

Not quite. It should create a 2x2 grid. See the *blue* font in the drawing for the combinations. – neversaint Mar 18 '13 at 05:56
So what do you want on the diagonal? Where it says 'need not be plotted' - but it is plotted in your diagram? – alexwhan Mar 18 '13 at 05:58
I put that in the diagram just to show, in each grid cell, which combination of values is used for the correlation. – neversaint Mar 18 '13 at 05:59
@alexwhan, if I understand it right, you'll have to reshape the data to get all 4 combinations of x and y, with a separate group for each of them, and then use facet_wrap with ncol=2 perhaps? – Arun Mar 18 '13 at 07:20
Yes, I understand what's being asked now, won't be at my computer for a while - @Arun, do you feel like doing it? – alexwhan Mar 18 '13 at 07:23
@alexwhan, you had the answer, the question was unclear at that moment. It'd be unfair to take the answer away from you :). – Arun Mar 18 '13 at 12:32

score 4 · Answer 2 · answered Mar 18 '13 at 07:33

Since your data is not in the appropriate format, some reshaping is necessary before it can be plotted.

Firstly, reshape the data to the long format:

library(reshape2)
allM <- melt(all[-1], id.vars = "type")

Split the values along type and val1 vs. val2:

allList <- split(allM$value, interaction(allM$type, allM$variable))

Create a list of all combinations:

allComb <- unlist(lapply(c(1, 3), 
                  function(x)
                    lapply(c(2 ,4), 
                           function(y) 
                             do.call(cbind, allList[c(x, y)]))), 
           recursive = FALSE)
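The indices c(1, 3) and c(2, 4) above depend on the order in which interaction() arranges the groups. A minimal sketch that makes this order visible and selects the pairs by name instead; the base R stand-in for melt() and the seed are assumptions added so the snippet runs without reshape2:

```r
set.seed(1)
all <- data.frame(name = rep(letters[1:10], 2),
                  val1 = abs(rnorm(20)),
                  val2 = abs(rnorm(20)),
                  type = rep(c("x", "y"), each = 10))

# Base R stand-in for melt(): stack val1 and val2 into one long column
allM <- data.frame(type     = rep(all$type, 2),
                   variable = rep(c("val1", "val2"), each = nrow(all)),
                   value    = c(all$val1, all$val2))

allList <- split(allM$value, interaction(allM$type, allM$variable))
names(allList)
# "x.val1" "y.val1" "x.val2" "y.val2"

# Select each pair by name rather than by position
pairs <- list(c("x.val1", "y.val1"), c("x.val1", "y.val2"),
              c("x.val2", "y.val1"), c("x.val2", "y.val2"))
allComb <- lapply(pairs, function(nm) do.call(cbind, allList[nm]))
```

Each element of allComb is a two-column matrix whose column names record which type and variable it came from, which is what the next step relies on.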

Create a new dataset:

allNew <- do.call(rbind, 
                  lapply(allComb, function(x) {
                                    tmp <- as.data.frame(x)
                                    tmp <- (within(tmp, {xval <- names(tmp)[1]; 
                                                         yval <- names(tmp)[2]}))
                                    names(tmp)[1:2] <- c("x", "y")
                                    tmp}))

Plot:

library(ggplot2)
p <- ggplot(allNew, aes(x = x, y = y)) + 
       geom_smooth(method = "lm")  + 
       geom_point() +
       facet_grid(yval ~ xval) 
# Calculate correlation for each group
library(plyr)
cors <- ddply(allNew, .(yval, xval), summarise, cor = round(cor(x, y), 2))
p + geom_text(data=cors, aes(label=paste("r=", cor, sep="")), x=0.5, y=0.5)

[image: plot output]

score 1 · Answer 3 · answered Jul 05 '19 at 09:09

The ggpubr package is now available and addresses exactly this issue: its stat_cor() function adds the correlation coefficient (with its p-value) to each facet.

library(tidyverse)
library(ggpubr)
ggplot(all, aes(val1, val2))+ 
  geom_smooth(method = "lm")  + 
  geom_point() +  
  facet_grid(~type) +
  stat_cor()
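If only the coefficient is wanted, as in the question's "r=" label, a sketch along these lines should work. It assumes a ggplot2 version that provides after_stat() (3.3.0 or later) and uses the r.label variable that stat_cor() computes; the seed is added for reproducibility.

```r
library(ggplot2)
library(ggpubr)

set.seed(1)
all <- data.frame(name = rep(letters[1:10], 2),
                  val1 = abs(rnorm(20)),
                  val2 = abs(rnorm(20)),
                  type = rep(c("x", "y"), each = 10))

p <- ggplot(all, aes(val1, val2)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ x) +
  facet_grid(~type) +
  # r.label holds the coefficient only; the default label also shows the p-value
  stat_cor(aes(label = after_stat(r.label)), method = "pearson")
p
```

Because stat_cor() computes the statistic per panel, no separate cors data frame is needed.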
