
I have a data frame created the following way.

library(ggplot2)

x <- data.frame(letters[1:10],abs(rnorm(10)),abs(rnorm(10)),type="x")
y <- data.frame(letters[1:10],abs(rnorm(10)),abs(rnorm(10)),type="y")
 # in reality the number of rows could be larger than 10 for each of x and y

all <- rbind(x,y)
colnames(all) <- c("name","val1","val2","type")

What I want to do is to create a faceted ggplot that looks roughly like this:

[Image: hand-drawn sketch of the desired 2x2 faceted plot]

Hence each facet above is the correlation plot of the following:

# Top left facet
subset(all,type=="x")$val1 
subset(all,type=="y")$val1

# Top right facet
subset(all,type=="x")$val1 
subset(all,type=="y")$val2

# ...etc..

But I'm stuck with the following code:

p <- ggplot(all, aes(val1, val2))+ geom_smooth(method = "lm")  + geom_point() +
facet_grid(type ~ ) 
# Calculate correlation for each group
cors <- ddply(all, c(type ~ ), summarise, cor = round(cor(val1, val2), 2))
p + geom_text(data=cors, aes(label=paste("r=", cor, sep="")), x=0.5, y=0.5)

What's the right way to do it?

mand3rd
neversaint
  • What does type have to do with your desired plot image? There is a GGally package with a ggpairs function that may be useful. As it stands I am struggling to see the connection between your example data and the desired plot. – mnel Mar 18 '13 at 06:17
  • It is particularly confusing that you refer to mpg and wt, which are not in your data – alexwhan Mar 18 '13 at 06:31
  • @neversaint just out of curiosity: how did you draw http://i.stack.imgur.com/r3wBT.jpg ? Did you use a software application? Is it https://www.fiftythree.com/paper ? – Alessandro Jacopson Nov 19 '14 at 15:18

3 Answers


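Your code had two errors: the facet_grid() formula was incomplete (type ~ ), and ddply() needs the grouping variable written as .(type). This works for me: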

library(ggplot2)
library(plyr)  # provides ddply() and .()

p <- ggplot(all, aes(val1, val2)) + geom_smooth(method = "lm") + geom_point() +
  facet_grid(~type)
# Calculate correlation for each group
cors <- ddply(all, .(type), summarise, cor = round(cor(val1, val2), 2))
p + geom_text(data=cors, aes(label=paste("r=", cor, sep="")), x=1, y=-0.25)


Edit, following the OP's comment and edit: the idea is to re-create the data with all four combinations and then facet.

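# The original types x and y correspond to the diagonal facets (x, x) and (y, y).
# grp1 is the type that supplies val1, grp2 the type that supplies val2;
# each=10 assumes 10 rows per type.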
dat <- data.frame(val1 = c(rep(all$val1[all$type == "x"], 2), 
                           rep(all$val1[all$type == "y"], 2)), 
                  val2 = rep(all$val2, 2), 
                  grp1 = rep(c("x", "x", "y", "y"), each=10), 
                  grp2 = rep(c("x", "y", "x", "y"), each=10))

p <- ggplot(dat, aes(val1, val2)) + geom_point() + geom_smooth(method = "lm") + 
     facet_grid(grp1 ~ grp2)
cors <- ddply(dat, .(grp1, grp2), summarise, cor = round(cor(val1, val2), 2))
p + geom_text(data=cors, aes(label=paste("r=", cor, sep="")), x=1, y=-0.25)
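
The same four-way reshaping can be written without hard-coding 10 rows per type. What follows is only a sketch under stated assumptions: the data are regenerated at random with a hypothetical row count n, both types have the same number of rows in matching order, and the correlations are computed in base R instead of with ddply(). The resulting dat and cors have the same columns as above, so they can be passed to the same ggplot() and geom_text() calls.

```r
set.seed(1)  # only to make the sketch reproducible
n <- 15      # hypothetical number of rows per type; any value works

x <- data.frame(name = paste0("id", seq_len(n)),
                val1 = abs(rnorm(n)), val2 = abs(rnorm(n)), type = "x")
y <- data.frame(name = paste0("id", seq_len(n)),
                val1 = abs(rnorm(n)), val2 = abs(rnorm(n)), type = "y")
all <- rbind(x, y)

# One row per facet: grp1 is the type supplying val1, grp2 the type supplying val2
combos <- expand.grid(grp1 = c("x", "y"), grp2 = c("x", "y"),
                      stringsAsFactors = FALSE)

# Build one block of rows per combination and stack the blocks
dat <- do.call(rbind, Map(function(g1, g2) {
  data.frame(val1 = all$val1[all$type == g1],
             val2 = all$val2[all$type == g2],
             grp1 = g1, grp2 = g2)
}, combos$grp1, combos$grp2))
rownames(dat) <- NULL

# Correlation per facet, computed in base R
cors <- do.call(rbind, lapply(split(dat, list(dat$grp1, dat$grp2)), function(d) {
  data.frame(grp1 = d$grp1[1], grp2 = d$grp2[1],
             cor = round(cor(d$val1, d$val2), 2))
}))
```

Because the block sizes come from the data itself, nothing needs to change when the number of rows per type grows.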


Arun
alexwhan
  • Not quite. It should create a 2x2 grid. See the *blue* font in the drawing for the combinations. – neversaint Mar 18 '13 at 05:56
  • So what do you want on the diagonal? Where it says 'need not be plotted' - but it is plotted in your diagram? – alexwhan Mar 18 '13 at 05:58
  • I put that in the diagram just to show, in each grid cell, which combination of values is used for the correlation. – neversaint Mar 18 '13 at 05:59
  • @alexwhan, you'll have to reshape the data to get all 4 combinations of x and y, if I understand it right, and have a separate group for each of them, then facet_wrap with ncol=2 perhaps? – Arun Mar 18 '13 at 07:20
  • Yes, I understand what's being asked now, won't be at my computer for a while - @Arun, do you feel like doing it? – alexwhan Mar 18 '13 at 07:23
  • @alexwhan, you had the answer, the question was unclear at that moment. It'd be unfair to take the answer away from you :). – Arun Mar 18 '13 at 12:32

Since your data is not in the appropriate format, some reshaping is necessary before it can be plotted.

Firstly, reshape the data to the long format:

library(reshape2)
allM <- melt(all[-1], id.vars = "type")

Split the values along type and val1 vs. val2:

allList <- split(allM$value, interaction(allM$type, allM$variable))

Create a list of all combinations:

allComb <- unlist(lapply(c(1, 3), 
                  function(x)
                    lapply(c(2 ,4), 
                           function(y) 
                             do.call(cbind, allList[c(x, y)]))), 
           recursive = FALSE)

Create a new dataset:

allNew <- do.call(rbind, 
                  lapply(allComb, function(x) {
                                    tmp <- as.data.frame(x)
                                    tmp <- (within(tmp, {xval <- names(tmp)[1]; 
                                                         yval <- names(tmp)[2]}))
                                    names(tmp)[1:2] <- c("x", "y")
                                    tmp}))

Plot:

library(ggplot2)
p <- ggplot(allNew, aes(x = x, y = y)) + 
       geom_smooth(method = "lm")  + 
       geom_point() +
       facet_grid(yval ~ xval) 
# Calculate correlation for each group
library(plyr)
cors <- ddply(allNew, .(yval, xval), summarise, cor = round(cor(x, y), 2))
p + geom_text(data=cors, aes(label=paste("r=", cor, sep="")), x=0.5, y=0.5)
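
The fixed label position x=0.5, y=0.5 only suits data in roughly that range. One possible alternative, sketched here on made-up toy data (the toy data frame and its values are assumptions, not the asker's data), is to pin the label to a panel corner with x = Inf, y = Inf and nudge it inward with hjust and vjust:

```r
library(ggplot2)

set.seed(42)  # reproducible toy data, not the data from the question
toy <- data.frame(x = abs(rnorm(40)), y = abs(rnorm(40)),
                  xval = rep(c("x.val1", "x.val2"), each = 20),
                  yval = rep(c("y.val1", "y.val2"), times = 20))

# Per-facet correlation in base R
groups <- split(toy, list(toy$xval, toy$yval), drop = TRUE)
cors <- do.call(rbind, lapply(groups, function(d) {
  data.frame(xval = d$xval[1], yval = d$yval[1],
             cor = round(cor(d$x, d$y), 2))
}))

p <- ggplot(toy, aes(x, y)) +
  geom_point() +
  geom_smooth(method = "lm") +
  facet_grid(yval ~ xval) +
  # Inf places the label at the top-right panel edge whatever the data range;
  # hjust/vjust move it slightly inside the panel
  geom_text(data = cors, aes(label = paste0("r=", cor)),
            x = Inf, y = Inf, hjust = 1.1, vjust = 1.5,
            inherit.aes = FALSE)
p
```

The label then stays in the top-right corner of every facet even if the axis limits change.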


Sven Hohenstein

The ggpubr package, which is now available, addresses exactly this issue: its stat_cor() function adds the correlation coefficient and p-value to each facet.

library(tidyverse)
library(ggpubr)
ggplot(all, aes(val1, val2))+ 
  geom_smooth(method = "lm")  + 
  geom_point() +  
  facet_grid(~type) +
  stat_cor()
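
To see the numbers behind such labels, the per-type correlation and p-value can also be computed directly in base R. This is a sketch on regenerated random data, so the values will not match any plot shown here; it assumes the Pearson method, which, as far as I know, is also what stat_cor() uses by default.

```r
set.seed(123)  # only to make the sketch reproducible
x <- data.frame(name = letters[1:10], val1 = abs(rnorm(10)),
                val2 = abs(rnorm(10)), type = "x")
y <- data.frame(name = letters[1:10], val1 = abs(rnorm(10)),
                val2 = abs(rnorm(10)), type = "y")
all <- rbind(x, y)

# One Pearson correlation test per type, i.e. per facet of facet_grid(~type)
stats <- do.call(rbind, lapply(split(all, all$type), function(d) {
  ct <- cor.test(d$val1, d$val2, method = "pearson")
  data.frame(type = d$type[1], r = unname(ct$estimate), p = ct$p.value)
}))
stats
```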


Roman