
The problem is that I have a data set where I want to plot n "y" variables against one "x" variable in ggplot2. I then want to make a multiplot for each of the k levels of a factor and save all k multiplots in one file.

For example, consider the mtcars data. I can generate a figure of y=hp vs. x=mpg and a separate figure of y=wt vs. x=mpg and put them together into one multiplot by:

library(dplyr)
library(ggplot2)
library(gridExtra)
library(cowplot)

a1 <- ggplot(mtcars,aes(mpg,hp))+geom_point()
b1 <- ggplot(mtcars,aes(mpg,wt))+geom_point()
p <- grid.arrange(a1,b1)

OK, now I want to create the same multiplot but for different levels of the factor "am". (Edit: I would like to have one multiplot for am=0 and one multiplot for am=1.) I found a solution for creating plots based on a factor and saving them in one file in the question "How subset a data frame by a factor and repeat a plot for each subset?"

I tried to modify the above code for my problem; the following is my attempt:

plots = mtcars %>%
group_by(am) %>%
do({a = a1 %+% .
b = b1 %+% .
plots = p %+% .})

I also tried:

plots = mtcars %>%
group_by(am) %>%
do({a1 = ggplot(.,aes(mpg,hp))+geom_point()
b1 = ggplot(.,aes(mpg,wt))+geom_point()
p = grid.arrange(a1,b1)})

In both situations I get the error

Error: Results are not data frames at positions: 1, 2

I understand that there is a data frame problem. But I don't understand why it is a problem in my code and not in the sample code. Any help is appreciated! Thanks in advance.

EDIT:

Following Tim's post below, gather can be used to achieve the end result that I am looking for.

library(tidyr)
dat1 <- mtcars %>% 
gather(key, value, hp, wt)

p <- ggplot(dat1,aes(mpg, value)) + 
geom_point() + 
facet_wrap(~ key, scales = "free_y") 

plots = dat1 %>%
group_by(am) %>%
do(
plots = p %+% .)

pdf()
plots$plots
dev.off()

However, this doesn't allow much in the way of customization for the individual plots for different variables. Say I wanted to add a line using geom_vline to the plot of hp vs. mpg but not to the plot of wt vs. mpg. I'm not sure you could do that with this method.
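One possible way to get a per-panel line within the faceted approach, as a rough sketch: give geom_vline its own small data frame that carries the facetting variable, so the layer is only drawn in the matching panel. The name vline_dat, the column xint, and the position 20 are made up for illustration and are not part of the original code.

```r
library(ggplot2)
library(tidyr)

dat1 <- gather(mtcars, key, value, hp, wt)

# Hypothetical reference line: the x position (20) is arbitrary.
# Because this data frame has a `key` column, the line is drawn only
# in the facet whose key matches ("hp").
vline_dat <- data.frame(key = "hp", xint = 20)

p <- ggplot(dat1, aes(mpg, value)) +
  geom_point() +
  geom_vline(data = vline_dat, aes(xintercept = xint), linetype = "dashed") +
  facet_wrap(~ key, scales = "free_y")
```

Since the line layer has its own data, it should be unaffected when the plot's main data is swapped out per level of am.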

Micky

1 Answer


Unless I'm misunderstanding your question, I think you are making this a little more complicated than it needs to be. You want a grid of two plots, hp ~ mpg and wt ~ mpg, that also shows the value of am.

My initial response is to use tidyr::gather to group hp and wt:

library(ggplot2)
library(tidyr)

mtcars %>% 
  gather(key, value, hp, wt)

Now, instead of the variables hp and wt, you have a variable key (which contains either 'hp' or 'wt') and a variable value, which contains the corresponding value of 'hp' or 'wt'.
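To see what the reshaping does, a quick check along these lines may help. The name long is only for illustration.

```r
library(tidyr)

long <- gather(mtcars, key, value, hp, wt)

# Each original row now appears twice: once for hp and once for wt.
nrow(mtcars)
nrow(long)
table(long$key)
```

The remaining columns (mpg, am, and so on) are repeated for both rows, which is what lets you facet on key later.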

After that, you build your initial plot (notice that I pipe the result of the first statement in as the data parameter to ggplot):

mtcars %>% 
  gather(key, value, hp, wt) %>% 
  ggplot(aes(mpg, value, color = as.factor(am))) + 
  geom_point()

In the aesthetics I have requested am (as a factor) to be used to distinguish color.

Now, you want a grid layout so facet_wrap() becomes your friend.

mtcars %>% 
  gather(key, value, hp, wt) %>% 
  ggplot(aes(mpg, value, color = as.factor(am))) + 
  geom_point() + 
  facet_wrap(~ key, nrow = 2, scales = "free_y")

With facet_wrap I'm asking ggplot2 to build a plot for each unique value of key which holds 'hp' and 'wt'. So I'll end up with two graphs on one plot. Since your original example was stacked, I use nrow = 2. And, since 'hp' and 'wt' are not similar in values you must use the scales = "free_y" parameter. This means that each graph will use its own y-axis to accurately display the data.

And lastly if you don't want "as.factor(am)" as your legend title (and who does?), use scale_color_discrete(). We use color because that is what we assigned the am variable to (rather than fill, size, shape, etc.) and then discrete because am is a discrete variable.

So your code ends up like this:

mtcars %>% 
  gather(key, value, hp, wt) %>% 
  ggplot(aes(mpg, value, color = as.factor(am))) + 
  geom_point() + 
  facet_wrap(~ key, nrow = 2, scales = "free_y") + 
  scale_color_discrete(name = "am")

And your plot ends up like this:

[Image: the resulting plot]

If I misunderstood your question I'll be happy to edit as necessary.

timtrice
  • A slight misunderstanding. I do not want all levels of "am" in one plot. In your example you have two plots in one multiplot. I am wanting 2 plots in one multiplot for am=0 and 2 plots in one multiplot for am=1. I would then like to save as pdf so that I have 2 plots on page 1 corresponding to am=0 and 2 plots on page 2 corresponding to am=1. – Micky Dec 08 '16 at 08:16
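For the layout described in this comment, a minimal sketch that avoids do() altogether might look like the following. Some background on the error in the question: do() with an unnamed expression expects a data frame back, while a named argument (as in plots = p %+% .) stores arbitrary objects in a list column. In addition, grid.arrange() draws immediately and returns a gtable rather than a ggplot, so p %+% . cannot swap its data. The helper make_page and the file name multiplots_by_am.pdf are assumptions for illustration.

```r
library(ggplot2)
library(gridExtra)

# Hypothetical helper: builds one stacked two-panel multiplot for a subset.
make_page <- function(d) {
  a <- ggplot(d, aes(mpg, hp)) + geom_point() +
    ggtitle(paste("am =", d$am[1]))
  b <- ggplot(d, aes(mpg, wt)) + geom_point()
  arrangeGrob(a, b, nrow = 2)  # builds the layout without drawing it
}

# One multiplot per level of am.
pages <- lapply(split(mtcars, mtcars$am), make_page)

# Draw each multiplot on its own page of a single PDF.
pdf("multiplots_by_am.pdf")
for (pg in pages) {
  grid::grid.newpage()
  grid::grid.draw(pg)
}
dev.off()
```

Because each panel is an ordinary ggplot inside make_page, extra layers such as geom_vline can be added to the hp plot without touching the wt plot.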