I'm using a for loop to assign ggplots to a list, which is then passed to plot_grid() (package cowplot). plot_grid places multiple ggplots side by side in a single figure. This works fine manually, but when I use a for loop, the last plot generated is repeated in each subframe of the figure (shown below). In other words, all the subframes show the same ggplot.

Here is a toy example:

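library(ggplot2)  # provides ggplot(), aes(), geom_point(); load it explicitly
library(cowplot)  # provides plot_grid()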

dfrm <- data.frame(A=1:10, B=10:1)

v <- c("A","B")
dfmsize <- nrow(dfrm)
myplots <- vector("list",2)

count = 1
for(i in v){
    myplots[[count]] <- ggplot(dfrm, aes(x=1:dfmsize, y=dfrm[,i])) + geom_point() + labs(y=i)
    count = count +1
}
plot_grid(plotlist=myplots)

Expected Figure:

[image not preserved]

Figure from for loop:

[image not preserved]

I tried converting the list elements to grobs, as described in this question, like this:

mygrobs <- lapply(myplots, ggplotGrob)
plot_grid(plotlist=mygrobs)

But I got the same result.

I think the problem lies in the loop assignment, not plot_grid(), but I can't see what I'm doing wrong.

  • [This answer](http://stackoverflow.com/a/26246791/2461552) goes through some of the nitty gritty of ggplot2's lazy evaluation in detail. – aosmith Sep 30 '16 at 21:10

4 Answers

The answers so far are very close, but unsatisfactory in my opinion. The problem is the following - after your for loop:

myplots[[1]]$mapping
#* x -> 1:dfmsize
#* y -> dfrm[, i]
myplots[[1]]$plot_env
#<environment: R_GlobalEnv>

myplots[[2]]$mapping
#* x -> 1:dfmsize
#* y -> dfrm[, i]
myplots[[2]]$plot_env
#<environment: R_GlobalEnv>

i
#[1] "B"

As the other answers mention, ggplot doesn't evaluate those expressions until the plot is drawn. Both plots store the same expression, dfrm[, i], and both look i up in the global environment, where its value after the loop is "B". So both plots end up showing column B.
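The effect can be reproduced without ggplot2 at all. The following is only a sketch in base R: the list `stored` is a hypothetical stand-in for the plot list, and `quote()` plays the role of `aes()` by keeping the expression unevaluated.

```r
dfrm <- data.frame(A = 1:10, B = 10:1)
v <- c("A", "B")

# Hypothetical stand-in for the plot list: each entry keeps an
# unevaluated expression, much as aes() keeps its arguments.
stored <- vector("list", length(v))
for (k in seq_along(v)) {
  i <- v[k]
  stored[[k]] <- quote(dfrm[, i])  # captured here, not evaluated
}

# The loop has finished, so i is "B" whenever either entry is evaluated.
first  <- eval(stored[[1]])
second <- eval(stored[[2]])
identical(first, second)  # TRUE: both are column B
```

Both entries hold the same expression and share the same `i`, which is why every panel shows the last column.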

There are several ways of avoiding this issue, the simplest of which in fact simplifies your expressions:

myplots = lapply(v, function(col)
            ggplot(dfrm, aes(x=1:dfmsize, y=dfrm[,col])) + geom_point() + labs(y=col))

This works because each call to the function passed to lapply gets its own environment, with its own value of col:

myplots[[1]]$mapping
#* x -> 1:dfmsize
#* y -> dfrm[, col]
myplots[[1]]$plot_env
#<environment: 0x000000000bc27b58>

myplots[[2]]$mapping
#* x -> 1:dfmsize
#* y -> dfrm[, col]
myplots[[2]]$plot_env
#<environment: 0x000000000af2ef40>

eval(quote(dfrm[, col]), envir = myplots[[1]]$plot_env)
#[1]  1  2  3  4  5  6  7  8  9 10
eval(quote(dfrm[, col]), envir = myplots[[2]]$plot_env)
#[1] 10  9  8  7  6  5  4  3  2  1

So even though the expressions are the same, the results are different.

And in case you're wondering what exactly is stored/copied to the environment of lapply - unsurprisingly it's just the column name:

ls(myplots[[1]]$plot_env)
#[1] "col"
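If you would rather keep the for loop, one way to get a separate environment per iteration is to wrap the plot construction in `local()`. This is a sketch under that assumption, not the only option; the loop index `k` is a name introduced here for illustration.

```r
library(ggplot2)

dfrm <- data.frame(A = 1:10, B = 10:1)
v <- c("A", "B")
dfmsize <- nrow(dfrm)
myplots <- vector("list", length(v))

for (k in seq_along(v)) {
  # local() evaluates its body in a fresh environment, so each plot
  # sees its own private copy of i.
  myplots[[k]] <- local({
    i <- v[k]
    ggplot(dfrm, aes(x = 1:dfmsize, y = dfrm[, i])) +
      geom_point() +
      labs(y = i)
  })
}
# myplots can then be passed to cowplot::plot_grid(plotlist = myplots)
```

The expression stored in each plot is still `dfrm[, i]`, but each plot now resolves `i` in its own environment, just as with lapply.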
eddi
  • I marked eddi's as the best answer because, by showing the mapping and then `plot_env` for each iteration of the loop, it makes really clear what's going on. @jrandall: I didn't realize aes performs non-standard evaluation, which, as you mention, is why it's better to use `aes_q`. Kudos to others for mentioning `lapply` as a substitute for looping. – someguyinafloppyhat Sep 30 '16 at 22:47
  • @eddi I am trying to convince `ggplot` to respect the variable values at the time of plot generation during a `for` loop when I finally examine all plots stored in the plot list. Would you have any advice on [this](https://stackoverflow.com/questions/62423707/ggplots-stored-in-plot-list-to-respect-variable-values-at-time-of-plot-generatio)? – mavericks Jun 17 '20 at 07:40

I believe the problem here is that aes uses non-standard evaluation, which delays evaluating i until the plot is actually drawn. By then, i holds its last value ("B" in the toy example), so the y aesthetic mapping of every plot refers to that last value. The labs call, by contrast, uses standard evaluation, so the labels correctly reflect the value of i in each iteration of the loop.

This can be fixed by simply using the standard evaluation version of the mapping function, aes_q:

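library(ggplot2)  # provides ggplot(), aes_q(), geom_point(); load it explicitly
library(cowplot)  # provides plot_grid()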

dfrm <- data.frame(A=1:10, B=10:1)

v <- c("A","B")
dfmsize <- nrow(dfrm)
myplots <- vector("list",2)

count = 1
for(i in v){
    myplots[[count]] <- ggplot(dfrm, aes_q(x=1:dfmsize, y=dfrm[,i])) + geom_point() + labs(y=i)
    count = count +1
}
plot_grid(plotlist=myplots)
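In recent ggplot2 releases (3.0.0 and later, as far as I know) `aes_q` is deprecated, and `aes()` supports the `!!` operator for the same purpose: substituting a value into the mapping when the plot is created. A minimal sketch, assuming an `index` column added to the data for the x axis and a loop index `k`, both introduced here for illustration:

```r
library(ggplot2)

dfrm <- data.frame(A = 1:10, B = 10:1)
dfrm$index <- seq_len(nrow(dfrm))  # assumed helper column for the x axis
v <- c("A", "B")
myplots <- vector("list", length(v))

for (k in seq_along(v)) {
  i <- v[k]
  # rlang::sym() turns the string into a column name, and !! inserts it
  # into the mapping immediately, so later changes to i have no effect.
  myplots[[k]] <- ggplot(dfrm, aes(x = index, y = !!rlang::sym(i))) +
    geom_point() +
    labs(y = i)
}
# myplots can then be passed to cowplot::plot_grid(plotlist = myplots)
```

After the loop, the first plot's mapping reads `y -> A` and the second reads `y -> B`, rather than both referring to `i`.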
jrandall
  • I like that you explicitly mention NSE. It can also be verified by actually printing the plot inside the loop before assigning it to the list, which actually gives the correct output (unlike printing it after the loop was run). – jakub Sep 30 '16 at 21:28

There is a nice explanation of what happens with ggplot2's lazy evaluation and for loops in [this answer](https://stackoverflow.com/a/26246791/2461552).

I usually switch to aes_string or aes_ for situations like this so I can use variables as strings in ggplot2.

I find lapply loops easier than a for loop in your case as initializing the list and using the counter can be avoided.

First, I add the x variable to the dataset.

dfrm$index = 1:nrow(dfrm)

Now, the lapply loop, looping through the columns in v.

myplots = lapply(v, function(x) {
    ggplot(dfrm, aes_string(x = "index", y = x)) + 
        geom_point() +
        labs(y = x)
})

plot_grid(plotlist = myplots)
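Note that `aes_string` and `aes_` are deprecated in recent ggplot2 releases. As far as I know, the documented replacement for string column names is the `.data` pronoun inside a regular `aes()` call. Here is a sketch of the same lapply loop written that way; the argument name `col` is my own choice. Because `.data[[col]]` is still evaluated lazily, the function environment created by lapply is what keeps each plot's `col` distinct, so the same line in a plain for loop would show the original problem.

```r
library(ggplot2)

dfrm <- data.frame(A = 1:10, B = 10:1)
dfrm$index <- seq_len(nrow(dfrm))
v <- c("A", "B")

myplots <- lapply(v, function(col) {
  # .data[[col]] looks up the column named by the string in col;
  # col lives in this function call's own environment.
  ggplot(dfrm, aes(x = index, y = .data[[col]])) +
    geom_point() +
    labs(y = col)
})
# myplots can then be passed to cowplot::plot_grid(plotlist = myplots)
```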
aosmith

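I think the trouble is that your x and y aesthetics refer to objects outside the plot's data (1:dfmsize and dfrm[,i]) rather than to columns of it, so they depend on whatever i is when the plot is drawn. If you change the for loop slightly so that its first line builds a new sub data.frame, and map that data.frame's columns instead, it works just fine, because the data itself is stored in the plot when ggplot() is called.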

myplots <- list()
count = 1

for(i in v){
    df <- data.frame(x = 1:dfmsize, y = dfrm[,i])
    myplots[[count]] <- ggplot(df, aes(x=x, y=y)) + geom_point() + labs(y=i)
    count = count + 1
}
plot_grid(plotlist=myplots)

Nate