ggplot2 plots / results are different within and outside of loop [Bug?]

Question

My simple case:

Plotting graphs within the loop gives different results than plotting them directly after the loop.

library(ggplot2)

# Initialize
Input <- list(c(3,3,3,3),c(1,1,1,1))
  y <- c()
  x <- c()
  plotlist <- c()
  Answer <- c()

  # create helper grid
  x.grid = c(1:4)
  y.grid = c(1:4)
  helpergrid <- expand.grid(xgrid=x.grid, ygrid=y.grid )

  #- Loop Lists -
  for (m in c(1,2))
  { 

    # # Loop within each list
    # for(j in 1:4)
    # {
    #   y[j] <- Input[[m]][j]
    #   x[j] <- j
    # }

    y[1] <- Input[[m]][1]
    x[1] <- 1
    y[2] <- Input[[m]][2]
    x[2] <- 2
    y[3] <- Input[[m]][3]
    x[3] <- 3
    y[4] <- Input[[m]][4]
    x[4] <- 4

    Points <- data.frame(x, y)

    # Example Plot
    plot = ggplot() + labs(title = paste("Loop m = ",m)) + labs(subtitle = paste("y-values = ",Points$y)) +
      geom_tile(data = helpergrid, aes(x=xgrid, y=ygrid, fill=1), colour="grey20") +
      geom_point(data = Points, aes(x=Points$x, y=Points$y), stroke=3, size=5, shape=1, color="white") + theme_minimal()

    # Plot to plotlist
    plotlist[[m]] <- plot

    # --- Plot plotlist within loop ---
    plot(plotlist[[m]])
   }

  # --- Plot plotlist outside of loop ---
 plot(plotlist[[1]])
 plot(plotlist[[2]])

Here is an image of the results: Plot Results

As aaumai points out, there was a nested loop that might cause ggplot to use static values. However, the resulting plot explicitly shows the correct y-value (y=3) in its subtitle, while the geom_point layer uses the wrong values (y=1)...

It makes absolutely (!) no sense to me. I am relatively new to R and have been trying to debug this for hours now, so I hope someone can help me with this!

EDIT: I manually removed the nested loop and updated the example code, but the problem still persists :(

In R, normally you rarely need nested loops. Take a look at either `lapply` or `map` functions in these examples https://stackoverflow.com/a/55524126/786542 | https://stackoverflow.com/a/50930640/786542 | https://stackoverflow.com/a/52045613/786542 — Tung, Jul 15 '19 at 03:21
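As a rough illustration of the `lapply` suggestion in the comment above, the loop could be replaced by a function that builds one plot per list element. This is only a sketch: `make_plot` is a hypothetical helper name, `seq_along` stands in for the manual `x[1] <- 1` assignments, and `aes` uses bare column names rather than `Points$x`.

```r
library(ggplot2)

Input <- list(c(3, 3, 3, 3), c(1, 1, 1, 1))
helpergrid <- expand.grid(xgrid = 1:4, ygrid = 1:4)

# make_plot() is a hypothetical helper, not part of ggplot2.
# Each call builds its own Points data frame and passes it to the layer.
make_plot <- function(m) {
  Points <- data.frame(x = seq_along(Input[[m]]), y = Input[[m]])
  ggplot() +
    labs(title = paste("Loop m =", m),
         subtitle = paste("y-values =", paste(Points$y, collapse = ", "))) +
    geom_tile(data = helpergrid, aes(x = xgrid, y = ygrid, fill = 1), colour = "grey20") +
    # Bare column names: x and y are looked up in the layer's own data
    geom_point(data = Points, aes(x = x, y = y),
               stroke = 3, size = 5, shape = 1, color = "white") +
    theme_minimal()
}

# One plot per element of Input, collected in a list
plotlist <- lapply(seq_along(Input), make_plot)
```

Calling `print(plotlist[[1]])` and `print(plotlist[[2]])` afterwards should then draw each plot with the points that belong to it.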

score 0 · Answer 1 · answered Jul 14 '19 at 16:40

If I'm not mistaken, this is because you have a loop within the loop.

The plot within the loop returns plots for changing y values in the Points data (from 1 to 4), whereas the plot outside is only plotting the static values.

answered Jul 14 '19 at 16:40

aaumai


Hm... but nested loops are possible I guess, maybe not for ggplot2? Is there a workaround, like reassigning or caching the results before exiting the loop? – bambamfox Jul 14 '19 at 16:42
Maybe it would help to find the problem if you explained what you're trying to achieve with these plots? It's indeed weird that it places the points at y = 1 when the label shows three, given that both use Points$y – aaumai Jul 14 '19 at 16:53

alan ocallaghan · Accepted Answer · 2019-07-15T10:46:26.147

The problem arises from your use of Points$x and Points$y within aes. The "tl;dr" is that you should basically never use $, [ or [[ within aes. See the answer here from baptiste.


library(ggplot2)
# Initialize
Input <- list(c(3,3,3,3),c(1,1,1,1))
y <- c()
x <- c()
plotlist <- list()  # a list, since it will hold ggplot objects
Answer <- c()

# create helper grid
x.grid = c(1:4)
y.grid = c(1:4)
helpergrid <- expand.grid(xgrid=x.grid, ygrid=y.grid )

#- Loop Lists -
for (m in c(1,2)) { 


  y[1] <- Input[[m]][1]
  x[1] <- 1
  y[2] <- Input[[m]][2]
  x[2] <- 2
  y[3] <- Input[[m]][3]
  x[3] <- 3
  y[4] <- Input[[m]][4]
  x[4] <- 4

  Points <- data.frame(x, y)

  # Example Plot
  plot = ggplot() + labs(title = paste("Loop m = ",m)) + labs(subtitle = paste("y-values = ",force(Points$y))) + 
    geom_tile(data = helpergrid, aes(x=xgrid, y=ygrid, fill=1), colour="grey20") + 
    geom_point(data = Points, aes(x=x, y=y), stroke=3, size=5, shape=1, color="white") + theme_minimal()

  # Plot to plotlist
  plotlist[[m]] <- plot

  # --- Plot plotlist within loop ---
  print(plotlist[[m]])
}

# --- Plot plotlist outside of loop ---
print(plotlist[[1]])
print(plotlist[[2]])

I believe this happens because of lazy evaluation. The data passed into geom_tile/geom_point gets stored with the plot, but the expressions inside aes are only evaluated when the plot is printed, at which point Points$x and Points$y are looked up in the current environment. During the loop, Points is in the desired state for that iteration. After the loop has finished, only the second version of Points exists, so every plot printed afterwards takes its values from that version. Hope this is clear, feel free to ask further if not.
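The idea of a stored expression that is evaluated later can be sketched in base R, without ggplot2. This is only an analogy: `quote` and `eval` stand in for what `aes` and printing do (ggplot2 itself captures the mapping as rlang quosures), and the variable names `stored`, `y_now` and `y_later` are made up for the illustration.

```r
Points <- data.frame(x = 1:4, y = c(3, 3, 3, 3))

# Store the expression itself, not its current value
stored <- quote(Points$y)

# Evaluating now sees the first version of Points
y_now <- eval(stored)

# Overwrite Points, as the second loop iteration does
Points <- data.frame(x = 1:4, y = c(1, 1, 1, 1))

# Evaluating the same stored expression later sees the new Points
y_later <- eval(stored)

print(y_now)
print(y_later)
```

The same expression gives different results depending on when it is evaluated, which is the situation the plots printed after the loop are in.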

To clarify, if you remove Points$ and just refer to x within aes, it takes these values from the data.frame as it was passed into the data argument of the geom calls.
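One way to check this without looking at the rendered image is to inspect the data ggplot2 computes for a layer. The sketch below assumes `layer_data()` from ggplot2 is available; the names `p_outside`, `p_columns`, `y_outside` and `y_columns` are only illustrative. Recent ggplot2 versions may also warn that the use of `Points$x` is discouraged.

```r
library(ggplot2)

Points <- data.frame(x = 1:4, y = c(3, 3, 3, 3))

# Mapping that reaches outside the layer data via Points$...
p_outside <- ggplot() +
  geom_point(data = Points, aes(x = Points$x, y = Points$y))

# Mapping that uses bare column names from the layer data
p_columns <- ggplot() +
  geom_point(data = Points, aes(x = x, y = y))

# Overwrite Points after both plots were created
Points <- data.frame(x = 1:4, y = c(1, 1, 1, 1))

# layer_data() builds the plot and returns the computed data of a layer
y_outside <- layer_data(p_outside)$y  # follows the overwritten Points
y_columns <- layer_data(p_columns)$y  # keeps the data passed to the layer

print(y_outside)
print(y_columns)
```

Here `y_columns` should still hold the original values of 3, while `y_outside` should hold the values from the overwritten data frame, matching what the question observed outside the loop.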

Thank you, you [aocall](https://stackoverflow.com/users/4747043/aocall) are an angel and saving my day. Also thank you for the explanation and reference!! — bambamfox, Jul 15 '19 at 09:49
