  lapply(7:12, function(x) ggplot(mydf)+geom_histogram(aes(mydf[,x])))

gives the error `Error in [.data.frame(mydf, , x) : undefined columns selected`.

I have used several SO questions (e.g. this) as guidance, but can't figure out my error.

  • The problem is that `ggplot` doesn't know about `x` when you call it inside of `aes()`. – Señor O Nov 07 '14 at 16:43
  • Think about the scope of these variables. X is within aes() which is within ggplot() that is within the function(). – squishy Nov 07 '14 at 16:45
  • @ahburr No, `aes` is evaluated later when x is out of scope. – Señor O Nov 07 '14 at 16:49
  • You should be using `aes_string(x)` here rather than `aes(mydf[,x])`. Please read up on the differences. – MrFlick Nov 07 '14 at 17:08
  • "ggplot doesn't know about x when you call it inside of aes()" - that means there is a scope issue. Understanding scope of your variables when writing these lines of R code with multiple functions interacting / within each other is often overlooked and easy to mix up. I think the author, @koenbro, may learn a lot from thinking about this problem in terms of scope. – squishy Nov 07 '14 at 17:23

1 Answer

The code below works with the built-in mtcars dataset. For your data, replace mtcars with mydf and 1:3 with 7:12.

library(ggplot2)
lapply(1:3,function(i) {
  ggplot(data.frame(x=mtcars[,i]))+
    geom_histogram(aes(x=x))+
    ggtitle(names(mtcars)[i])
  })

Notice how the reference to i (the column index) was moved from the mapping argument (the call to aes(...)), to the data argument.

Your problem is actually quite subtle. ggplot evaluates the arguments to aes(...) first in the context of your data, i.e. it looks for column names in mydf. If that fails, it falls back to the global environment. It does not look in the environment of the function where the index (x in your code, i in mine) is defined. See this post for another example of this behavior and some discussion.

The bottom line is that it is a really bad idea to use external variables in a call to aes(...). However, the data=... argument does not suffer from this. If you must refer to a column number, etc., do it in the call to ggplot(data=...).
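As a minimal sketch of that advice, the pattern can be wrapped in a small helper. The name hist_by_column, the column name value, and the bins default are my own choices for illustration, not anything from ggplot2 or from the question. The column index is resolved in plain R when the one-column data frame is built, so aes() only ever refers to a fixed column name.

```r
library(ggplot2)

# Hypothetical helper (illustration only): one histogram per column index.
# `cols` is assumed to be a vector of integer column positions.
hist_by_column <- function(df, cols, bins = 30) {
  lapply(cols, function(i) {
    # The index i is used here, in ordinary R code, to build the data
    # passed to ggplot(); aes() never sees i.
    plot_df <- data.frame(value = df[[i]])
    ggplot(plot_df, aes(x = value)) +
      geom_histogram(bins = bins) +
      ggtitle(names(df)[i])
  })
}

# Returns a list of three ggplot objects, one per column.
plots <- hist_by_column(mtcars, 1:3)
```

Printing an element, e.g. print(plots[[1]]), draws that histogram. With your data the call would presumably be hist_by_column(mydf, 7:12).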

jlhoward