Nested function environment selection

Question

I am writing some functions for doing repeated tasks, but I am trying to minimize the number of times I load the data. Basically I have one function that takes some information and makes a plot. Then I have a second function that will loop through and output multiple plots to a .pdf. In both functions I have the following line of code:

if(load.dat) load("myworkspace.RData")

where load.dat is a logical and the data I need is stored in myworkspace.RData. When I am calling the wrapper function that loops through and outputs multiple plots I do not want to reload the workspace in every call to the inner function. I thought I could just load the workspace once in the wrapper function, then the inner function could access that data, but I got an error stating otherwise.

So my understanding was when a function cannot find the variable in its local environment (created when the function gets called), the function will look to the parent environment for the variable.

I assumed the parent environment to the inner function call would be the outer function call. Obviously this is not true:

func1 <- function(...){
  print(var1)
}

func2 <- function(...){
  var1 <- "hello"
  func1(...)
}

> func2()
Error in print(var1) : object 'var1' not found

After reading numerous questions, the language manual, and this really helpful blog post, I came up with the following:

var1 <- "hello"
save(list="var1",file="test.RData")
rm(var1)

func3 <- function(...){
  attach("test.RData")
  func1(...)
  detach("file:test.RData")
}

> func3()
[1] "hello"

Is there a better way to do this? Why doesn't func1 look for undefined variables in the local environment created by func2, when it was func2 that called func1?

Note: I did not know how to name this question. If anyone has better suggestions I will change it and edit this line out.

Lexical scoping means the function will look for undefined symbols in its parent environment, which is not necessarily the calling environment. Check this also: https://github.com/hadley/devtools/wiki/Environments — Ferdinand.kraft, Aug 20 '13 at 15:23
@Ferdinand.kraft Thanks for the link. I will work through that this afternoon. — dayne, Aug 20 '13 at 15:27
If your data is in form of dataframes, you could use package `data.table`, and pass your tables as an argument to `func1` inside `func3`. This package works by reference and does not make unwanted copies of your data. — Ferdinand.kraft, Aug 20 '13 at 15:31
Not quite sure why it isn't seeing `var1`, but note that `print(parent.frame()$var1)` works fine. — Richie Cotton, Aug 20 '13 at 15:47
@RichieCotton, `func1` looks for `var1` in its enclosing environment, i.e., where it belongs, which happens to be `R_GlobalEnv`. The call to `parent.frame()` inside `func1` inside `func2` returns the evaluation environment of `func2`, where `var1` belongs. (boy what a mess :-) — Ferdinand.kraft, Aug 20 '13 at 17:56
@dayne, I think you should be more specific in your question. What does your data look like? — Ferdinand.kraft, Aug 20 '13 at 19:37
@Ferdinand.kraft My data is two different data frames. I knew this question was kind of vague, but I am trying to better understand the environment definitions in R. Really I just do not understand why `func1` cannot see the environments created by `func2` or `func3`. Is there any way to get around this? Or to make functions look to other function environments? — dayne, Aug 20 '13 at 19:43
@dayne, it is intentional that `func1` cannot see those environments. When you type `func1 <- function...` in the console, you are creating an object of type closure which has an environment property, equal to `R_GlobalEnv`. This is where R will look for symbols not resolved in the evaluation of `func1`'s body. The *evaluation* environment created during the execution of `func2` or `func3` is *irrelevant* WRT symbol lookup. A workaround is to use `parent.frame()$var1`, as Richie pointed out above, but it is very ugly. — Ferdinand.kraft, Aug 20 '13 at 19:52
@Ferdinand.kraft If you want to post that as an answer I will accept it. Thanks for your continued attention today! — dayne, Aug 20 '13 at 20:27
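
The `parent.frame()` workaround mentioned in the comments can be sketched as follows. This is a minimal illustration that reuses the question's `func1`/`func2` names but changes `func1` to take no arguments and to look in its caller's frame explicitly; it is not a recommendation, since passing the data as an argument is usually cleaner.

```r
# func1 is defined at top level, so its enclosure is wherever this code is run
# (R_GlobalEnv when typed in the console). Lexical lookup alone cannot see
# variables that live in func2's evaluation frame.
func1 <- function() {
  # Explicit dynamic lookup: start the search in the caller's frame.
  get("var1", envir = parent.frame())
}

func2 <- function() {
  var1 <- "hello"   # lives only in func2's evaluation frame
  func1()
}

result <- func2()
print(result)       # [1] "hello"
```

Calling `environment(func1)` at the console shows the enclosure that ordinary lookup would use instead.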

score 9 · Accepted Answer · edited May 23 '17 at 11:46

To illustrate lexical scoping, consider the following:

First let's create a sandbox environment, only to avoid the oh-so-common R_GlobalEnv:

sandbox <- new.env()

Now we put two functions inside it: f, which looks for a variable named x; and g, which defines a local x and calls f:

sandbox$f <- function()
{
    value <- if(exists("x")) x else "not found."
    cat("This is function f looking for symbol x:", value, "\n")
}

sandbox$g <- function()
{
    x <- 123
    cat("This is function g. ")
    f()
}

Technicality: entering function definitions in the console causes them to have the enclosing environment set to R_GlobalEnv, so we manually force the enclosures of f and g to match the environment where they "belong":

environment(sandbox$f) <- sandbox
environment(sandbox$g) <- sandbox

Calling g. The local variable x=123 is not found by f:

> sandbox$g()
This is function g. This is function f looking for symbol x: not found.

Now we create an x in the global environment and call g. The function f will look for x first in sandbox, and then in the parent of sandbox, which happens to be R_GlobalEnv:

> x <- 456
> sandbox$g()
This is function g. This is function f looking for symbol x: 456

Just to check that f looks for x first in its enclosure, we can put an x there and call g:

> sandbox$x <- 789
> sandbox$g()
This is function g. This is function f looking for symbol x: 789

Conclusion: symbol lookup in R follows the chain of enclosing environments, not the evaluation frames created during execution of nested function calls.

EDIT: Just adding a link to this very interesting answer from Martin Morgan on the related subject of parent.frame() vs parent.env()
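
For readers without access to that link, here is a small sketch of the distinction between `parent.env()` and `parent.frame()`. It rebuilds a sandbox like the one above; the function `caller` and the returned list are illustrative names only.

```r
sandbox <- new.env()

sandbox$f <- function() {
  enclosure <- parent.env(environment())  # where f "belongs" (lexical parent)
  caller    <- parent.frame()             # the frame f was called from
  list(enclosure = enclosure, caller = caller)
}
environment(sandbox$f) <- sandbox

caller <- function() {
  here <- environment()                   # this call's evaluation frame
  envs <- sandbox$f()
  c(enclosure_is_sandbox = identical(envs$enclosure, sandbox),
    caller_is_this_frame = identical(envs$caller, here))
}

checks <- caller()
print(checks)                             # both entries are TRUE
```

Symbol lookup follows the first of these (the enclosure chain); the second only records who made the call.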

This is the best illustration I have seen. Thank you so much! I was not really understanding the difference in environments and frames. — dayne, Aug 21 '13 at 12:54

score 2 · Answer 2 · answered Aug 20 '13 at 16:02

You could use closures:

f2 <- function(...){
  f1 <- function(...){
    print(var1)
  }
  var1 <- "hello"
  f1(...)
}
f2()

Karl Forner


Right, but I need to be able to use the inner function as a stand-alone function. I did not want to have to redefine the inner function every time I call the outer function (not to mention duplicate a bunch of code). – dayne Aug 20 '13 at 16:24
Then the cleanest setting in my opinion: put all your data in a list (my_data), then give it as argument to your function. Inside the function you may use with(my_data, { } ) to avoid extra typing. – Karl Forner Aug 21 '13 at 08:04
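
A rough sketch of that last suggestion, combined with loading the file only once. Everything here is assumed for illustration: the data frames `df1` and `df2`, the temporary file, and the function names `load_data`, `inner` and `wrapper` are stand-ins, and the arithmetic replaces the real plotting code.

```r
# Hypothetical stand-ins for the two data frames kept in the workspace file.
df1 <- data.frame(x = 1:3, y = c(2, 4, 6))
df2 <- data.frame(x = 1:3, y = c(1, 1, 2))
path <- tempfile(fileext = ".RData")
save(df1, df2, file = path)
rm(df1, df2)

# Load the file into a private environment and return its contents as a list.
load_data <- function(path) {
  e <- new.env()
  load(path, envir = e)
  as.list(e)
}

# Inner function: stand-alone, receives the data as an argument.
inner <- function(my_data, i) {
  with(my_data, {
    # df1 and df2 are resolved inside my_data; i comes from inner's own frame.
    df1$y[i] - df2$y[i]
  })
}

# Wrapper: loads once, then hands the same list to every inner call.
wrapper <- function(path) {
  my_data <- load_data(path)
  vapply(seq_len(nrow(my_data$df1)), function(i) inner(my_data, i), numeric(1))
}

diffs <- wrapper(path)
print(diffs)   # 1 3 4
```

Because `inner` takes the data explicitly, it can still be called on its own with `inner(load_data(path), 1)`.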
