First of all, excuse me for the bad title. I'm still so confused by this behavior that I wasn't able to describe it; however, I was able to reproduce it and break it down to a (goofy) example.

Please, could you be so kind and explain why other.list appears to be full of NULLs after calling lapply()?

some.list <- rep(list(rnorm(1)),33)
other.list <- rep(list(), length = 33)

lapply(seq_along(some.list), function(i, other.list) {
  other.list[[i]] <- some.list[[i]]
  browser()
}, other.list)

I watched this in debugging mode in RStudio. For certain i, other.list[[i]] gets some.list[[i]] assigned, but it will be NULLed for the next iteration. I want to understand this behavior so bad!

Robert Kirsten
  • Because ... you did not assign the result from the function call to the named item "other.list" (and the "other.list" inside the function is not really accumulating the results, either). The super-assignment offered below is NOT the usual approach to working with data objects by experienced R users. The usual paradigm (assuming you were very confident of your function's behavior) would be `other.list <- lapply(seq_along(some.list), , some.list)`. Which is a somewhat long-winded way of saying: `other.list <- lapply(some.list, )` – IRTFM Jun 08 '15 at 17:33

2 Answers

The reason is that the assignment takes place inside a function, and you've used the normal assignment operator <- rather than the superassignment operator <<-. While a function is executing, the normal assignment operator always assigns to a local variable in the evaluation environment created for that particular call (this is what environment() returns when called from inside the function with fun=NULL). Function arguments live in that same environment, so the other.list you pass in is a local variable too. Your global other.list, which is defined in the global environment (returned by globalenv()), is therefore never touched by such an assignment, and each call of the function starts again from the unmodified global value. That is why the element assigned in one iteration appears to be NULL again in the next.

The superassignment operator, on the other hand, skips the evaluation environment and walks up the chain of enclosing environments (which you can follow yourself via parent.env()) until it finds a variable with the name on the LHS of the assignment, and then it assigns to that. For a function defined at top level, the first enclosing environment is the global environment. If no such variable is found anywhere on the chain, the superassignment operator creates one in the global environment.

Thus, if you change <- to <<- in the assignment that takes place inside the function, you will be able to modify the global other.list variable.

See https://stat.ethz.ch/R-manual/R-devel/library/base/html/assignOps.html.
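To make the difference concrete, here is a minimal sketch of both operators side by side. The names local.target and global.target are made up for this illustration, and the three-element some.list stands in for the list from the question.

```r
some.list <- as.list(1:3)

# Normal assignment: other.list is an argument, so it lives in the
# evaluation environment of each call and is discarded when the call ends.
local.target <- vector("list", 3)   # hypothetical name, list of 3 NULLs
invisible(lapply(seq_along(some.list), function(i, other.list) {
  other.list[[i]] <- some.list[[i]]  # modifies the local copy only
}, local.target))
all(vapply(local.target, is.null, logical(1)))  # TRUE: still all NULL

# Superassignment: the target is looked up in the enclosing environments,
# so the variable defined outside the function is the one that changes.
global.target <- vector("list", 3)  # hypothetical name
invisible(lapply(seq_along(some.list), function(i) {
  global.target[[i]] <<- some.list[[i]]
}))
identical(global.target, some.list)  # TRUE
```

This only illustrates the scoping rules; as the comments on this page point out, relying on <<- for this is generally discouraged.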

Here, I tried to make a little demo to demonstrate these concepts. In all my assignments, I'm assigning the actual environment that contains the variable being assigned to:

oldGlobal <- environment(); ## environment() is same as globalenv() in global scope
(function() {
    newLocal1 <- environment(); ## creates a new local variable in this function evaluation's evaluation environment
    print(newLocal1); ## <environment: 0x6014cbca8> (different for every evaluation)
    oldGlobal <<- parent.env(environment()); ## target search hits oldGlobal in closure environment; RHS is same as globalenv()
    newGlobal1 <<- globalenv(); ## target search fails; creates a new variable in the global environment
    (function() {
        newLocal2 <- environment(); ## creates a new local variable in this function evaluation's evaluation environment
        print(newLocal2); ## <environment: 0x6014d2160> (different for every evaluation)
        newLocal1 <<- parent.env(environment()); ## target search hits the existing newLocal1 in closure environment
        print(newLocal1); ## same value that was already in newLocal1
        oldGlobal <<- parent.env(parent.env(environment())); ## target search hits oldGlobal two closure environments up in the chain; RHS is same as globalenv()
        newGlobal2 <<- globalenv(); ## target search fails; creates a new variable in the global environment
    })();
})();
oldGlobal; ## <environment: R_GlobalEnv>
newGlobal1; ## <environment: R_GlobalEnv>
newGlobal2; ## <environment: R_GlobalEnv>
bgoldst
  • This is rather interesting code, not so much for the reporting of environments but for the use of local functions. It suggests you come from another corner of the programming universe. (You _should_ drop the use of the unnecessary semicolons, though. Their only effect is to slow your code down.) The use of the super-assignment is generally advised against. Its use here as a teaching tool is reasonable, but more general use is deprecated. – IRTFM Jun 08 '15 at 16:56

I haven't run your code, but two observations:

  1. I usually avoid putting browser() as the last line inside a function, because the value of the last evaluated expression becomes the function's return value.

  2. other.list does not get modified by your lapply. You need to understand the basics of environments: any bindings you make inside the function passed to lapply do not hold outside of it. This is by design; lapply is meant to be used for its return value, not for side effects. You can use the <<- operator instead of <-, though I don't recommend that, or you can use the assign function. Or you can do it properly, the way lapply is meant to be used:

    other.list <- lapply(seq_along(some.list), function(i) { some.list[[i]] })

Note that it's generally recommended not to make assignments inside lapply that change variables outside of it. lapply is meant to apply a function to every element and return a list, and that list should be all that lapply is used for.
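As a small sketch of that recommended pattern: the result is built entirely from lapply's return value, with no assignment inside the function. The five-element some.list, the set.seed call, and the name doubled are assumptions added only to make the example self-contained and reproducible.

```r
set.seed(1)  # only so the illustrative data is reproducible
some.list <- lapply(1:5, function(k) rnorm(1))

# Build the new list from what lapply returns; nothing is assigned
# to an outside variable from within the function.
other.list <- lapply(seq_along(some.list), function(i) some.list[[i]])

# When the index itself is not needed, iterate over the elements directly.
doubled <- lapply(some.list, function(x) 2 * x)  # hypothetical name

identical(other.list, some.list)  # TRUE
```

If the real function needs both the index and the element, the seq_along form keeps both available while still returning each result instead of storing it.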

DeanAttali
  • I wasn't trying to copy the list. The code from my question was a simplification of what I was initially doing. But I wasn't aware of browser() being a return statement. Good to know! – Robert Kirsten Apr 26 '15 at 22:23
  • @RobertKirsten: It's not that `browser()` is a "return statement" but rather that the value of the last expression evaluated in a function is that function's return value. – IRTFM Jun 08 '15 at 17:02
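
The point about the last evaluated expression can be sketched as follows. The helper pause() is a hypothetical stand-in for a debugging call; it is assumed to return NULL and is used here only so the example runs non-interactively.

```r
# Hypothetical stand-in for a debugging call; assumed to return NULL.
pause <- function() invisible(NULL)

ends.with.pause <- function(x) {
  y <- x + 1
  pause()   # last expression evaluated, so its value (NULL) is returned
}

ends.with.value <- function(x) {
  y <- x + 1
  pause()
  y         # last expression evaluated, so y is returned
}

is.null(ends.with.pause(1))  # TRUE
ends.with.value(1)           # 2
```

Moving the debugging call above the final expression, as in the second function, leaves the return value intact.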