
I am reading Hadley Wickham's book on GitHub, in particular the part on lazy evaluation. There he gives an example of the consequences of lazy evaluation, using the add/adders functions. Let me quote that bit:

This [lazy evaluation] is important when creating closures with lapply or a loop:

add <- function(x) {
  function(y) x + y
}
adders <- lapply(1:10, add)
adders[[1]](10)
adders[[10]](10)

x is lazily evaluated the first time that you call one of the adder functions. At this point, the loop is complete and the final value of x is 10. Therefore all of the adder functions will add 10 on to their input, probably not what you wanted! Manually forcing evaluation fixes the problem:

add <- function(x) {
  force(x)
  function(y) x + y
}
adders2 <- lapply(1:10, add)
adders2[[1]](10)
adders2[[10]](10)

I do not understand that bit, and the explanation there is minimal. Could someone please elaborate on that particular example and explain what happens there? I am specifically puzzled by the sentence "at this point, the loop is complete and the final value of x is 10". What loop? What final value, and where? It must be something simple I am missing, but I just don't see it. Thanks a lot in advance.

Maxim.K
  • Note that the answer to this question has changed as of R 3.2.0; see my answer below. – Eike P. Jun 08 '15 at 09:09
  • Complement to @jhin's comment: While `lapply()` has changed in recent R, the function `purrr::map()`, which is intended to be used wherever `lapply()` is, still behaves like the old `lapply()` vis-à-vis shared environments of closures. However, I wouldn't count on this “anachronism” of `purrr::map()` to stick around, as it will likely be rectified in future versions. – egnha Oct 21 '16 at 16:31
  • @jhin Actually, I guess hadley's tutorial is built directly from GitHub, so reading it after R 3.2.0 is now quite bizarre, as that release made the whole section about lazy evaluation in that tutorial moot: there is no longer any difference between the outputs of `adders` and `adders2`! – green diod Dec 23 '16 at 23:13

2 Answers


This is no longer true as of R 3.2.0!

The corresponding line in the change log reads:

Higher order functions such as the apply functions and Reduce() now force arguments to the functions they apply in order to eliminate undesirable interactions between lazy evaluation and variable capture in closures.

And indeed:

add <- function(x) {
  function(y) x + y
}
adders <- lapply(1:10, add)
adders[[1]](10)
# [1] 11
adders[[10]](10)
# [1] 20
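
If the same code has to run on both sides of that change, one option is to check the running version before relying on the new behaviour. The following is only a sketch: `results` is a name introduced here for illustration, and the version test uses base R's `getRversion()`.

```r
add <- function(x) {
  function(y) x + y
}

# Build three adders and call each of them with 10.
adders <- lapply(1:3, add)
results <- vapply(adders, function(f) f(10), numeric(1))

# Only from R 3.2.0 onwards does lapply() force the argument for us.
if (getRversion() >= "3.2.0") {
  stopifnot(identical(results, c(11, 12, 13)))
}
```

Keeping `force(x)` inside `add` remains harmless on newer versions, so code that must behave the same everywhere can simply leave it in.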
Eike P.

The goal of:

adders <- lapply(1:10, function(x) add(x))

is to create a list of adder functions: the first adds 1 to its input, the second adds 2, and so on. Because of lazy evaluation, R does not evaluate the argument x when add is called; it only does so the first time one of the adder functions is actually called. The problem (in R versions before 3.2.0) is that, after the first adder function has been created, the lapply loop moves on to the next values, ending at 10. When you then call the first adder function, x is finally evaluated, but the value it refers to is no longer 1: it is the value at the end of the lapply loop, i.e. 10.

Therefore, lazy evaluation makes every adder function wait until after the lapply loop has completed before looking up x, and they all end up with the same value, i.e. 10. The solution Hadley suggests is to force x to be evaluated immediately, which bypasses lazy evaluation and gives each function its correct x value.
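
The same mechanism can still be reproduced in current R versions with an explicit for loop, which does not force arguments the way lapply now does. This is a minimal sketch; the names `add_lazy`, `add_forced`, `lazy_adders` and `forced_adders` are made up for illustration.

```r
# Hypothetical names for illustration: add_lazy, add_forced, lazy_adders, forced_adders.
add_lazy <- function(x) {
  function(y) x + y   # x is still an unevaluated promise when this closure is returned
}
add_forced <- function(x) {
  force(x)            # evaluate x now, while the caller's value is still current
  function(y) x + y
}

lazy_adders <- vector("list", 3)
forced_adders <- vector("list", 3)
for (i in 1:3) {
  lazy_adders[[i]] <- add_lazy(i)
  forced_adders[[i]] <- add_forced(i)
}

# Each closure has its own environment ...
identical(environment(lazy_adders[[1]]), environment(lazy_adders[[2]]))
# [1] FALSE

# ... but every unforced promise looks up `i`, which is 3 once the loop has finished.
lazy_adders[[1]](10)
# [1] 13
forced_adders[[1]](10)
# [1] 11
```

So the adders do not share one x; each holds its own promise, and all of those promises point back at the same loop variable in the calling environment.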

Paul Hiemstra
  • Ok, let me rephrase that to see whether I am getting it right. When we call `lapply`, R sort of remembers the structure of all 10 adder functions, but does not evaluate x yet. When we call the first adder function, R says, aha, let's see what that is, takes x, which already is 10 at that point from the lapply call, and evaluates the first called adder function as 10 + y. Same for the remaining adder functions, rendering them all identical. Probably crudely put, but is that the logic of it? – Maxim.K Apr 21 '13 at 10:26
  • I believe that this is the case. – Paul Hiemstra Apr 21 '13 at 11:10
  • @hadley When I call the first adder function, the lapply loop is already over. Where exactly does the adder function look to find x? Why does the value of x = 10 persist? – Heisenberg Aug 12 '14 at 22:54
  • How does the lazy evaluation actually work? All ten different adder functions each have ten separate environments in which to contain `x`. I suppose maybe they all point to somewhere prior to getting evaluated, but point to where? There's no `x` in the parent environment. – peterhurford Apr 19 '15 at 17:29
  • The environment is created when the function is called for the first time. The x variable is equal to 10 at that time after the lapply loop finished. So they are all the same. – Paul Hiemstra Apr 19 '15 at 17:46
  • By the way, my example code did not include an x, whereas the example code in the question does. I edited my answer to remedy this. – Paul Hiemstra Apr 19 '15 at 18:06