Why do variable lookups in the body of function A take values from the global environment but not function B that calls A?

Question

I defined a function:

.get <- function( o, ...) {
    p <- match.call( expand.dots = 0)$...
    cat( sprintf( 'In .get, it is %s.\n', eval( tail( p, 1)[[ 1]])))
    fn <- switch( typeof( o), list =, environment = `[[`, 'S4' = '@', `[`)
    if( length( p)) eval( as.call( c( fn, quote( o), p))) else o # Here when true, I compose a call based on p.
}

Then I tried it as follows:

it <- 1
m <- matrix( seq( 9), 3)
sapply( seq( 3), function( it) {
    cat( sprintf( 'In sapply, it is: %s.\n', it))
    .get( m, , it)
})
sapply( seq( 3), function( it) .get( m, , it))

The output:

In sapply, it is: 1.
In .get, it is 1.
In sapply, it is: 2.
In .get, it is 1.
In sapply, it is: 3.
In .get, it is 1.
     [,1] [,2] [,3]
[1,]    1    1    1
[2,]    2    2    2
[3,]    3    3    3

But the expected output is:

In sapply, it is: 1.
In .get, it is 1.
In sapply, it is: 2.
In .get, it is 2.
In sapply, it is: 3.
In .get, it is 3.
     [,1] [,2] [,3]
[1,]    1    4    7
[2,]    2    5    8
[3,]    3    6    9

So why does `it` inside .get not take the values 1 to 3 (its value where .get is called), but always the value assigned in the global environment (i.e. 1)?

How would you make a proof of concept? What would you have to change to show what is actually going on? Your example is quite complex, perhaps you should start with a simple function that prints variables in different environments? Here is an example where `pa` takes value of `a` from `pb`, not global environment. `a <- 1; pa <- function(a) {print(a)}; pb <- function(a) {a <- "inside";pa(a)};pb(a)`. — Roman Luštrik, Apr 23 '14 at 08:03
Thanks for your comment, Roman! I added some printed messages to better illustrate the situation as you suggested. As for your small example, I think it is indeed quite simple and behaves as expected, but it does not help solve my question. Thanks again! — calvin, Apr 24 '14 at 05:29

score 10 · Accepted Answer · edited May 23 '17 at 10:32

Did you define .get in the global environment together with `it`? If so, then this might be a scoping issue. See here and here for an excellent discussion.

Look at this example from the first link:

 a = 1
 b = 2

 fun <- function(x){ a + b*x }

 new.fun <- function(x){
     a = 2
     b = 1
     fun(x)
 }

 new.fun(2)

If fun called within new.fun uses a and b from the global environment, we expect the outcome of new.fun(2) to be 1+2*2=5, whereas if it uses the values defined in the function new.fun, then it should be 2+1*2=4. Now, most people expect the outcome to be 4, but it will be 5. Why? Because fun was defined in the global environment, and hence the global variables a and b matter for fun. To see this, inspect the environment attached to the function: environment(fun), or list(environment(fun)), shows that the function "remembers" that it was defined in the global environment. For that reason, when fun does not find a and b among its own arguments and local variables, it looks for them there, not in the function that called it.
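To make the difference between "where a function was defined" and "where it was called" concrete, here is a minimal sketch. The names where and caller are hypothetical, made up for this illustration only.

```r
a <- "global"

# Hypothetical helper: reports the value of `a` found in two different ways.
where <- function() {
    c(lexical = a,                                 # free variable: resolved via the defining environment
      dynamic = get("a", envir = parent.frame()))  # explicit lookup in the caller's frame
}

# Hypothetical caller that has its own local `a`.
caller <- function() {
    a <- "caller"
    where()
}

result <- caller()
print(result)
```

The plain reference to a ignores the local variable in caller, which is the same thing that happens to a and b in fun above. Only the explicit lookup through parent.frame() sees the caller's value.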

To address the issue, many workarounds have been proposed, several of which can be found by searching for "lexical scoping". For background information, Hadley Wickham's upcoming book has an excellent section on environments, see here. For potential solutions, see, for instance, here. One way to solve your issue is to overwrite the environment. For instance,

 new.fun2 <- function(x){
     a = 2
     b = 1
     environment(fun) = environment()
     fun(x)
 }

 new.fun2(2)

now gives 4 as the answer, using a=2, b=1 as defined inside new.fun2, as opposed to the global environment. I am sure there are many more elegant solutions though.
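One detail worth checking with this workaround: the assignment environment(fun) <- environment() inside a function does not modify the global fun. It creates a copy of fun that is local to new.fun2. A small sketch that repeats the definitions above and then calls both versions:

```r
a <- 1
b <- 2
fun <- function(x) a + b * x

new.fun2 <- function(x) {
    a <- 2
    b <- 1
    # Replacement assignment: makes a local copy of `fun` whose enclosing
    # environment is this call's frame. The global `fun` is not touched.
    environment(fun) <- environment()
    fun(x)
}

inside  <- new.fun2(2)  # uses a = 2, b = 1
outside <- fun(2)       # still uses the global a = 1, b = 2
print(c(inside = inside, outside = outside))
```

So the effect is limited to the body of new.fun2, and the rebinding has to be repeated in every function that needs it.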

That is, in your case, using

 sapply( seq( 3), function(it) {
   cat( sprintf( 'In sapply, it is: %s.\n', it))
   environment(.get) <- environment()
   .get( m, , it)
 })

works.
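An alternative that avoids rebinding the function's environment, offered as a sketch and not as a definitive fix: since .get captures its ... arguments unevaluated with match.call, it can evaluate them in the frame it was called from by passing parent.frame() to eval. The name .get2 is hypothetical, the diagnostic cat line is dropped, and the switch is reduced to the list/environment and default cases for brevity.

```r
# Hypothetical variant of .get: evaluate the captured arguments where
# .get2 was called, not inside .get2 itself.
.get2 <- function(o, ...) {
    p <- match.call(expand.dots = FALSE)$...   # unevaluated ... arguments
    fn <- switch(typeof(o), list = , environment = `[[`, `[`)
    if (!length(p)) return(o)
    # `o` is taken from the list; every other symbol (such as `it`) is
    # looked up starting in the caller's frame.
    eval(as.call(c(fn, quote(o), p)), list(o = o), parent.frame())
}

it <- 1
m <- matrix(seq(9), 3)
res <- sapply(seq(3), function(it) .get2(m, , it))
print(res)
```

One caveat of this design: a variable named o in the caller would be shadowed by the o supplied in the list, so index expressions that refer to such a variable would need a different name.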

Dear coffeinjunky, I think you are right. Then how to solve this? I think such behaviors are not so expected, right? It is more **natural** to assume that in your case `a` and `b` are to be taken from `new.fun`, rather than from somewhere not so relevant. Is there any way to go the natural way in R? — calvin, Apr 25 '14 at 15:58
Bindings in closures could be useful, but I think it is more reasonable to **first** look up variables in the most relevant places and, if that fails, then use the bindings in closures. Why doesn't R work this way? Or do you know which programming languages do? Thanks! — calvin, Apr 25 '14 at 16:05
Wait, `it` is in the closure of `sapply( ...)`, not with `.get`. If you are right, then I don't understand why `sapply( seq( 3), function( it) it)` outputs `[1] 1 2 3` as expected... — calvin, Apr 25 '14 at 16:11
So, basically, `fun` "remembers" that it was defined in the environment `.GlobalEnv`, so it looks there for any variables. Try `str(fun)` and observe that there is an environment attached. Then, try `list(environment(fun))` to see that it knows it was defined in the global environment. For your `sapply` function, `it` is clearly defined as the argument of the function, thus a local variable. — coffeinjunky, Apr 25 '14 at 16:51
OK, I see. Now I use `with` in the body of `.get`. It seems fine now :) Thanks, coffeinjunky! — calvin, Apr 26 '14 at 03:38
Really thanks a lot for such detailed answers! Now I would like to find one of the most 'economic' solutions in terms of overhead (of both coding and computation). — calvin, Apr 27 '14 at 04:51

score 2 · Answer 2 · answered Apr 28 '14 at 22:02

Another solution is using constructors:

make.get <- function(it){
    it <- it  # force the argument so its value is stored in this environment
    .get <- function( o, ...) {
        p <- match.call( expand.dots = 0)$...
        cat( sprintf( 'In .get, it is %s.\n', eval( tail( p, 1)[[ 1]])))
        fn <- switch( typeof( o), list =, environment = `[[`, 'S4' = '@', `[`)
        if( length( p)) eval( as.call( c( fn, quote( o), p))) else o # Here when true, I compose a call based on p.
    }
    .get  # return the closure explicitly
}

it <- 1
m <- matrix( seq( 9), 3)
sapply( seq( 3), function(it) {
    cat( sprintf( 'In sapply, it is: %s.\n', it))
    .get <- make.get(it)
    .get( m, , it)
})

I think this should work. But I feel it is a bit too much code... Also, the names and number of variables passed as parameters could change. — calvin, May 02 '14 at 13:30

score 2 · Answer 3 · answered Apr 30 '14 at 16:55


In languages with lexical scope, a function resolves its free variables in the environment where it was defined (for a function defined at top level, that means the global variables), in addition to its own parameters and local variables. In languages with dynamic scope, functions resolve them in the environment of the caller. Lexical scoping has won because it is easier to reason about (and in stateless languages like Haskell, it even provides referential transparency). There are still some dynamically scoped languages, like bash and (optionally) Common Lisp. It's interesting that you expected dynamic scoping as a default; at some point I had the same expectation.


seewalker


For referential transparency, I 'heard' that it means that given a function and an input value, you will always receive the same output. So if the input value is a variable, and one gets the same result everywhere (given one does not change its global value), then what is the purpose of using variables? Also, in such cases, one needs to care about what has been done to a variable, which is not so 'transparent'. Actually, in such cases, isn't dynamic scoping more tractable? It is more likely that one knows what has been done to free local variables. – calvin May 02 '14 at 14:15
Sorry, should be 'free variables', not 'free local variables'. – calvin May 02 '14 at 14:33
