3

I'm having some trouble figuring out what's exactly going on here with respect to "environment nesting"/lexical scoping:

The problem

The default value of argument where in function getClasses() seems to vary depending on whether getClasses() is called inside a standard R function or a formal S4 method. It is controlled by .externalCallerEnv() which seems to be "object" to lazy evaluation and thus causes the variation (see EDIT below)

The question

When called from inside a formal S4 method, how do I set where to the same value that is the default value when getClasses() is called inside a standard function?


ILLUSTRATION

Below you'll find a short illustration of the "problematic behavior" described above

1) Custom classes

I've got numerous class defs that are currently sourced to .GlobalEnv.

Let's take this one as a representative for all of them

setRefClass("A", fields=list(x="numeric"))

2) Listing available classes

Via argument where, function getClasses lets me choose the environment in which to look for classes.

The following seems to look everywhere except .GlobalEnv and thus doesn't find my class; that's fine:

classes <- getClasses()
> head(classes)
[1] "("            ".environment" ".externalptr" ".name"        ".NULL"       
[6] ".Other"   
> "A" %in% classes
[1] FALSE

Now I look in .GlobalEnv and find class A only; that's fine too:

classes <- getClasses(where=.GlobalEnv)
> classes
[1] "A"
> "A" %in% classes
[1] TRUE

3) Creating a custom standard lookup function

When I put the lookup via getClasses into a standard function (this is just the first part of a desired functionality and I'd like to compute getClasses() inside that method rather than passing it's return value as an formal argument), everything still works fine

foo1 <- function(where=.GlobalEnv) {
    if (is.null(where)) {
        x <- getClasses()
    } else {
        x <- getClasses(where=where)    
    }
    return(x)
}
> foo1()
[1] "A"
> classes <- foo1(where=NULL)
> head(classes)
[1] "("            ".environment" ".externalptr" ".name"        ".NULL"       
[6] ".Other"    
> "A" %in% classes
[1] FALSE

4) Creating a formal S4 method

However, once I put everything into a formal S4 method, there seems to be some changes with respect to the standard environment that getClasses() uses to look for classes

setGeneric(
    name="foo2",
    signature="x",
    def=function(x, ...) standardGeneric("foo2")       
)
setMethod(
    f="foo2", 
    signature=signature(x="missing"), 
    definition=function(
        x,
        where=.GlobalEnv
    ) {       
    if (is.null(where)) {
        x <- getClasses()
    } else {
        x <- getClasses(where=where)    
    }
    return(x)        
    }
)
[1] "foo2"
> foo2()
[1] "A"
> classes <- foo2(where=NULL)
> head(classes)
[1] "A"            "("            ".environment" ".externalptr" ".name"       
[6] ".NULL"  
> "A" %in% classes
[1] TRUE

Before, "A" %in% foo1(where=NULL) was FALSE (desired) whereas "A" %in% foo2(where=NULL) is TRUE now (not desired).

Any ideas how foo2() would behave the exact same way as foo1()?


EDIT 2012-08-29

As Josh O'Brien pointed out in his comment below, the variation is probably caused by lazy evaluation.

Debugging foo1()

debug(getClasses)
foo1(where=NULL)

You enter the debugging tracer; hit <RETURN> 4 times followed by typing get("where"):

Browse[2]> get("where")
<environment: namespace:base>

In the console, hit <RETURN> 1 time followed by typing evList:

Browse[2]> evList
[[1]]
<environment: namespace:base>

Type Q to quit the current debugging run

Now run everything again, but with slightly different debugging calls

foo1(where=NULL)

In the console, hit <RETURN> 5 times followed by typing evList:

Browse[2]> evList
[[1]]
<environment: namespace:methods>

Now type get("where"):

Browse[2]> get("where")
<environment: namespace:methods>

Now where points to namespace:methods

Debugging `foo2()'

foo2(where=NULL)

You enter the debugging tracer; hit <RETURN> 4 times followed by typing get("where"):

Browse[2]> get("where")
<environment: namespace:base>

Then hit <RETURN> 1 time followed by typing evList:

Browse[2]> evList
[[1]]
<environment: namespace:base>

Type Q to quit the current debugging run

Now run everything again, but with slightly different debugging calls

foo2(where=NULL)

Hit <RETURN> 5 times followed by typing evList:

Browse[2]> evList
[[1]]
<environment: 0x02a68db8>

[[2]]
<environment: R_GlobalEnv>

# [OMITTED]

[[8]]
<environment: package:methods>
attr(,"name")
[1] "package:methods"
attr(,"path")
[1] "R:/Apps/LSQMApps/apps/R/R-2.14.1/library/methods"

[[9]]
<environment: 0x01e8501c>
attr(,"name")
[1] "Autoloads"

[[10]]
<environment: namespace:base>

Now type get("where"):

Browse[2]> get("where")
<environment: 0x02a68db8>

and note the different values of evList and where compared to the debugging run before. Type Q to quit the current debugging run.

This seems somewhat strange to me, but probably makes sense from the language designers' perspective. I'd probably be fine once I know how to explicitly set where to point to the environment associated with the namespace:methods.

JasonMArcher
  • 14,195
  • 22
  • 56
  • 52
Rappster
  • 12,762
  • 7
  • 71
  • 120
  • 1
    +1 Very interesting. I'm guessing it's got to do with lazy evaluation of `getClasses()`' `where = .externalCallerEnv()` argument, as the following seems to indicate. Try `debug(getClasses);foo2(where=NULL)` followed by `; ; ; get("where"); ; evList; get("where")`, vs. `foo2(where=NULL)` followed by `; ; ; ; evList; get("where")`. – Josh O'Brien Aug 28 '12 at 22:28
  • @JoshO'Brien: cool, thanks a lot! I definitely have to get more used to using the available debugging functionality ;-) I included your comments as an edit in the original post. This behavior seems somewhat strange to me, but probably makes sense from the language designer's perspective. I think I found a solution (see answer below) but this "variable default behavior" due to lazy evaluation still bothers me a little - but that might just be me. Do you think it's worth letting the guys at r-devel know, or does this all make perfect sense? – Rappster Aug 29 '12 at 09:13
  • I wouldn't report this, as it looks like `getClasses()` is performing just as it was intended to (see `where=` in the `?getClasses`). My guess is that `getClasses()` simply wasn't designed for use inside of an S3 function: it's `foo1`'s behavior that's wonky, and you shouldn't expect `foo2`'s behavior to follow it. `foo1(where=NULL)` and `foo2(where=NULL)` give different results because `environment(foo1)` and `environment(foo2)` are different (even though `ls()` shows they're both located in the global environment). So lazy evaluation, though it occurs, is a red herring here... – Josh O'Brien Aug 29 '12 at 12:06

1 Answers1

2

A well known lesson learned yet again: it's always good to be explicit ;-)

Thanks to Josh O'Brien and this post by Etiennebr I guess I'm able to put the pieces together.

Due to lazy evaluation and lexical scoping, I think the only way to really make sure getClasses() behaves the same way as if called from either .GlobalEnv or from inside a regular function is to explicitly set the value of where to the environment associated to namespace:methods when calling getClasses() inside a formal S4 method.

To get the environment associated to the namespace, this seems to work:

env <- loadNamespace("methods")
> is.environment(env)
[1] TRUE

Alternatively or even better:

env <- asNamespace("methods")
> is.environment(env)
[1] TRUE

This environment is exactly the one we need to point where to

setGeneric(
    name="foo2",
    signature="x",
    def=function(x, ...) standardGeneric("foo2")       
)
setMethod(
    f="foo2", 
    signature=signature(x="missing"), 
    definition=function(
        x,
        where=.GlobalEnv
    ) {       
    if (is.null(where)) {
        x <- getClasses(where=asNamespace("methods"))
    } else {
        x <- getClasses(where=where)
    }
    return(x)       
    }
)

> foo2()
[1] "A"
> classes <- foo2(where=NULL)
> head(classes)
[1] "("            ".environment" ".externalptr" ".name"        ".NULL"       
[6] ".Other"      
> "A" %in% classes
[1] FALSE
Community
  • 1
  • 1
Rappster
  • 12,762
  • 7
  • 71
  • 120
  • 2
    A minor note: I'd prob. use `asNamespace("methods")` instead of `loadNamespace("methods")` since the namespace is already loaded and you are just wanting a pointer to it. – Josh O'Brien Aug 29 '12 at 12:10
  • @JoshO'Brien: great, thanks. That seemed superfluous to me as well, but I didn't know how to do it differently. – Rappster Aug 29 '12 at 12:32