In my code, I needed to check which package a function comes from (in my case it was exprs(): I needed the one from Biobase, but it turned out to be masked by the one from rlang).
From this SO question, I thought I could simply use environmentName(environment(functionname)). But for exprs from Biobase, that expression returned an empty string:

environmentName(environment(exprs))
# [1] ""

After checking the structure of environment(exprs), I noticed that it has a .Generic member which contains the package name as an attribute:

environment(exprs)$.Generic
# [1] "exprs"
# attr(,"package")
# [1] "Biobase"

So, for now I made this helper function:

pkgparent <- function(functionObj) {
  functionEnv <- environment(functionObj)
  envName <- environmentName(functionEnv)
  if (envName != "")
    return(envName)
  else
    return(attr(functionEnv$.Generic, 'package'))
}

It does the job and correctly returns the package name for a function once its package is loaded, for example:

pkgparent(exprs)
# Error in environment(functionObj) : object 'exprs' not found

library(Biobase)

pkgparent(exprs)
# [1] "Biobase"

library(rlang)
# The following object is masked from ‘package:Biobase’:
#   exprs

pkgparent(exprs)
# [1] "rlang"

But I would still like to learn why functions from some packages are defined in an "unnamed" environment, while others show <environment: namespace:packagename>.

Vasily A
  • A note on functional programming: `if (…) return(a) else return(b)` is fundamentally imperative style, and breaks with the mental model that’s natural to R. The natural way to express this in R is simply as `if … a else b` or, if you insist on explicit `return` (which is a bad idea in my opinion), `return(if (…) a else b)`. – Konrad Rudolph Feb 20 '20 at 08:31
  • thanks for the note Konrad, I will keep it in mind. Somehow I thought exactly the opposite - that this way is more clear and easier to read. Apparently I'm too far from R's natural "mental model" :) Btw, why explicit `return` is a bad idea? – Vasily A Feb 20 '20 at 08:41
  • Because it’s misleading: *every* function in R returns a value (in fact, it’s probably better to think of it in terms of function calls *having a value*), regardless of whether `return` was used or not. In particular, using `return` doesn’t guarantee that no other code path in a function returns a value. More details: https://stackoverflow.com/a/59090751/1968 – Konrad Rudolph Feb 20 '20 at 08:46

1 Answer

What you’re seeing here is part of how S4 method dispatch works. In fact, .Generic is part of the R method dispatch mechanism.

The rlang package is a red herring, by the way: the issue presents itself purely due to Biobase’s use of S4.
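
To see that the unnamed environment comes from S4 itself rather than from Biobase or rlang, here is a minimal sketch that needs only the methods package. The generic `area` is a hypothetical name made up for this illustration; it is not part of any package.

```r
library(methods)

# `area` is a hypothetical generic, defined only for this illustration.
setGeneric("area", function (shape) standardGeneric("area"))

# An S4 generic is wrapped in its own environment used for dispatch.
genericEnv = environment(area)

environmentName(genericEnv)
# [1] ""

# That environment is not a namespace, but it does hold `.Generic`.
isNamespace(genericEnv)
# [1] FALSE
exists(".Generic", envir = genericEnv, inherits = FALSE)
# [1] TRUE

# The "package" attribute records where the generic was defined.
attr(genericEnv$.Generic, "package")
```

The last call is left without an expected value on purpose: what it reports depends on where the generic is defined.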

But more generally, your resolution strategy might fail in other situations, because there are other (albeit rare) reasons why packages might define functions inside a separate environment. The reason for this is generally to define a closure over some variable.

For example, it’s generally impossible to modify variables defined inside a package at the namespace level, because the namespace gets locked when loaded. There are multiple ways to work around this. A simple way, if a package needs a stateful function, is to define this function inside an environment. For example, you could define a counter function that increases its count on each invocation as follows:

counter = local({
    current = 0L

    function () {
        current <<- current + 1L
        current
    }
})

local defines an environment in which the function is wrapped.
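
As a quick check of this behaviour, the sketch below repeats the counter definition so that it runs on its own, then calls it twice and looks at the name of its enclosing environment. Since that environment was created by local, it has no name, which is exactly the case where environmentName(environment(fun)) returns an empty string.

```r
# Same definition as above, repeated so the block is self-contained.
counter = local({
    current = 0L

    function () {
        current <<- current + 1L
        current
    }
})

# The state lives in the environment created by local(), not in the caller.
counter()
# [1] 1
counter()
# [1] 2

# The enclosing environment is anonymous, so it has no name to report.
environmentName(environment(counter))
# [1] ""
```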

To cope with this kind of situation, what you should do instead is to iterate over parent environments until you find a namespace environment. But there’s a simpler solution, because R already provides a function to find a namespace environment for a given environment (by performing said iteration):

pkgparent = function (fun) {
    nsenv = topenv(environment(fun))
    environmentName(nsenv)
}
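
For comparison, the manual iteration mentioned above could look roughly like the following sketch. `find_namespace` and `wrapped` are hypothetical names used only here, and the loop assumes `fun` is a closure (primitives such as `sum` have no environment). To imitate a package function defined inside local without building a package, `wrapped` is created in an unnamed child of the stats namespace.

```r
# Hypothetical helper: walk up the enclosing environments by hand.
find_namespace = function (fun) {
    env = environment(fun)
    # Stop at the first namespace, or give up at the global/empty environment.
    while (! isNamespace(env) &&
           ! identical(env, globalenv()) &&
           ! identical(env, emptyenv())) {
        env = parent.env(env)
    }
    environmentName(env)
}

# The topenv()-based version from above, repeated so this block runs alone.
pkgparent = function (fun) {
    environmentName(topenv(environment(fun)))
}

# A closure enclosed by an unnamed environment whose parent is the stats
# namespace, imitating a package function defined inside local().
wrapped = local(function () NULL, envir = new.env(parent = asNamespace("stats")))

environmentName(environment(wrapped))
# [1] ""
find_namespace(wrapped)
# [1] "stats"
pkgparent(wrapped)
# [1] "stats"
pkgparent(stats::sd)
# [1] "stats"
```

Both approaches agree here, but topenv is shorter and is the version to prefer.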
Konrad Rudolph
  • Thank you Konrad! Variant with `topenv()` is obviously much better. It’s still not totally clear for me what exactly happens differently between those two cases (I mean, package that creates named environment accessible by `environmentName(environment(fun))` versus package creating unnamed one where I need to use `topenv()`) - I guess I will need to read that R Language Definition, oh well… But for a short answer – would it be correct to say that `environmentName(environment(fun))` will not work for any package that uses S4 implementation? – Vasily A Feb 20 '20 at 16:01
  • @VasilyA Whether it works depends on the *function*, not on the package. It won’t work on S4 functions, nor on functions defined in any kind of local environment, as shown in the example in my answer. It will work on all functions defined normally inside a package (i.e. by assigning a function directly to a name, rather than using e.g. `setMethod` for S4 methods). This includes non-S4 functions from the Biobase package, such as `note`. – Konrad Rudolph Feb 20 '20 at 16:17