5

I have an expression expr that I want to evaluate; the symbol/value pairs I need to evaluate it may be in one (or more!) of three environments, and I'm not sure which. I'd like to find a convenient, efficient way to "chain" the environments. Is there a way to do this safely while avoiding the copying of contents of environments?

Here's the setup:

env1 <- list2env(list(a=1))
env2 <- list2env(list(b=1))
env3 <- list2env(list(c=1))
expr <- quote(a+b)

So, I will need to evaluate expr in the combination of environments env1 and env2 (but I don't necessarily know that). Any of eval(expr, env1); eval(expr, env2); and eval(expr,env3) will fail, because none of those environments contains all of the required symbols.

Let's suppose I'm willing to assume that the symbols are either in env1+env2 or in env1+env3. I could:

  1. Create combined environments for each of those pairs as described in this question.

problems:

  • if I use one of the solutions that involves creating new environments, and one of my environments has a lot of stuff in it, this could be expensive.
  • using parent.env<- could be a bad idea, as described in ?parent.env:

The replacement function parent.env<- is extremely dangerous as it can be used to destructively change environments in ways that violate assumptions made by the internal C code. It may be removed in the near future.

(although, according to the source history, that warning about removal "in the near future" is at least 19 years old ...)

(in fact I've already managed to induce some infinite loops playing with this approach)

  2. use
tryCatch(eval(expr, envir=as.list(env1), enclos=env2),
         error=function(e) {
             # first pair failed: fall back to env1 + env3
             eval(expr, envir=as.list(env1), enclos=env3)
         })

to create an "environment within an environment"; try out the combined pairs one at a time to see which one works. Note that enclos= only works when envir is a list or pairlist, which is why I have to use as.list().

problem: I think I still end up copying the contents of env1 into a new environment.

I could use an even more deeply nested set of tryCatch() clauses to try out the environments one at a time before I resort to copying them, which would help avoid copying where unnecessary (but seems clunky).
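A loop is one way to avoid the nesting. The following is only a minimal sketch: `eval_first` is a hypothetical helper name, and it assumes the expression never legitimately returns a condition object, since an "error" result is how a failed attempt is detected here.

```r
env1 <- list2env(list(a = 1))
env2 <- list2env(list(b = 1))
env3 <- list2env(list(c = 1))
expr <- quote(a + b)

# Hypothetical helper: try each candidate enclosure in turn and
# return the first result that evaluates without an error.
eval_first <- function(expr, base_env, candidates) {
  vals <- as.list(base_env)  # enclos= is only honoured for list-like envir
  for (enc in candidates) {
    res <- tryCatch(eval(expr, envir = vals, enclos = enc),
                    error = function(e) e)
    if (!inherits(res, "error")) return(res)
  }
  stop("expression could not be evaluated with any candidate enclosure")
}

# env3 is tried first and fails (no b), then env2 succeeds.
result <- eval_first(expr, env1, list(env3, env2))
```

This still converts `env1` to a list once per call, so it does not address the copying concern; it only flattens the control flow.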

Ben Bolker
  • Why can't you `attach` and `detach` before and after? – Allan Cameron Oct 08 '20 at 19:16
  • Perhaps because `attach` is ... just ... so ... baaaaaaaaad :-) – r2evans Oct 08 '20 at 19:17
  • Is the problem with copying that you don't want to copy *any* of the environments' variables, or just don't want to copy *all* of them? For instance, if we copy just the variables found in the expression, is that okay? – r2evans Oct 08 '20 at 19:49
  • it's better ... Gabor's argument suggests that env -> list isn't going to hurt me though (I don't expect to be modifying ...) – Ben Bolker Oct 08 '20 at 20:07

5 Answers

5

Convert the environments to lists, concatenate them and use that as the second arg of eval. Note that this does not modify the environments themselves.

L <- do.call("c", lapply(list(env1, env2, env3), as.list))
eval(expr, L)
## [1] 2

Also note that this does not copy the contents of a, b and c. They are still at the original addresses:

library(pryr)

with(env1, address(a))
## [1] "0x2029f810"

with(L, address(a))
## [1] "0x2029f810"
G. Grothendieck
  • And this doesn't deep copy? – Allan Cameron Oct 08 '20 at 19:19
  • Have added explanation. – G. Grothendieck Oct 08 '20 at 19:25
  • It does assume that the parents of the environments are irrelevant. That's true in the reprex, but might not be true for general environments, e.g. if `a` was in the parent of `env1`, `eval(quote(a), envir=env1)` would still work, but not after applying `as.list` to `env1`. – user2554330 Oct 08 '20 at 19:38
  • An environment isn't a list. Its parent is an inherent part of it. – user2554330 Oct 08 '20 at 22:19
  • @user2554330, Not only is this not part of the question but it is not even desirable. What if b is in the parent of env1? You might wind up with the wrong b. – G. Grothendieck Oct 08 '20 at 22:21
  • @G.Grothendieck: yes, that's a serious issue with the setup. My main recommendation is still "Don't do that.". If there are two variables named `b`, one in the parent of `env1`, one in `env2`, how do you know that the one in `env2` is the right one? It's ambiguous. A setup like this is really hard to describe to users in a way that they will understand. Using lists instead of environments would make everything much less ambiguous. – user2554330 Oct 08 '20 at 23:15
3

No, there's no simple way to chain environments. As you know, every environment has a parent which is another environment, so overall environments form a tree structure. (The root of the tree is the empty environment.) You can't easily take a leaf from a tree and graft it onto another leaf without making structural changes to it.
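To make the tree structure concrete, here is a small sketch that walks from an environment up to the root. `env_ancestry` is a hypothetical helper name introduced only for illustration.

```r
# Hypothetical helper: collect an environment and all of its ancestors,
# stopping once the empty environment (the root) has been reached.
env_ancestry <- function(env) {
  chain <- list(env)
  while (!identical(env, emptyenv())) {
    env <- parent.env(env)
    chain <- c(chain, list(env))
  }
  chain
}

env1 <- list2env(list(a = 1))
chain <- env_ancestry(env1)

# The walk always starts at env1 and always ends at the empty environment;
# what lies in between depends on where env1 was created.
identical(chain[[1]], env1)
identical(chain[[length(chain)]], emptyenv())
```

Every environment has exactly one such path to the root, which is why "chaining" two unrelated environments means changing the parent of one of them.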

So if you really need to evaluate your expression in the way you describe, you're going to have to parse it, look up the names yourself, and substitute values into it. But even this isn't necessarily going to give you the same value at the end, because substitute() and similar functions might be involved in it.

My advice would be to start over, and don't try to make an expression like the one you're talking about. This might involve copying, but remember that copying is usually cheap in R: the cost only comes if you modify one of the copies.

Edited to add:

The other four current answers implicitly assume that env1 to env3 are as simple as they are in your example. If that's true, then I'd go with @G.Grothendieck's solution. But they all fail in this simple variation on your example:

env1 <- list2env(list(a=1))
env2parent <- list2env(list(b=1))
env2 <- new.env(parent = env2parent)
env3 <- list2env(list(c=1))
expr <- quote(a+b)

I can evaluate quote(b) using eval(quote(b), envir = env2), but I can't evaluate expr using the other solutions unless I also include env2parent in the list of environments being passed.
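As an illustration, the list-concatenation approach can be run against this variation. This is a sketch: `enclos = baseenv()` is my addition, used so that a stray `b` in the calling environment cannot mask the failure.

```r
env1 <- list2env(list(a = 1))
env2parent <- list2env(list(b = 1))
env2 <- new.env(parent = env2parent)
env3 <- list2env(list(c = 1))
expr <- quote(a + b)

# b is visible from env2 through ordinary inheritance ...
b_via_env2 <- eval(quote(b), envir = env2)

# ... but as.list(env2) only sees env2's own (empty) frame, so b is lost.
L <- do.call("c", lapply(list(env1, env2, env3), as.list))
res <- tryCatch(eval(expr, L, enclos = baseenv()),
                error = function(e) e)
inherits(res, "error")  # TRUE: evaluation fails because b is not in L
```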

Edited again:

Here's a solution that essentially does what I suggested, except that instead of parsing, it uses the all.vars function from one of @r2evans's answers. It works by copying all the variables into a common environment, so copying happens, but the names are kept:

envfunc3 <- function(expr, ...) {
  vars <- all.vars(expr)
  env <- new.env()
  for (v in vars) {
    for (e in list(...))
      if (exists(v, envir = e)) {
        assign(v, get(v, envir = e), envir = env)
        break
      }
  }
  eval(expr, envir=env)
}
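For completeness, a usage sketch on the variation above. The definition of `envfunc3` is repeated only so that the snippet runs on its own, and it assumes there is no stray `a` or `b` defined in the calling environment.

```r
envfunc3 <- function(expr, ...) {
  vars <- all.vars(expr)
  env <- new.env()
  for (v in vars) {
    for (e in list(...))
      # exists() and get() follow parent environments by default,
      # which is what lets b be found through env2's parent.
      if (exists(v, envir = e)) {
        assign(v, get(v, envir = e), envir = env)
        break
      }
  }
  eval(expr, envir = env)
}

env1 <- list2env(list(a = 1))
env2parent <- list2env(list(b = 1))
env2 <- new.env(parent = env2parent)
env3 <- list2env(list(c = 1))
expr <- quote(a + b)

result <- envfunc3(expr, env1, env2, env3)
```

Because the lookup inherits, the first environment in the argument list whose ancestry contains a name wins, including names found in shared ancestors such as the global environment.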
user2554330
2

Another tactic: temporarily rebuild the chain of parent environments, use R's natural search order, and change them back.

I recognize the "in the near future" warning and your reluctance to use parent.env<-, but ... it still works really well. I believe most of the "risk" of using it (and therefore the discouraging comment in the docs) is that changing a parent and not changing it back introduces many avenues for things to break. I do see some fragility here, in that I'm assuming expr is relatively "simple"; if there are (for instance) active bindings that rely on C libraries, perhaps this could cause a problem.

For now, though ...

envfunc <- function(expr, ...) {
  envs <- list(...)
  if (length(envs) > 1) {
    parents <- lapply(envs, parent.env)
    on.exit({
      for (i in seq_along(envs)) parent.env(envs[[i]]) <- parents[[i]]
    }, add = TRUE)
    for (i in seq_along(envs)[-1]) parent.env(envs[[i]]) <- envs[[i-1]]
  }
  eval(expr, envir = envs[[ length(envs) ]])
}

str(list(env1,env2,env3))
# List of 3
#  $ :<environment: 0x0000000099932bc8> 
#  $ :<environment: 0x0000000099931d58> 
#  $ :<environment: 0x00000000445b97c0> 
str(lapply(list(env1,env2,env3), parent.env))
# List of 3
#  $ :<environment: 0x000000000787d7a8> 
#  $ :<environment: 0x000000000787d7a8> 
#  $ :<environment: 0x000000000787d7a8> 
str(lapply(list(env1, env2, env3), function(e) lapply(e, address)))
# List of 3
#  $ :List of 1
#   ..$ a: chr "00000000bb23c350"
#  $ :List of 1
#   ..$ b: chr "00000000bb23c1c8"
#  $ :List of 1
#   ..$ c: chr "00000000bb23c040"

envfunc(expr, env1, env2, env3)
# [1] 2

str(list(env1,env2,env3))
# List of 3
#  $ :<environment: 0x0000000099932bc8> 
#  $ :<environment: 0x0000000099931d58> 
#  $ :<environment: 0x00000000445b97c0> 
str(lapply(list(env1,env2,env3), parent.env))
# List of 3
#  $ :<environment: 0x000000000787d7a8> 
#  $ :<environment: 0x000000000787d7a8> 
#  $ :<environment: 0x000000000787d7a8> 
str(lapply(list(env1, env2, env3), function(e) lapply(e, address)))
# List of 3
#  $ :List of 1
#   ..$ a: chr "00000000bb23c350"
#  $ :List of 1
#   ..$ b: chr "00000000bb23c1c8"
#  $ :List of 1
#   ..$ c: chr "00000000bb23c040"

This effectively produces a linked list of environments, which means that the order in which the environments are supplied matters. In this example no name is duplicated, but when the same name appears in more than one environment, the environment given later in the argument list takes precedence:

envfunc(expr, env1, env2, env3)
# [1] 2
env1$b <- 99
envfunc(expr, env1, env2, env3)
# [1] 2
env3$b <- 99
envfunc(expr, env1, env2, env3)
# [1] 100
r2evans
  • Remember that environments are reference objects. Changing the parent of `envs[[i]]` in `envfunc` will change the parent of the original environment. If `env1` happens to be `globalenv()` or `environment(mean)` or something else that doesn't belong to you, that could cause very weird errors even if it is only for the duration of the `envfunc` call. – user2554330 Oct 08 '20 at 19:53
  • Yeah, that was the risk I was thinking. – r2evans Oct 08 '20 at 20:06
1

Another option, completely different from the others: active bindings? I might be stretching on this one ...

envfunc2 <- function(expr, ...) {
  vars <- all.vars(expr)
  env <- environment()
  for (e in list(...)) {
    vars_in_e <- intersect(vars, names(e))
    vars <- setdiff(vars, vars_in_e)
    for (v in vars_in_e) makeActiveBinding(v, local({ v=v; e=e; function() get(v, envir = e); }), env)
  }
  eval(expr)
}

envfunc2(expr, env1, env2, env3)
# [1] 2

The cost is that each value is fetched from its source environment just in time, every time the symbol is evaluated.
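A stripped-down sketch of the same mechanism, outside of `envfunc2`, shows what "just in time" means in practice. The names `src` and `view` are made up for this illustration.

```r
src <- list2env(list(a = 1))
view <- new.env()

# Bind `a` in `view` to a function that looks the value up in `src`
# each time the symbol is evaluated; nothing is copied up front.
makeActiveBinding("a", function() get("a", envir = src), view)

first <- eval(quote(a + 1), view)   # reads src$a, which is 1

src$a <- 10
second <- eval(quote(a + 1), view)  # sees the updated value in src
```

So the binding behaves like a live view of the source environment rather than a snapshot, at the price of one function call per lookup.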

r2evans
1

I think attach() is really what you want here. The reasons why it's the most hated R function are not relevant in your case: we can build a careful wrapper that detaches everything in on.exit(), and it should be safe:

eval_with_envs <- function(expr, ...) {
  dots <- substitute(...())
  on.exit(
    for (env in dots) {
      if(as.character(env) %in% search())
        eval.parent(bquote(detach(.(env))))
    }
  )
  for (env in dots) {
    eval.parent(bquote(attach(.(env))))
  }
  eval.parent(expr)
}

eval_with_envs(expr, env1, env2, env3)
#> [1] 2
moodymudskipper
  • this is a good point. Unfortunately it might still lead to arguments with CRAN (I seem to recall that using `attach()` triggers a NOTE, which needs to be justified ...) – Ben Bolker Oct 08 '20 at 20:16
  • This also fails in the counterexample I added to my negative answer. `attach()` basically does a shallow copy; it ignores the parents of the attached environments. – user2554330 Oct 08 '20 at 20:29
  • As I understood the question, we want to evaluate in these environments and are not concerned about inheritance; the rejected solution using `parent.env<-` seems to support this. Ben, isn't that the case? – moodymudskipper Oct 08 '20 at 20:32
  • no, I am still kinda worried about inheritance too. I may come back and edit my question to give some more context ... – Ben Bolker Oct 08 '20 at 20:34
  • The solution I just posted handles inheritance fine, but it does it by copying the variables into a new environment. I don't think there's anything simpler. – user2554330 Oct 08 '20 at 21:34