
I wrote a wrapper around ftable because I need to compute flat tables with frequencies and percentages for many variables. As the ftable method for class "formula" uses non-standard evaluation, the wrapper relies on do.call and match.call to allow the use of the subset argument of ftable (more details in my previous question).

mytable <- function(...) {
    do.call(what = ftable,
            args = as.list(x = match.call()[-1]))
    # etc
}

However, I cannot use this wrapper inside lapply or with:

# example 1: error with "lapply"
lapply(X = warpbreaks[c("breaks",
                        "wool",
                        "tension")],
       FUN = mytable,
       row.vars = 1)

Error in (function (x, ...)  : object 'X' not found

# example 2: error with "with"
with(data = warpbreaks[warpbreaks$tension == "L", ],
     expr = mytable(wool))

Error in (function (x, ...)  : object 'wool' not found

These errors seem to be due to match.call not being evaluated in the right environment.

As this question is closely linked to my previous one, here is a summary of my problems:

  • The wrapper with do.call and match.call cannot be used inside lapply or with.
  • The wrapper without do.call and match.call cannot use the subset argument of ftable.

And a summary of my questions:

  • How can I write a wrapper that both supports the subset argument of ftable and works inside lapply and with? I have ideas for avoiding lapply and with, but I want to understand and correct these errors to improve my knowledge of R.
  • Is the error with lapply related to the following note from ?lapply?

    For historical reasons, the calls created by lapply are unevaluated, and code has been written (e.g., bquote) that relies on this. This means that the recorded call is always of the form FUN(X[[i]], ...), with i replaced by the current (integer or double) index. This is not normally a problem, but it can be if FUN uses sys.call or match.call or if it is a primitive function that makes use of the call. This means that it is often safer to call primitive functions with a wrapper, so that e.g. lapply(ll, function(x) is.numeric(x)) is required to ensure that method dispatch for is.numeric occurs correctly.

Thomas

3 Answers


The problem with using match.call with lapply is that match.call returns the literal call that was passed to the function, with its arguments left unevaluated. To see what's going on, let's write a simpler function that shows exactly how your function sees the arguments passed to it:

match_call_fun <- function(...) {
    call = as.list(match.call()[-1])
    print(call)
}

When we call it directly, match.call correctly gets the arguments and puts them in a list that we can use with do.call:

match_call_fun(iris['Species'], 9)

[[1]]
iris["Species"]

[[2]]
[1] 9

But watch what happens when we use lapply (I've only included the output of the internal print statement):

lapply('Species', function(x) match_call_fun(iris[x], 9))

[[1]]
iris[x]

[[2]]
[1] 9

Since match.call gets the literal arguments passed to it, it receives iris[x], not the properly interpreted iris['Species'] that we want. When we pass those arguments to ftable with do.call, it looks for an object x starting from the current environment (the body of the wrapper), and returns an error when it can't find one. We need iris[x] to be interpreted in an environment where x is defined.
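To make the point concrete, here is a minimal sketch (not part of the original wrapper) showing that what match.call captures is an unevaluated call, and that it only evaluates successfully in an environment that defines the variable it mentions. The names capture_args, caller and species_col are illustrative assumptions.

```r
# Hypothetical helper: returns the unevaluated arguments, like the wrapper does.
capture_args <- function(...) as.list(match.call()[-1])

caller <- function() {
    species_col <- "Species"          # exists only inside caller()
    capture_args(iris[species_col])
}

expr <- caller()[[1]]                 # the unevaluated call iris[species_col]
class(expr)                           # "call"

# Evaluating where species_col is not defined fails ...
failed <- tryCatch(eval(expr, envir = baseenv()),
                   error = function(e) "error")

# ... but succeeds once the evaluation environment supplies it.
ok <- eval(expr, envir = list(species_col = "Species"))
names(ok)                             # "Species"
```

The envir argument of do.call plays the same role as the envir argument of eval here: it decides where such captured expressions are looked up.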

As you've seen, adding envir = parent.frame() fixes the problem. That argument tells do.call to evaluate iris[x] in the parent frame, which here is the anonymous function passed to lapply, where x has its proper meaning. To see this in action, let's make another simple function that uses do.call to print the output of ls from three different environment levels:

z <- function(...) {
    print(do.call(ls, list()))
    print(do.call(ls, list(), envir = parent.frame()))
    print(do.call(ls, list(), envir = parent.frame(2)))
}

When we call z() from the global environment, we see the empty environment inside the function, then the global environment twice:

z()

character(0)                                  # Interior function environment
[1] "match_call_fun" "y"              "z"     # GlobalEnv
[1] "match_call_fun" "y"              "z"     # GlobalEnv

But when we call it from within lapply, one level up is lapply's own frame, and the global environment is two levels up:

lapply(1, z)

character(0)                                  # Interior function environment
[1] "FUN" "i"   "X"                           # lapply
[1] "match_call_fun" "y"              "z"     # GlobalEnv

So, by adding envir = parent.frame(), do.call knows to evaluate iris[x] in the frame that called the wrapper (here, the anonymous function passed to lapply), where x is actually 'Species', and it evaluates correctly.

mytable_envir <- function(...) {
    tab <- do.call(what = ftable,
                   args = as.list(match.call()[-1]),
                   envir = parent.frame())
    prop <- prop.table(x = tab,
                       margin = 2) * 100
    bind <- cbind(as.matrix(x = tab),
                  as.matrix(x = prop))
    margin <- addmargins(A = bind,
                         margin = 1)
    round(x = margin,
          digits = 1)
}



# This works!
lapply(X = c("breaks","wool","tension"),
       FUN = function(x) mytable_envir(warpbreaks[x],row.vars = 1))

As for why adding envir = parent.frame() makes a difference when it appears to be the default anyway: I'm not 100% sure, but my guess is that a default argument is evaluated inside do.call itself, so parent.frame() returns the environment from which do.call was called, i.e. the body of the wrapper. When we supply the argument ourselves, parent.frame() is evaluated in the wrapper before do.call runs, so it returns the wrapper's caller, one level higher than the default.

Here's a test function that takes parent.frame() as a default value:

fun <- function(y=parent.frame()) {
    print(y)
    print(parent.frame())
    print(parent.frame(2))
    print(parent.frame(3))
}

Now look at what happens when we call it from within lapply both with and without passing in parent.frame() as an argument:

lapply(1, function(y) fun())
<environment: 0x12c5bc1b0>     # y (default): frame of the anonymous function
<environment: 0x12c5bc1b0>     # parent.frame(): the same frame
<environment: 0x12c5bc760>     # parent.frame(2): lapply's frame
<environment: R_GlobalEnv>     # parent.frame(3): global environment

lapply(1, function(y) fun(y = parent.frame()))
<environment: 0x104931358>     # y (supplied): lapply's frame
<environment: 0x104930da8>     # parent.frame(): frame of the anonymous function
<environment: 0x104931358>     # parent.frame(2): lapply's frame, same as y
<environment: R_GlobalEnv>     # parent.frame(3): global environment

In the first example, y has the same value as parent.frame() called inside fun: the frame of the anonymous function. In the second example, y has the same value as parent.frame(2): lapply's frame, one level further up. So, although the two calls look the same, they do different things. In the first, parent.frame() is evaluated inside fun, when it sees that no y argument was supplied. In the second, parent.frame() is evaluated in the anonymous function, before fun is called, and the result is passed in.
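The same distinction can be checked without lapply and without reading memory addresses. The sketch below uses two illustrative functions (report_env and caller are made-up names) and compares environments with identical():

```r
# Hypothetical function whose default mirrors do.call's envir = parent.frame().
report_env <- function(env = parent.frame()) env

caller <- function() {
    list(here     = environment(),                     # caller()'s own frame
         above    = parent.frame(),                    # whoever called caller()
         default  = report_env(),                      # default evaluated inside report_env
         supplied = report_env(env = parent.frame()))  # evaluated here, in caller()
}

res <- caller()
identical(res$default, res$here)     # TRUE: default resolves to caller()'s frame
identical(res$supplied, res$above)   # TRUE: supplied value is one level higher
```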

divibisan
  • Thank you very much for this detailed answer. It helps me to understand more in depth. However, about why `parent.frame()` is needed although it is the default argument, I don't see why the behavior of the default argument would be different as if the same argument were specified manually... – Thomas Apr 24 '19 at 21:33
  • Thank you very much again, for your edit this time ! I now understand why adding `envir = parent.frame()` makes a difference even if it is `do.call` default argument. You deserve more than +1 for the help you provided ! NB: `with(data = warpbreaks, expr = z())`, `with(warpbreaks, fun())` and `with(warpbreaks, fun(y = parent.frame()))` also demonstrate that the problem was the same with `with`. – Thomas Apr 24 '19 at 22:15
  • No problem, it was fun to figure out! I tend to avoid environments wherever possible in my own work since I don't fully understand them. So it's good for me to take the time to dig into how they actually work – divibisan Apr 24 '19 at 22:23
  • I finally realized that the wrapper fails with `mapply` although it works with `lapply`. Could you please have a look at my new question: https://stackoverflow.com/q/56817969/11148823 ? – Thomas Jun 29 '19 at 13:38
  • For the record, about why adding `envir = parent.frame()` makes a difference even if it is do.call default argument: "One of the most important things to know about the evaluation of arguments to a function is that supplied arguments and default arguments are treated differently. The supplied arguments to a function are evaluated in the evaluation frame of the calling function. The default arguments to a function are evaluated in the evaluation frame of the function." (from https://cran.r-project.org/doc/manuals/r-release/R-lang.html#Argument-evaluation) – Thomas Oct 06 '21 at 19:39

As you only want to pass all the arguments on to ftable, you do not need do.call().

mytable <- function(...) {
  tab <- ftable(...)
  prop <- prop.table(x = tab,
                     margin = 2) * 100
  bind <- cbind(as.matrix(x = tab),
                as.matrix(x = prop))
  margin <- addmargins(A = bind,
                       margin = 1)
  return(round(x = margin,
               digits = 1))
}

The following lapply creates a separate table for every variable; I don't know if that is what you want.

lapply(X = c("breaks",
             "wool",
             "tension"),
       FUN = function(x) mytable(warpbreaks[x],
                                 row.vars = 1))

If you want all 3 variables in 1 table:

warpbreaks$newVar <- LETTERS[3:4]

lapply(X = list(c("breaks", "wool", "tension"),
                c("newVar", "tension", "wool")),
       FUN = function(vars) mytable(warpbreaks[, vars],
                                    row.vars = 1))
Swolf
  • Thank you for your answer. However, as explained in my question, I need `do.call` to use the subset argument of `ftable` method for class "formula" because it uses non-standard evaluation (more details on [my previous question](https://stackoverflow.com/questions/55754330/write-a-wrapper-around-a-function-relying-on-non-standard-evaluation-in-r)). – Thomas Apr 24 '19 at 08:05

Thanks to this issue, the wrapper became:

# function 1
mytable <- function(...) {
    do.call(what = ftable,
            args = as.list(x = match.call()[-1]),
            envir = parent.frame())
    # etc
}

Or:

# function 2
mytable <- function(...) {
    mc <- match.call()
    mc[[1]] <- quote(expr = ftable)
    eval.parent(expr = mc)
    # etc
}

I can now use the subset argument of ftable, and use the wrapper in lapply:

lapply(X = warpbreaks[c("wool",
                        "tension")],
       FUN = function(x) mytable(formula = x ~ breaks,
                                 data = warpbreaks,
                                 subset = breaks < 15))
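As a rough check that the eval.parent version really evaluates the call in the caller's frame, the sketch below uses a subset expression that refers to a variable defined only inside the calling function. The names mytable_eval, check_local_subset and threshold are illustrative, and the "# etc" post-processing is left out.

```r
# "function 2" from above, under an illustrative name and without the "# etc" part.
mytable_eval <- function(...) {
    mc <- match.call()
    mc[[1]] <- quote(ftable)   # swap the wrapper's name for ftable
    eval.parent(mc)            # run the rewritten call in the caller's frame
}

check_local_subset <- function() {
    threshold <- 15            # visible only inside this function
    mytable_eval(wool ~ tension,
                 data = warpbreaks,
                 subset = breaks < threshold)
}

tab <- check_local_subset()
tab
```

If the call were evaluated inside the wrapper instead, threshold would not be found, which is the same failure mode as the lapply and with examples in the question.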

However I do not understand why I have to supply envir = parent.frame() to do.call as it is a default argument.

More importantly, these methods do not resolve another issue: I cannot use the subset argument of ftable with mapply.

Thomas
  • I've posted an answer that hopefully explains what's happening here in more depth, though you basically figured it out already yourself. For your "bonus questions", you should ask new questions for them – the StackOverflow format is made for 1 question per question – divibisan Apr 24 '19 at 15:37