
The package data.table has some special syntax that requires one to use expressions as the i and j arguments.

This has some implications for how one writes functions that accept and pass arguments to data tables, as is explained really well in section 1.16 of the FAQs.

But I can't figure out how to take this one additional level.

Here is an example. Say I want to write a wrapper function foo() that makes a specific summary of my data, and then a second wrapper plotfoo() that calls foo() and plots the result:

library(data.table)


foo <- function(data, by){
  by <- substitute(by)
  data[, .N, by=list(eval(by))]
}

DT <- data.table(mtcars)
foo(DT, gear)

OK, this works, because I get my tabulated results:

   by  N
1:  4 12
2:  3 15
3:  5  5

Now, I try to do just the same when writing plotfoo(), but I fail miserably:

plotfoo <- function(data, by){
  by <- substitute(by)
  foo(data, eval(by))
}
plotfoo(DT, gear)

But this time I get an error message:

Error: evaluation nested too deeply: infinite recursion / options(expressions=)?

OK, so the eval() is causing a problem. Let's remove it:

plotfoo <- function(data, by){
  by <- substitute(by)
  foo(data, by)
}
plotfoo(DT, gear)

Oh no, I get a new error message:

Error in `[.data.table`(data, , .N, by = list(eval(by))) : 
  column or expression 1 of 'by' or 'keyby' is type symbol. Do not quote column names. Useage: DT[,sum(colC),by=list(colA,month(colB))]

And here is where I remain stuck.

Question: How to write a function that calls a function that calls data.table?

Andrie
  • Not a solution, but if you remove `substitute(by)` and `eval` and pass `gear` as a character variable like `foo(DT, "gear")` then both work. – Arun Feb 12 '13 at 17:42

2 Answers


This will work:

plotfoo <- function(data, by) {
  by <- substitute(by)
  do.call(foo, list(quote(data), by))
}

plotfoo(DT, gear)
#    by  N
# 1:  4 12
# 2:  3 15
# 3:  5  5

Explanation:

The problem is that your call to foo() in plotfoo() looks like one of the following:

foo(data, eval(by))
foo(data, by)

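When foo() processes those calls, it dutifully substitutes for its second formal argument (by), getting as by's value the call eval(by) or the symbol by, respectively. In the first case, evaluating eval(by) inside foo() looks up by, finds the call eval(by) again, and so on, which is the infinite recursion. In the second case, data.table is handed the bare symbol by, which is the "type symbol" error. But you want by's value to be the symbol gear, as in the call foo(data, gear).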
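To see what the inner function actually captures in each case, here is a minimal base R sketch. The helpers inner(), outer_eval() and outer_plain() are hypothetical names used only for illustration; they stand in for foo() and the two failing versions of plotfoo(), and simply return what substitute() sees.

```r
# Hypothetical helpers (not from the original post): inner() plays the role
# of foo() and just returns the unevaluated expression it received for `by`.
inner <- function(by) substitute(by)

# Mirrors the first failing plotfoo(): passes eval(by) on to the inner function.
outer_eval <- function(by) {
  by <- substitute(by)
  inner(eval(by))
}

# Mirrors the second failing plotfoo(): passes by on unchanged.
outer_plain <- function(by) {
  by <- substitute(by)
  inner(by)
}

inner(gear)        # the symbol gear, which is what foo() needs
outer_eval(gear)   # the call eval(by), not gear
outer_plain(gear)  # the symbol by, not gear
```

In both wrapped versions the inner function never sees gear, only the expression that was typed in the wrapper's body.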

do.call() solves this problem by evaluating the elements of its second argument before constructing the call that it then evaluates. As a result, when you pass it by, it evaluates it to its value (the symbol gear) before constructing a call that looks (essentially) like this:

foo(data, gear)
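
The same effect can be checked without data.table. In this sketch, show_by() and wrapper() are hypothetical stand-ins for foo() and plotfoo(); show_by() only reports the expression it received, so the result shows what do.call() placed in the constructed call.

```r
# Hypothetical stand-in for foo(): returns the unevaluated `by` expression.
show_by <- function(data, by) substitute(by)

# Hypothetical stand-in for plotfoo(), using the do.call() pattern above.
wrapper <- function(data, by) {
  by <- substitute(by)  # capture the symbol the caller typed, e.g. gear
  # quote(data) puts the symbol `data` into the call rather than the whole
  # object; `by` is evaluated here, so its value (the symbol gear) goes in.
  do.call(show_by, list(quote(data), by))
}

captured <- wrapper(mtcars, gear)
captured  # the symbol gear, as if show_by(data, gear) had been typed
```

Because the constructed call is evaluated in wrapper()'s frame, the symbol data still resolves to the wrapper's own data argument.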
Josh O'Brien
  • +1 This works very well, thank you, even for my implied but unstated requirement for passing on the `i` and `j` arguments. – Andrie Feb 12 '13 at 20:07

I think you might be tying yourself up in knots. This works:

library(data.table)
foo <- function(data, by){
  by <- by
  data[, .N, by=by]
}

DT <- data.table(mtcars)
foo(DT, 'gear')

plotfoo <- function(data, by){
  foo(data, by)
}
plotfoo(DT, 'gear')

And that method supports passing the column name in a character variable:

> gg <- 'gear'
> plotfoo <- function(data, by){
+   foo(data, by)
+ }
> plotfoo(DT, gg)
   gear  N
1:    4 12
2:    3 15
3:    5  5
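
As a small variation on the above (a sketch, assuming data.table is installed), the character approach should also extend to grouping by more than one column, since by accepts a character vector of column names. The choice of gear and cyl here is only for illustration.

```r
library(data.table)

# Same shape as the functions above, minus the leftover `by <- by` line.
foo <- function(data, by) {
  data[, .N, by = by]   # `by` holds a character vector of column names
}

plotfoo <- function(data, by) {
  foo(data, by)         # plain pass-through, no substitute()/eval() needed
}

DT <- data.table(mtcars)
res <- plotfoo(DT, c("gear", "cyl"))  # one row per gear/cyl combination
res
```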
IRTFM
  • Sorry to bother you but I'm wondering what's the meaning of `by <- by` inside foo. – vodka Feb 12 '13 at 17:48
  • Ah, yes, you're quite right. I apologise, in my attempt to simplify my example, I removed the original problem of passing an argument to the `i` or `j`. I'm sorry - I'll edit my question. – Andrie Feb 12 '13 at 17:50
  • @vodka: No particular meaning. Just left over from editing the original. – IRTFM Feb 12 '13 at 18:57
  • @Andrie: Did either Josh's or my answer get at the issues you were having with i, and j evaluation? – IRTFM Feb 12 '13 at 20:30
  • Yes, the solution by Josh works very well indeed, although I don't claim to understand it! – Andrie Feb 12 '13 at 21:28
  • Heh, I will admit that one of my favorite functions was written with `do.call` after considerable mental pain trying other things. In some ways `do.call` is like a `paste0` in the "language sub-universe of R". At least that's the way I think of it. – IRTFM Feb 12 '13 at 21:31