
I recently learned about the elegant R package data.table, and I am curious how its J function is implemented. The function only works inside [.data.table; it doesn't exist in the global environment.

I downloaded the source code but cannot find a definition of J anywhere in it. I found lockBinding(".SD", ...), but nothing for J. Any idea how this function is implemented?

Many thanks.

Arun
yuez
    You might find http://adv-r.had.co.nz/dsl.html helpful for describing the general techniques by which this sort of thing is implemented. – hadley Feb 25 '14 at 16:59

1 Answer


J() used to be exported, but it has not been since v1.8.8. Here's the note from the 1.8.8 release:

o The J() alias is now removed outside DT[...], but will still work inside DT[...]; i.e., DT[J(...)] is fine. As warned in v1.8.2 (see below in this file) and deprecated with warning() in v1.8.4. This resolves the conflict with function J() in package XLConnect (#1747) and rJava (#2045). Please use data.table() directly instead of J(), outside DT[...].

Thanks to R's lazy evaluation, [.data.table can inspect the unevaluated i argument: a call to J(.) is detected and simply replaced with list(.) by the (invisible) non-exported function .massagei.

That is, when you do:

require(data.table)
DT = data.table(x=rep(1:5, each=2L), y=1:10, key="x")
DT[J(1L)]

i (= J(1L)) is checked for its type and this line gets executed:

i = eval(.massagei(isub), x, parent.frame())

where isub = substitute(i) and .massagei is simply:

.massagei = function(x) {
    if (is.call(x) && as.character(x[[1L]]) %chin% c("J","."))
        x[[1L]] = quote(list)
    x
}
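
To see the rewrite in isolation, here is a minimal sketch that runs without data.table. massage_i is a hypothetical stand-in for the internal .massagei shown above: it uses base R's %in% instead of data.table's %chin%, and adds an is.name() guard as an assumption of mine, so treat it as an illustration of the technique rather than the package's actual code.

```r
# Hypothetical stand-in for data.table's internal .massagei.
# Assumptions: base R %in% replaces %chin%, and is.name() guards against
# call heads that are not plain symbols (e.g. pkg::f).
massage_i <- function(x) {
  if (is.call(x) && is.name(x[[1L]]) &&
      as.character(x[[1L]]) %in% c("J", "."))
    x[[1L]] <- quote(list)   # swap the head of the call, keep the arguments
  x
}

e1 <- massage_i(quote(J(1L)))        # head J is replaced by list
e2 <- massage_i(quote(.(1L, "a")))   # the . alias is treated the same way
e3 <- massage_i(quote(x > 2))        # any other call is left untouched

print(e1)
print(e2)
print(e3)

# Evaluating the rewritten call never looks up a function named J.
res1 <- eval(e1)
str(res1)
```

Since only the head of the unevaluated call is swapped, no function called J ever needs to exist.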

In effect, data.table:::.massagei(quote(J(1L))) is executed and returns the call list(1L). Evaluating that call yields a list, which is then converted to a data.table. Because i is now a data.table, [.data.table knows that a join has to happen.
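
The same capture, rewrite, and evaluate pattern can be sketched end to end on a plain data.frame. Everything here is hypothetical: lookup is a toy function of my own, and matching on the first column only stands in for a real keyed join, which data.table implements very differently.

```r
# Hypothetical toy function; not part of data.table.
lookup <- function(df, i) {
  isub <- substitute(i)                    # capture i without evaluating it
  if (is.call(isub) && is.name(isub[[1L]]) &&
      as.character(isub[[1L]]) %in% c("J", "."))
    isub[[1L]] <- quote(list)              # J(...) becomes list(...)
  # Evaluate with df's columns visible, falling back to the caller's frame.
  keys <- eval(isub, df, parent.frame())
  if (is.list(keys)) {
    # A list signals "match these values"; here: against the first column.
    df[df[[1L]] %in% keys[[1L]], , drop = FALSE]
  } else {
    # Anything else is used as an ordinary row index.
    df[keys, , drop = FALSE]
  }
}

df <- data.frame(x = rep(1:5, each = 2L), y = 1:10)

res_join <- lookup(df, J(1L))    # rows whose x equals 1
res_filter <- lookup(df, y > 8)  # ordinary logical subsetting still works

print(res_join)
print(res_filter)
```

The type of the evaluated i is what decides between the two branches, which mirrors the type check described above.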

Arun