
I'm having a problem getting data.table to work in roxygen2 exported functions.

Here's a simple, fake function in a file called foo.R (located in the R directory of my package) which uses data.table:

#' Data.table test function
#' @export
foo <- function() {
  m <- data.table(c1 = c(1,2,3))
  print(is.data.table(m))
  m[,sum(c1)]
}

If I copy and paste this function into R, this function works fine:

> foo <- function() {
+   m <- data.table(c1 = c(1,2,3))
+   print(is.data.table(m))
+   m[,sum(c1)]
+ }
> foo()
[1] TRUE
[1] 6

But if I simply load the exported function, R thinks that the data.table is a data.frame and breaks:

> rm(foo)
> load_all()
Loading test_package
> foo
function() {
  m <- data.table(c1 = c(1,2,3))
  print(is.data.table(m))
  m[,sum(c1)]
}
<environment: namespace:test_package>
> foo()
[1] TRUE
Error in `[.data.frame`(x, i, j) : object 'c1' not found

What's up?

UPDATE

Thanks to @GSee for the help. Looks like this is actually a devtools issue. Check out the interactive command line code below.

After loading the test_package library, foo runs correctly:

> foo
function ()
{
    m <- data.table(c1 = c(1, 2, 3))
    print(is.data.table(m))
    m[, sum(c1)]
}
<environment: namespace:test_package>
> foo()
[1] TRUE
[1] 6

Running load_all() breaks foo:

> load_all()
Loading test_package
> foo()
[1] TRUE
Error in `[.data.frame`(x, i, j) : object 'c1' not found

Somehow source('R/foo.R') revives foo functionality:

> source('R/foo.R')
> foo
function() {
  m <- data.table(c1 = c(1,2,3))
  print(is.data.table(m))
  m[,sum(c1)]
}
> foo()
[1] TRUE
[1] 6

And future calls to load_all() don't break foo again:

> load_all()
Loading test_package
> foo
function() {
  m <- data.table(c1 = c(1,2,3))
  print(is.data.table(m))
  m[,sum(c1)]
}
> foo()
[1] TRUE
[1] 6

Also, I updated to devtools 1.5 and tried adding .datatable.aware=TRUE, but that didn't seem to do anything.

kjmij
  • Are you sure the `foo` in your package is exactly the same (i.e. you didn't leave out the comma between `x[` and `col3`)? – BrodieG Apr 23 '14 at 18:41
  • They're the same. I literally copy-and-pasted the function from the file foo.R (shown above) into the R command line. – kjmij Apr 23 '14 at 19:01
  • Check `yourpackagename::foo` on the command line to make sure that they really are the same. Just because the file you see on your editor looks like X doesn't mean the function in the package that is loaded in your R session is the same. – BrodieG Apr 23 '14 at 19:25
  • Yep, they look the same. I edited the question to include this check. – kjmij Apr 23 '14 at 20:00
  • Is data.table in Depends or Imports in your package? – GSee Apr 23 '14 at 23:17
  • I updated the sample code to more simply illustrate the problem. – kjmij Apr 23 '14 at 23:17
  • note that in the first code block above you have `m[,sum(c2)]` and the rest are `m[,sum(c1)]`... probably copy/paste error; not saying it's related to the problem – GSee Apr 23 '14 at 23:21
  • Do you have this problem if you build and load the package, or only if you're using `load_all()`? – GSee Apr 23 '14 at 23:24
  • Great idea. I built the package, installed it, ran library(test_package), and now it works. ...but I don't understand why. :) – kjmij Apr 23 '14 at 23:40
  • Try putting `.datatable.aware=TRUE` on a line in one of your R files and see if `load_all()` works. I think [this issue](https://github.com/hadley/devtools/issues/192) might be related -- updating `devtools` may solve the problem. – GSee Apr 23 '14 at 23:52
  • Thanks for finding that link! `.datatable.aware=TRUE` and updating `devtools` didn't help, but sourcing the file directly seemed to negate `load_all()` breaking `foo`. – kjmij Apr 24 '14 at 03:37
  • It might be an interaction between devtools and data.table - data.table does some unusual stuff to change the behaviour of `[`. – hadley Apr 24 '14 at 12:25
  • @GSee, kjmij, hadley, just tested on `devtools` 1.5. Using `.datatable.aware=TRUE` works just fine! This brings us back to the same issue GSee linked. – Arun Apr 24 '14 at 14:12
  • @hadley, `get(".Depends", "package:")` does not return `data.table` when loaded with `devtools:::load_all()` and this is one of the things `data.table:::cedta()` checks under the hood to see if a package is *data.table aware*. Why doesn't `devtools` return "data.table", even if the package depends on it? – Arun Apr 24 '14 at 14:29
  • When you install the package and do: `require(test); ls("package:test", all=TRUE)`, you get ".Depends" and "foo", whereas `.Depends` is not present when the package is loaded with `load_all()` from `devtools`. – Arun Apr 24 '14 at 14:36
  • Ah, I had put .datatable.aware within the `foo` function. But this works when I place it outside of the function. – kjmij Apr 24 '14 at 15:32
  • Thanks so much for the help all! This is my first SO post, so I'm not familiar with the common courtesies here. Do I "Answer Your Question" myself? Or do I leave that for @GSee since the .datatable.aware=TRUE suggestion was his/hers? – kjmij Apr 24 '14 at 15:38
  • @Arun because we don't create that data structure (probably because we didn't know about it). We'd love a devtools pull request that added this functionality – hadley Apr 24 '14 at 16:33
  • Is this the same issue as http://stackoverflow.com/q/24501245/513006? The only reason I wonder is that `.datatable.aware` worked in that case, but not here. That question uses `devtools::create` to generate a reproducible example. – Abe Jul 02 '14 at 14:40
  • Can't replicate this issue anymore: https://github.com/hadley/devtools/issues/192#issuecomment-105451177 – krlmlr May 26 '15 at 08:53
  • @krlmlr I am using devtools_2.2.1 and I still had to add `.datatable.aware=TRUE` to get my package with data.table to work. Adding `Depends: data.table` to the DESCRIPTION file was not enough. – Roger J Bos CFA Nov 12 '19 at 20:38

1 Answer


As @GSee pointed out in the comments, this still seems to be the devtools issue linked there (https://github.com/hadley/devtools/issues/192).

In order to find out if a package is data.table aware, data.table calls the function cedta(), which is:

> data.table:::cedta
function (n = 2L) 
{
    te = topenv(parent.frame(n))
    if (!isNamespace(te)) 
        return(TRUE)
    nsname = getNamespaceName(te)
    ans = nsname == "data.table" || "data.table" %chin% names(getNamespaceImports(te)) || 
        "data.table" %chin% tryCatch(get(".Depends", paste("package", 
            nsname, sep = ":"), inherits = FALSE), error = function(e) NULL) || 
        (nsname == "utils" && exists("debugger.look", parent.frame(n + 
            1L))) || nsname %chin% cedta.override || identical(TRUE, 
        tryCatch(get(".datatable.aware", asNamespace(nsname), 
            inherits = FALSE), error = function(e) NULL))
    if (!ans && getOption("datatable.verbose")) 
        cat("cedta decided '", nsname, "' wasn't data.table aware\n", 
            sep = "")
    ans
}
<bytecode: 0x7ff67b9ca190>
<environment: namespace:data.table>
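
The first test in cedta() also suggests why pasting foo at the prompt or running source('R/foo.R') made it work: a function defined in the global environment has a top-level environment that is not a namespace, so cedta() returns TRUE before any of the other checks run. Here is a minimal base-R sketch of that one test. The helper name `defined_in_namespace` is hypothetical and not part of data.table, and it inspects the function's enclosing environment rather than the calling frame that cedta() actually uses, so treat it as an approximation:

```r
# Hypothetical helper approximating the first test in cedta():
# is the function's top-level environment a package namespace?
defined_in_namespace <- function(f) {
  isNamespace(topenv(environment(f)))
}

# A function created at the prompt (or via source()) lives in the global environment.
foo_global <- function() NULL

# When run at the top level of a session:
defined_in_namespace(foo_global)  # FALSE, so cedta() would return TRUE right away
defined_in_namespace(stats::sd)   # TRUE, so the remaining checks in cedta() decide
```

This would also be consistent with later calls to load_all() not breaking foo again: the copy created by source() sits in the global environment and is found before the package's version.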

The relevant check here is:

"data.table" %chin% get(".Depends", paste("package", nsname, sep=":"), inherits=FALSE)

When a package depends on data.table, the check above should return TRUE, provided the package was installed via R CMD INSTALL and then loaded in the usual way. That is because, when you load an installed package, R by default creates a ".Depends" variable in the attached package environment (the one cedta() looks up as "package:" followed by the namespace name). For example:

ls("package:test", all=TRUE)
# [1] ".Depends" "foo"     

However, when you do devtools:::load_all(), this variable doesn't seem to be set.

# new session + set path to package's dir
devtools:::load_all()
ls("package:test", all=TRUE)
# [1] "foo"

So, cedta() doesn't get to know that this package indeed depends on data.table. However, when you manually set .datatable.aware=TRUE, the line:

identical(TRUE, get(".datatable.aware", asNamespace(nsname), inherits = FALSE))

gets executed and returns TRUE, which works around the problem. The underlying gap remains, though: devtools still doesn't create the .Depends variable in the package environment.
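
A minimal sketch of the workaround, assuming it goes in a source file under the package's R directory. The file name R/zzz.R is hypothetical; any file in that directory should do. As noted in the comments, the assignment has to sit at the top level of the file, not inside foo(), because cedta() looks the variable up in the package namespace with inherits = FALSE:

```r
# R/zzz.R (hypothetical file name)
# Top-level assignment: when the package is loaded, this creates
# .datatable.aware in the package namespace, where cedta() looks for it.
.datatable.aware <- TRUE
```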

All in all, this is really not an issue with data.table.

Arun