14

Using R's data.table package,

This works:

instruction = "a = data.table(name=1:3, value=1:3, blah=1:3); a[,c('value', 'blah'):=NULL]"
eval(parse(text=instruction))
#   name
#1:    1
#2:    2
#3:    3

This works:

myFunc = function(instruction) {
  eval(parse(text=instruction))
}
myFunc(instruction)
#   name
#1:    1
#2:    2
#3:    3

Now, put this function into a package, load it, and try to call it. This doesn't work:

myFuncInPackage(instruction)
#Error in `:=`(c("value", "blah"), NULL) : 
#  Check that is.data.table(DT) == TRUE. Otherwise, := and `:=`(...) are defined for use in j, once only and in particular ways. See help(":=").

Why?


EDIT: @Roland points out that adding data.table in the package Depends field makes it work. However, I don't think this is a great solution because the package doesn't really depend on, require, or use data.table. I just want to be able to use data.table with the package.

In addition, everything else with data.table works fine in the function, just not the := operator.

So I guess a followup question could be: should I add data.table to the Depends of every package I write, so that data.tables work as expected within functions of that package? This doesn't seem right... what is the correct way to approach this?

moodymudskipper
nsheff
  • 2
    Have you followed the advice in [FAQ 6.9](http://cran.r-project.org/web/packages/data.table/vignettes/datatable-faq.pdf)? Also, use of `eval(parse())` is discouraged. – Roland Jan 16 '15 at 09:40
  • @Roland Adding data.table to Depends solves it... but leads to an issue: my package doesn't actually depend on data.table; in fact, it's totally unrelated. As in this example, it just has one function, `myFunc` -- no data.table anything. But it can't be used with data.table without adding it to Depends... – nsheff Jan 16 '15 at 12:31
  • @Roland, I know, `eval(parse())` is discouraged, and this is a pointless example, but the question still stands...in some cases I can't get around it. – nsheff Jan 16 '15 at 12:33
  • 2
    Your package has `eval(parse(text=instruction))` where `instruction` can be anything! At the time of evaluation any function required by `instruction` must be available; this should be specified in the usage instructions for your package. You're seeing this when `instruction` requires a function in `data.table`; load 'data.table' before executing `myFuncInPackage(instruction)` and see if it works. – user20637 Jan 16 '15 at 13:24
  • The `:=` operator that you use in _your_ function is defined within `data.table` package, so yes, your package _does_ depend on `data.table` – Sergii Zaskaleta Jul 02 '15 at 11:33
  • 1
    @ Sergii Zaskaleta No... I didn't use `:=` in my function. That was passed by the user, in the "instruction" variable. it has _nothing_ to do with the package... – nsheff Jul 04 '15 at 16:43
  • 3
    @sheffien can you check if you did update your `NAMESPACE` file to `import(data.table)` and `DESCRIPTION` to `Imports: data.table`? I got the same problem recently just because missing entry in `NAMESPACE` file. – jangorecki Jul 18 '15 at 11:58
  • @Moody_Mudskipper Please see https://meta.stackoverflow.com/questions/376593/do-we-need-the-colon-equals-tag – Luuklag Nov 13 '18 at 13:20

2 Answers

8

I had the same problem and solved it by adding data.table to both Imports: and Depends:. My data.table version is 1.9.6.

Taz
  • Can you give an example? - - I have script which sources a function which uses `data.table`. I get the error here. I include `library(data.table)` in the script and/or in the function itself. - - Can you also give an example how you apply `Imports` and `Depends:` here to solve the problem. My data.table is 1.10.4. – Léo Léopold Hertz 준영 May 23 '17 at 19:04
  • 3
    It worked for me in R package context - not raw script. But answering to your question - you can apply it in `DESCRIPTION` file: `Imports: data.table (>= 1.9.6) Depends: data.table (>= 1.9.6) `, e.g.: https://pastebin.com/uy10Devh – Taz May 24 '17 at 08:29
  • Can you prevent packages to be loaded by such specifications? Etc `imports data.table` but prevent `reshape2` to be loaded as an own package. – Léo Léopold Hertz 준영 May 24 '17 at 08:36
  • 1
    You can load only specific functions from package by using `@import`, e.g.: `@importFrom jsonlite toJSON unbox`. Read more here: http://kbroman.org/pkg_primer/pages/depends.html – Taz May 24 '17 at 09:44
  • @Taz you can do this, but it doesn't solve the underlying problem; it only solves it for this package, not for other potential issues. See my new answer for a universal solution. – nsheff Aug 14 '17 at 21:23
  • 1
    In my case, it was enough to just mention `Imports: data.table` in the `DESCRIPTION` file. Even more, mentioning again in `Depends:` section would trigger a note from `devtools::check()` - `Package listed in more than one of Depends, Imports, Suggests, Enhances: data.table A package should be listed in only one of these fields.` – Valentin_Ștefan Feb 01 '18 at 23:08
  • I added `data.table` in depends and it works finally, removed from imports. This is extremely annoying, wasted 3 hours on this. Yikes. – Death Metal Jul 17 '20 at 20:51
7

I've finally figured out the answer to this question (after several years). All comments and answers suggested adding data.table to Depends or Imports, but this is incorrect: the package does not depend on data.table. The instruction could use any package, not just data.table, so taken to its logical conclusion, the suggestion would require adding every possible package to Depends. The dependency comes from the user who supplies the instruction, not from the function the package provides.

Instead, the problem is that the call to eval is made within the namespace of the package, and this does not include the functions provided by other packages. I ultimately solved this by specifying the global environment in the eval call:

myFunc = function(instruction) {
  eval(parse(text=instruction), envir=globalenv())
}

Why this works

This makes eval evaluate the instruction in the global environment, whose search path includes the requisite packages.
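As a minimal base-R sketch of what the `envir` argument changes (the helper names `eval_in_caller` and `eval_in_global` and the variables `demo_local` and `demo_global` are hypothetical, made up for illustration): by default, eval uses `parent.frame()`, which inside a function is that function's own frame, whereas `envir=globalenv()` evaluates at the top level. This only illustrates where the code runs; it does not reproduce the data.table error itself.

```r
# Hypothetical helpers, for illustration only.
eval_in_caller <- function(instruction) {
  # eval()'s default envir is parent.frame(): this function's own frame
  eval(parse(text = instruction))
  invisible(NULL)
}

eval_in_global <- function(instruction) {
  # Evaluate at the top level instead of inside the function
  eval(parse(text = instruction), envir = globalenv())
  invisible(NULL)
}

eval_in_caller("demo_local <- 1:3")   # assignment stays inside the function frame
eval_in_global("demo_global <- 1:3")  # assignment lands in the global environment

# Check which of the two variables now exists at the top level
exists("demo_local", envir = globalenv(), inherits = FALSE)
exists("demo_global", envir = globalenv(), inherits = FALSE)
```

Only the second assignment should be visible from the global environment, since the first was evaluated in the function's own frame.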

In the data.table case it's particularly hard to debug because of the complexity of the function overloading. In this case, the culprit is not actually the := function, but the [ function. The := error is a red herring. At the time of writing, the := function in data.table is defined like this:

https://github.com/Rdatatable/data.table/blob/348c0c7fdb4987aa6da99fc989431d8837877ce4/R/data.table.R#L2561

":=" <- function(...) stop('Check that is.data.table(DT) == TRUE. Otherwise, := and `:=`(...) are defined for use in j, once only and in particular ways. See help(":=").')

That's it. What that means: any call to := as a function is stopped with an error message, because this is not how the authors intend := to be used. Instead, := is really just a keyword that's interpreted by the [ function in data.table.
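A small sketch to check this for yourself, assuming data.table is installed (the helper name `direct_walrus_message` is hypothetical). It fetches `:=` from the data.table namespace and calls it directly, outside of any `[` call, then returns whatever error message comes back. The exact wording of the message may differ between data.table versions.

```r
# Hypothetical helper: call data.table's `:=` directly, outside of `[`.
# Returns the error message, or NA if data.table is not installed.
direct_walrus_message <- function() {
  if (!requireNamespace("data.table", quietly = TRUE)) {
    return(NA_character_)
  }
  # Look the function up in the data.table namespace
  walrus <- get(":=", envir = asNamespace("data.table"))
  tryCatch(
    {
      walrus(c("value", "blah"), NULL)
      "no error"
    },
    error = function(e) conditionMessage(e)
  )
}

msg <- direct_walrus_message()
print(msg)
```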

But here is what happens: if the [ function isn't correctly mapped to the version specified by data.table, and instead is mapped to the base [, then we have a problem, since the base [ can't handle :=, so := gets treated as a function and triggers the error message. So the culprit function is [.data.table -- the overloaded bracket operator.

What's happening in my new package (the one that holds myFuncInPackage) is that, when it evaluates the code, it resolves [ to the base [ function instead of data.table's [ function, because data.table is not in the namespace (or is lower in the search() hierarchy). The base [ does not consume := as a keyword, so := is evaluated as an ordinary function call, which triggers the error message in the data.table code above.

When you specify the eval to happen in the global environment, it correctly resolves the [ function to [.data.table, and the := is interpreted correctly.

Incidentally, you can also use this if you're passing not a character string but a code block (better) to eval() inside a package:

eval(substitute(instruction), envir=globalenv())

Here, substitute captures the instruction unevaluated, so it is not evaluated (incorrectly) within the package namespace at the argument-evaluation stage; it makes it back intact to globalenv, where it can be evaluated correctly with the required functions in place.
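A minimal base-R sketch of the code-block variant (the function name `run_block` and the variable `demo_a` are hypothetical, for illustration only). The braces are passed as an ordinary argument; substitute recovers the unevaluated expression, and eval then runs it in the global environment:

```r
# Hypothetical wrapper: accepts a code block rather than a string.
run_block <- function(instruction) {
  # substitute() returns the unevaluated expression passed as `instruction`
  eval(substitute(instruction), envir = globalenv())
}

result <- run_block({
  demo_a <- 2   # created in the global environment, not inside run_block
  demo_a * 21
})
print(result)
```

Because the block is evaluated in the global environment, `demo_a` remains available at the top level afterwards.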

nsheff