89

I am trying to use the data.table package inside my own package. A minimal working example (MWE) follows.

I create a function, test.fun, that simply creates a small data.table object and then sums the "Val" column, grouping by the "A" column. The code is:

test.fun<-function ()
{
    library(data.table)
    testdata<-data.table(A=rep(seq(1,5), 5), Val=rnorm(25))
    setkey(testdata, A)
    res<-testdata[,{list(Ct=length(Val),Total=sum(Val),Avg=mean(Val))},"A"]
    return(res)
}

When I create this function in a regular R session, and then run the function, it works as expected.

> res<-test.fun()
data.table 1.8.0  For help type: help("data.table")
> res
     A Ct      Total        Avg
[1,] 1  5 -0.5326444 -0.1065289
[2,] 2  5 -4.0832062 -0.8166412
[3,] 3  5  0.9458251  0.1891650
[4,] 4  5  2.0474791  0.4094958
[5,] 5  5  2.3609443  0.4721889

When I put this function into a package, install the package, load the package, and then run the function, I get an error message.

> library(testpackage)
> res<-test.fun()
data.table 1.8.0  For help type: help("data.table")
Error in `[.data.frame`(x, i, j) : object 'Val' not found

Can anybody explain to me why this is happening and what I can do to fix it? Any help is very much appreciated.

Matt Dowle
ruser
  • My guess is that you haven't declared a dependency. You should remove `library(data.table)` from your function, and declare `depends:data.table` in your namespace and DESCRIPTION. – Andrie May 10 '12 at 04:52
  • There is also now the `.datatable.aware = TRUE` option to handle this issue, as discussed in [this](https://github.com/Rdatatable/data.table/issues/2341#issuecomment-328084921) issue and in the [vignette](https://cran.r-project.org/web/packages/data.table/vignettes/datatable-importing.html) linked below. Declaring `Depends: data.table` will attach the whole package to the search path, which is sometimes [discouraged](https://r-pkgs.org/namespace.html#search-path). – Niels Feb 18 '21 at 17:57

2 Answers

99

Andrie's guess is right, +1. There is a FAQ on it (see vignette("datatable-faq")), as well as a new vignette on importing data.table:

FAQ 6.9: I have created a package that depends on data.table. How do I ensure my package is data.table-aware so that inheritance from data.frame works?

Either i) include data.table in the Depends: field of your DESCRIPTION file, or ii) include data.table in the Imports: field of your DESCRIPTION file AND import(data.table) in your NAMESPACE file.
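A minimal sketch of option ii), assuming a package called testpackage as in the question. It writes the two metadata files into a throwaway directory and reads them back. The Title and Version values are placeholders invented for illustration, and the export(test.fun) line is an assumption about how the function is exposed.

```r
# Sketch: create the two metadata files for option ii) in a temporary directory.
# "testpackage" comes from the question; Title and Version are placeholders.
pkg_dir <- file.path(tempdir(), "testpackage")
dir.create(pkg_dir, showWarnings = FALSE, recursive = TRUE)

# DESCRIPTION: declare data.table under Imports
writeLines(c(
  "Package: testpackage",
  "Title: Placeholder Title",
  "Version: 0.0.1",
  "Imports: data.table"
), file.path(pkg_dir, "DESCRIPTION"))

# NAMESPACE: import data.table and (assumed) export the function
writeLines(c(
  "import(data.table)",
  "export(test.fun)"
), file.path(pkg_dir, "NAMESPACE"))

# Read both files back to confirm the declarations are in place
desc <- read.dcf(file.path(pkg_dir, "DESCRIPTION"))
ns_lines <- readLines(file.path(pkg_dir, "NAMESPACE"))
print(desc[1, "Imports"])
print(ns_lines)
```

For option i), the DESCRIPTION line would read Depends: data.table instead, and the import(data.table) line in NAMESPACE is not required.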

Further background: at the top of [.data.table (and other data.table functions), you'll see a switch that depends on the result of a call to cedta(), which stands for Calling Environment Data Table Aware. Typing data.table:::cedta reveals how it's done. It relies on the calling package having a namespace, and on that namespace Import'ing or Depend'ing on data.table. When the calling namespace is not data.table-aware, [.data.table falls back to [.data.frame, which is why the error in the question is reported from [.data.frame. This is how a data.table can be passed to non-data.table-aware packages (such as functions in base): those packages can use absolutely standard [.data.frame syntax on the data.table, blissfully unaware that the data.frame is() a data.table, too.
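The idea behind that check can be sketched in base R. The function below is a simplified, hypothetical stand-in (the name is_aware is invented here, and it is not data.table's actual code); the real data.table:::cedta handles more cases, so treat this purely as an illustration of the namespace lookup.

```r
# Hypothetical, simplified illustration of a "calling environment" check.
# Not the real data.table:::cedta; it only mirrors the main idea.
is_aware <- function(env = parent.frame()) {
  ns <- topenv(env)                       # top-level environment of the caller
  if (!isNamespace(ns)) return(TRUE)      # e.g. code run from the global environment
  pkg <- getNamespaceName(ns)
  imports <- names(getNamespaceImports(ns))
  unname(pkg == "data.table") ||          # data.table calling itself
    "data.table" %in% imports ||          # namespace imports data.table
    isTRUE(get0(".datatable.aware", envir = ns, inherits = FALSE))  # explicit opt-in
}

is_aware(globalenv())           # TRUE: interactive code is treated as aware
is_aware(asNamespace("stats"))  # FALSE: stats does not import data.table
```

Under this sketch, a package whose NAMESPACE lacks import(data.table) lands in the FALSE branch, which corresponds to the fallback to data.frame-style subsetting described above.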

This is also why data.table inheritance didn't use to be compatible with namespaceless packages, and why, upon user request, we had to ask authors of such packages to add a namespace to their package to be compatible. Happily, now that R adds a default namespace for packages missing one (from v2.14.0), that problem has gone away:

CHANGES IN R VERSION 2.14.0
* All packages must have a namespace, and one is created on installation if not supplied in the sources.

jangorecki
Matt Dowle
  • (Sorry to revive this, but...) Matthew, can you clarify how this would work from an interactive standpoint? If my package returns a `data.table` to a user in an interactive session, will they be required to use the `data.table` semantics, or is there some way I could support the familiar `data.frame` syntax? – Jeff Allen Apr 23 '13 at 19:04
  • @JeffAllen That's a new one...not sure. If your package Depends on data.table then that'll make the user data.table aware I guess. Maybe Importing data.table wouldn't (and maybe that's what you'd like). – Matt Dowle Apr 23 '13 at 20:52
  • Thanks Matt! This solved my problem after half a day of failures and searching the net. I had put data.table only in the Imports. The code worked well in R but not from within the package. Moved to Depends and it works! – Oskar Hansson Jan 04 '15 at 12:20
  • @OskarHansson Glad Depends works but Import should work as long as you had *both* Imports in DESCRIPTION *and* `import(data.table)` in NAMESPACE as well? – Matt Dowle Jan 05 '15 at 10:41
  • @MattDowle You are right. I got a NOTE when using Depends. I changed back to Imports + added `@import data.table` in the code so Roxygen adds `import(data.table)` in the NAMESPACE. – Oskar Hansson Jan 05 '15 at 12:45
  • I had a problem with precisely the same symptoms but the above did not work. For whatever reason, `@import dtplyr` did help. Would be interested in knowing why. – Jim Aug 03 '17 at 18:32
  • @Jim Having to guess I'd guess it's the `roxygen2` extra layer (`@import`) not quite translating to precisely above. Have a look at `roxygen2`'s output to see if it has produced exactly above. – Matt Dowle Aug 03 '17 at 21:07
39

Here is the complete recipe:

  1. Add data.table to Imports in your DESCRIPTION file.

  2. Add the roxygen2 tag #' @import data.table to the relevant .R file (i.e., the .R file containing the function that throws the error Error in `[.data.frame`(x, i, j) : object 'Val' not found).

  3. Type library(devtools) and set your working directory to point at the main directory of your R package.

  4. Type document(). This will ensure that your NAMESPACE file includes an import(data.table) line.

  5. Type build()

  6. Type install()
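As a rough sketch of what step 2 could look like for the function from the question: the file name R/test_fun.R and the roxygen title and @return text are assumptions made for illustration. Note that the library(data.table) call inside the function is gone, as suggested in the comments on the question.

```r
# R/test_fun.R (hypothetical file name)
# No library(data.table) call here: the @import tag below makes roxygen2
# write import(data.table) into NAMESPACE when document() is run.

#' Summarise Val by group A
#'
#' @return A data.table with one row per value of A.
#' @import data.table
#' @export
test.fun <- function() {
  testdata <- data.table(A = rep(seq(1, 5), 5), Val = rnorm(25))
  setkey(testdata, A)
  # Count, total and mean of Val within each level of A
  testdata[, list(Ct = length(Val), Total = sum(Val), Avg = mean(Val)), by = "A"]
}
```

After steps 4 to 6, it is worth opening NAMESPACE to confirm that it contains import(data.table) and export(test.fun) before loading the package.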

For a nice primer on what build() and install() do, see: http://kbroman.org/pkg_primer/.

Then, once you close your R session and start a new one, you can jump right in:

  1. Type library("my_R_package")

  2. Type the name of your function that's housed in the .R file mentioned above.

  3. Enjoy! You should no longer receive the dreaded Error in `[.data.frame`(x, i, j) : object 'Val' not found

Richard Erickson
warship
  • I followed these instructions and I am getting `function not found`. I could not find anything similar, so I created a question: https://stackoverflow.com/questions/56720520/creating-custom-r-package-with-data-table-custom-function – Kill3rbee Lee Mtoti Jun 23 '19 at 07:31
  • Thank you for this. You helped me see what to do in an easy to read format. – Richard Erickson Nov 24 '20 at 17:37
  • Answers like these are the best! In my case, though, I still had to add `.datatable.aware <- TRUE` somewhere in the package for this to work. – Eden Jun 12 '23 at 02:31