
I have to make an R package that depends on the package data.table. However, if I write a function such as the following in the package

randomdt <- function(){
    dt <- data.table(random = rnorm(10))
    dt[dt$random > 0]
}

the function [ uses the method for data.frame rather than the one for data.table, and therefore the error

Error in `[.data.frame`(x, i) : undefined columns selected

will appear. Usually this would be solved by using get('[.data.table') or a similar method (package::function is the simplest), but that appears not to work. After all, [ is a primitive function and I don't know how its methods work.

So, how can I call the data.table [ function from my package?

Usobi
  • You probably have to make sure that `data.table` is loaded when your package is loaded. – Jaap Oct 09 '15 at 13:15
  • Add `Depends: data.table` in your `Description` file. – Soheil Oct 09 '15 at 13:20
  • Have you read [FAQ 6.9](https://cran.r-project.org/web/packages/data.table/vignettes/datatable-faq.pdf)? – Arun Oct 09 '15 at 13:25
  • Arun nailed it, but just as a reminder that this is common in other packages, take [`xlsx`](https://cran.r-project.org/web/packages/xlsx/xlsx.pdf), which depends on `rJava` and `xlsxjars`; every time you run `library(xlsx)` or `require(xlsx)`, you'll see that `R` attaches those packages as well. – MichaelChirico Oct 09 '15 at 13:30
  • When I have had this problem before, I go the route of `Depends:` in the description file. It seems the simplest solution to me. – alexwhitworth Oct 09 '15 at 14:15
  • So, if I understand @Arun @MichaelChirico @Alex correctly: Depends will attach the package data.table to the search path, as if I had called `library(data.table)`. The data.table namespace will also be available in all environments below it because it is loaded in the search path. This strategy will work as long as the search path is in the right order. – Usobi Oct 09 '15 at 19:04

1 Answer


Updated based on some feedback from MichaelChirico and comments by Arun and Soheil.

Roughly speaking, there are two approaches you might consider. The first is building the dependency into your package itself, while the second is including lines in your R code that test for the presence of data.table (and possibly even install it automatically if it is not found).

The data.table FAQ specifically addresses this in 6.9, and states that you can ensure that data.table is appropriately loaded by your package by:

Either i) include data.table in the Depends: field of your DESCRIPTION file, or ii) include data.table in the Imports: field of your DESCRIPTION file AND import(data.table) in your NAMESPACE file.
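
As a rough illustration of option ii, here is a minimal sketch of how the pieces might fit together. The package name `mypkg` and the file name `R/randomdt.R` are assumptions made for the example; the DESCRIPTION and NAMESPACE excerpts are shown as comments, and only the function definition is actual R code.

```r
# DESCRIPTION (excerpt) for a hypothetical package called mypkg:
#   Package: mypkg
#   Imports:
#       data.table
#
# NAMESPACE (excerpt):
#   import(data.table)
#   export(randomdt)

# R/randomdt.R (hypothetical file name)
# With the two entries above in place, data.table() and the data.table
# method for `[` are visible to code inside the package namespace.
randomdt <- function() {
    dt <- data.table(random = rnorm(10))
    dt[dt$random > 0]
}
```

With option i, the `Imports:` line would instead be `Depends: data.table`, and data.table would also be attached to the user's search path whenever the package is loaded with `library()`.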

As noted in the comments, this is a common pattern used by numerous R packages.

An alternative approach is to create specific lines of code which test for and import the required packages as part of your code. This is, I would contend, not the ideal solution given the elegance of using the option provided above. However, it is technically possible.

A simple way of doing this would be to use either require or library to check for the existence of data.table, with an error thrown if it could not be attached. You could even use a simple set of conditional statements to run install.packages to install what you need if loading them fails.
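
The check described above could be sketched roughly as follows. The helper name `attach_or_stop` is hypothetical and not part of any package; the example call uses `stats` only because it ships with R, and in the question's case the argument would be "data.table".

```r
# Hypothetical helper: try to attach a package, and stop with a clear
# error message if that fails.
attach_or_stop <- function(pkg) {
    # require() returns FALSE (with a warning) when the package is missing
    ok <- suppressWarnings(
        require(pkg, character.only = TRUE, quietly = TRUE)
    )
    if (!ok) {
        stop("Package '", pkg, "' is required but could not be attached. ",
             "Try install.packages(\"", pkg, "\").", call. = FALSE)
    }
    invisible(TRUE)
}

# Example call with a package that is always available
attach_or_stop("stats")
```

This only attaches the package to the search path. Inside a package, the DESCRIPTION and NAMESPACE route is still what makes data.table visible to the package's own code.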

Yihui Xie (of knitr fame) has a great post about the difference between library and require, and makes a strong case for just using library in cases where the package is absolutely essential for the upcoming code.

TARehman
  • Why the downvote? Comment on how to improve the answer, maybe? – TARehman Oct 09 '15 at 13:45
  • @MichaelChirico A fair set of suggestions. I have made an effort to expand on this and to link some of the items that were mentioned by Arun and Soheil. If you have any additional suggestions, I'm happy to revise further. – TARehman Oct 09 '15 at 13:58