
I see strange behavior when running my R program on another machine. When I try to run a data.table join, df1[df2], I get this error message:

Error in `[.default`(x, i) : invalid subscript type 'list'

I assume that, for some reason, the R environment on the other machine does not find the data.table bracket function (although I have loaded the library there).

To force R to use the bracket from data.table I would like to call the bracket function explicitly, but I can't find out how.

Here is what I've tried:

library(data.table)
df1 <- data.frame(a = c("a1","a2","a3"), n = c(1,2,3), b = c(TRUE,TRUE,TRUE))
df2 <- data.frame(a = c("a1","a2","a3"), n = c(1,2,3), b = c(FALSE,TRUE,FALSE))

df1 <- data.table(df1)
df2 <- data.table(df2)
setkey(df1,a,n,b)
setkey(df2,a,n,b)

df1[df2] # produces `[.default`(x, i) : invalid subscript type 'list'

# my tries to call `[.data.table` explicitly all produce errors
`[.data.table`(df1, df2)
data.table::`[.data.table`(df1, df2)
data.table::`[`(df1, df2)

How can I use the bracket function from the data.table package explicitly?

EDIT:

OK, I'm trying to find the root cause of the error. I'm using R version 3.2.1:

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] data.table_1.9.2 mypackage_1.0    ROracle_1.1-10   DBI_0.2-7

loaded via a namespace (and not attached):
[1] plyr_1.8.1    reshape2_1.4  Rcpp_0.11.2   stringr_0.6.2

is.data.table gives TRUE for both df1 and df2 just before the call to df1[df2] (I'm stepping through the code in the debugger).

The function that contains the line df1[df2] lives in mypackage_1.0 (a package I'm developing). I have noticed that if I run the code line by line, instead of calling my package function and debugging it, the code works as expected. So I assume there is something wrong with the package. In the DESCRIPTION file, data.table is only listed under "Suggests". Might it be related to that?

Fabian Braun
  • No, investigate why the error occurs. From the error message I would assume that `df1` is not a data.table. Please provide your `sessionInfo`. (I can't reproduce this on my machine.) – Roland Jan 12 '16 at 11:21
  • Also, before running `df1[df2]` (on your real data set), please check what `is.data.table(df1)` gives. I have a feeling you have some typo here. Btw, you can create your `data.table`s and set the keys directly. Try `df1 <- data.table(a = c("a1","a2","a3"), n = c(1,2,3), b = c(T,T,T), key = "a,n,b")` – David Arenburg Jan 12 '16 at 11:23
  • But good effort at *trying* to make a reproducible example. I cannot recreate this error. At a guess maybe you typed `dt1` or `dt2` instead of `df1` or `df2` when creating `data.table` from your `data.frames`. Maybe use the `setDT(df1)` function instead? – Simon O'Hanlon Jan 12 '16 at 11:43
  • @Roland, David, Simon, thank you for the replies, I have done what you suggested and made an edit to the question. – Fabian Braun Jan 12 '16 at 12:31
  • Ok, I think the error is something similar to what is described here: http://stackoverflow.com/questions/33039194/making-a-package-in-r-that-depends-on-data-table I will try to solve it based on this question – Fabian Braun Jan 12 '16 at 12:37
  • Moving data.table from "Suggests" to "Depends" section of the DESCRIPTION file solved the issue – Fabian Braun Jan 12 '16 at 13:11
  • Data.table should be in "Imports" (and of course be imported in NAMESPACE). – Roland Jan 12 '16 at 15:44

1 Answer


Too long for a comment, so posting as an answer.
Some general comments related to your case:

  1. You can call [.data.table explicitly. It is an unexported data.table function, so you need the ::: operator to reach it.

data.table:::`[.data.table`(x, i)

Using ::: is not best practice, because it makes you responsible for a function that the package author decided not to expose to users directly. Keep that in mind, even though R CMD check will not raise an error or warning for it. According to Writing R Extensions:

Using foo:::f instead of foo::f allows access to unexported objects. This is generally not recommended, as the semantics of unexported objects may be changed by the package author in routine maintenance.

In my opinion, if you develop an internal package that will be deployed with explicitly stated versions of its dependencies, it is pretty safe to use :::.
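
To make the explicit call above concrete, here is a minimal sketch that reuses the toy tables from the question and compares the dispatched form with the `:::` form. It assumes an interactive session (or any data.table-aware calling environment) and a reasonably current data.table; the names `joined_dispatch` and `joined_explicit` are made up for illustration.

```r
library(data.table)

# Same toy tables as in the question, keyed on all three columns.
df1 <- data.table(a = c("a1", "a2", "a3"), n = c(1, 2, 3),
                  b = c(TRUE, TRUE, TRUE), key = c("a", "n", "b"))
df2 <- data.table(a = c("a1", "a2", "a3"), n = c(1, 2, 3),
                  b = c(FALSE, TRUE, FALSE), key = c("a", "n", "b"))

# Usual form: S3 dispatch on `[` selects the data.table method.
joined_dispatch <- df1[df2]

# Explicit form: reach the unexported method through `:::`.
joined_explicit <- data.table:::`[.data.table`(df1, df2)

# Both calls are expected to give the same join result.
stopifnot(isTRUE(all.equal(joined_dispatch, joined_explicit)))
```

The backticks are needed because `[.data.table` is not a syntactically valid name on its own.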

  2. Update your data.table version; 1.9.2 is already a pretty old release.
  3. In your DESCRIPTION file, list data.table under Imports, and don't forget to declare the import in the NAMESPACE file.
  4. Debug your problematic machine with the following:

if(is.data.table(df1) && is.data.table(df2)) df1[df2] else stop("not a data.table")
  5. Use sessionInfo() as one of your first steps when debugging cross-package issues, to track the attached packages.
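
The Imports advice above could look roughly like this. It is only a sketch: the package name and version are taken from the sessionInfo output in the question, every other DESCRIPTION field is omitted, and importing the whole data.table namespace is just one option (selective `importFrom()` directives would also work).

DESCRIPTION (excerpt):

```
Package: mypackage
Version: 1.0
Imports:
    data.table
```

NAMESPACE:

```
import(data.table)
```

With both in place, code inside the package runs with data.table's methods available, so df1[df2] is treated as a data.table join rather than as data.frame subsetting.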
jangorecki
  • I will accept this answer as a valid answer to my initial question. Thanks for the additional advice. Please note that I solved my problem by moving data.table from "Suggests" to "Depends" section of the DESCRIPTION file of my package. – Fabian Braun Jan 12 '16 at 17:09
  • Never knew about the distinction of `:::`, thanks for the tip – MichaelChirico Jan 13 '16 at 06:40