I'm stumped by an error I get when I re-run some data.table code. Oddly, the code runs when first loaded, but if I close and reopen R (RStudio), I have to reassign my data as a data.table to get it to work. Here's a reproducible example:

library(data.table)
set.seed(77)

#Generate City and Day
dtData2 <- data.table(expand.grid(City = c("Denver", "Seattle", "Chicago", "New York"),
                                  Day = seq(1, 365, 1)
                                  )
                      )

#Generate random Sales figures
dtData2[, Sales := floor(runif(.N)*101)] 

#Generate Query Table 
dtQT3 <- data.table(City = c("Denver", "Seattle", "Chicago", "New York", "Denver"),
                    SalesAtLeast = c(20L, 30L, 40L, 50L, 45L),
                    FromDay = c(100L, 50L, 50L, 100L, 100L),
                    UntilDay = c(200L, 200L, 100L, 350L, 200L)
                    )

#Query data against parameters
dtInterim <- dtData2[dtQT3,
                    on=.(City=City, Sales>=SalesAtLeast, Day>=FromDay, Day<=UntilDay),
                    .(City, x.Day, x.Sales,i.SalesAtLeast)]

#Summarize table
dtInterim[, .(.N, sum(x.Sales)), by=.(City,i.SalesAtLeast)]

This produces the result I want. Great!

But when I close RStudio and reopen it, all the data tables I created earlier are still in my global environment, yet when I run the same commands without regenerating the data table...

library(data.table)
set.seed(77)

#Query data against parameters
dtInterim <- dtData2[dtQT3,
                    on=.(City=City, Sales>=SalesAtLeast, Day>=FromDay, Day<=UntilDay),
                    .(City, x.Day, x.Sales,i.SalesAtLeast)]

I get an error that reads:

Error in set(nqx <- shallow(x), j = "_nqgrp_", value = nqgrp) : 
  Internal logical error. DT passed to assign has not been allocated enough column slots. l=3, tl=3, adding 1

To fix it, if I run:

dtData2 <- data.table(dtData2)

and then run the code immediately above, it works, but it doesn't seem like this should be a necessary step. It's as if dtData2 "forgot" it was a data table. Can someone please help explain what's happening here? Are the data tables not saving correctly? Maybe I haven't set some parameter correctly? Maybe it's a bug?

Thanks so much!!!

  • The question may be slightly different from the linked dupe, but that's the answer to this type of problem. See also https://github.com/Rdatatable/data.table/issues/1478 or http://stackoverflow.com/q/26069219/ or http://stackoverflow.com/q/37468946/ – Frank Mar 07 '17 at 13:54
  • Thanks @Frank for the speedy reply! I agree the solution is the same, and the underlying problem is the same, but the way I got there was different and there was no way to know it was the same (short of being an R data.table expert). Since I searched for the returned error and several combinations of data.table, r, joining, saving, re-opening, etc., and didn't get anywhere close to the answer, I believe other users may find this question valuable. I'll edit the original question to focus more on the "save, close, and reopen" aspect, since that seems to be what's causing the issue. – ColoradoGranite Mar 07 '17 at 15:04
  • Yeah, I didn't mean to say that you should have found the links. Fyi, your question doesn't disappear or get deleted thanks to being marked as a dupe. People can still find it and follow it to the linked question to find the answer. Also, having the answer posted only one place makes it easier for the data.table folks to keep it up to date; so I'd prefer not to open your question up to answers. – Frank Mar 07 '17 at 15:18
  • For others reading this, the solution is to use `alloc.col()` or `setDT()` on each data.table, which I believe is preferable to the `dtData2 <- data.table(dtData2)` I used above. And now that I know the problem, the base explanation is [here](https://rawgit.com/wiki/Rdatatable/data.table/vignettes/datatable-faq.html#reading-data.table-from-rds-or-rdata-file). – ColoradoGranite Mar 07 '17 at 15:20
  • @Frank. Got it. Thanks so much. That makes sense to keep the answer in one place. I'm pretty new to R, data.table, and posting / responding to questions, so I really appreciate the insight! – ColoradoGranite Mar 07 '17 at 15:24
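
For anyone landing here, the fix mentioned in the comments can be sketched roughly as follows. This is a minimal illustration rather than code from the original post: the table `dtDemo`, its `Flag` column, and the temporary `.rds` file are made up for the example, and `saveRDS()`/`readRDS()` are assumed to stand in for the workspace save and restore that RStudio does when closing and reopening.

```r
library(data.table)

# Hypothetical demo table (not from the question)
dtDemo <- data.table(City = c("Denver", "Seattle"), Sales = c(10L, 20L))

# Round-trip through a file; this is assumed to mimic closing and reopening R,
# since a data.table read back from disk loses its column over-allocation
tmpFile <- tempfile(fileext = ".rds")
saveRDS(dtDemo, tmpFile)
dtLoaded <- readRDS(tmpFile)

# Re-allocate spare column slots by reference, without copying the data.
# alloc.col(dtLoaded) is the other option named in the comments; newer
# data.table releases also expose it as setalloccol().
setDT(dtLoaded)

# truelength() reports allocated column slots, length() the columns in use
truelength(dtLoaded) > length(dtLoaded)

# Adding a column by reference now has slots to grow into
dtLoaded[, Flag := Sales > 15L]
```

In the question's case, that would mean calling `setDT(dtData2)` (and likewise for `dtQT3`) once after reopening the session, before running the non-equi join.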
