I'm stumped by an error I get when I re-run some data.table code. What's peculiar is that the code runs fine in the session where the data is first created, but if I close and reopen R (RStudio), I have to reassign my data as a data.table to get it to work. Here's a reproducible example:
library(data.table)
set.seed(77)
#Generate City and Day
dtData2 <- data.table(expand.grid(
  City = c("Denver", "Seattle", "Chicago", "New York"),
  Day = seq(1, 365, 1)
))
#Generate random Sales figures
dtData2[, Sales := floor(runif(.N)*101)]
#Generate Query Table
dtQT3 <- data.table(
  City = c("Denver", "Seattle", "Chicago", "New York", "Denver"),
  SalesAtLeast = c(20L, 30L, 40L, 50L, 45L),
  FromDay = c(100L, 50L, 50L, 100L, 100L),
  UntilDay = c(200L, 200L, 100L, 350L, 200L)
)
#Query data against parameters
dtInterim <- dtData2[dtQT3,
                     on = .(City = City, Sales >= SalesAtLeast, Day >= FromDay, Day <= UntilDay),
                     .(City, x.Day, x.Sales, i.SalesAtLeast)]
#Summarize table
dtInterim[, .(.N, sum(x.Sales)), by=.(City,i.SalesAtLeast)]
This produces the result I want. Great!
But when I close RStudio and reopen it, all the data tables I created earlier are still in my global environment. If I then run the same commands without regenerating the data tables...
library(data.table)
set.seed(77)
#Query data against parameters
dtInterim <- dtData2[dtQT3,
                     on = .(City = City, Sales >= SalesAtLeast, Day >= FromDay, Day <= UntilDay),
                     .(City, x.Day, x.Sales, i.SalesAtLeast)]
I get an error that reads:
Error in set(nqx <- shallow(x), j = "_nqgrp_", value = nqgrp) :
Internal logical error. DT passed to assign has not been allocated enough column slots. l=3, tl=3, adding 1
To fix it, if I run:
dtData2 <- data.table(dtData2)
and then run the query code immediately above, it works. It doesn't seem like this should be a necessary step, though. It's as if dtData2 "forgot" it was a data.table. Can someone please help explain what's happening here? Are the data tables not being saved correctly? Have I failed to set some parameter? Or is it a bug?
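In case it helps with diagnosis, here is a minimal sketch of a check that could be run on a restored table. It assumes the problem is related to the "column slots" mentioned in the error (the l and tl values), which is only a guess on my part. The names dtSmall, dtRestored and tmpFile are hypothetical stand-ins, and a saveRDS()/readRDS() round trip stands in for the workspace being saved and reloaded by RStudio. truelength() and setDT() are data.table functions.

```r
library(data.table)

# Hypothetical small table standing in for dtData2
dtSmall <- data.table(City = c("Denver", "Seattle"),
                      Day = 1:2,
                      Sales = c(10, 20))

# Round trip through disk as a stand-in for closing and reopening the session
tmpFile <- tempfile(fileext = ".rds")
saveRDS(dtSmall, tmpFile)
dtRestored <- readRDS(tmpFile)

# Compare the number of columns with the number of allocated column slots
print(length(dtRestored))
print(truelength(dtRestored))

# setDT() works by reference and re-allocates column slots without a full copy
setDT(dtRestored)
print(truelength(dtRestored) > length(dtRestored))
```

If the allocated slots turn out to be the issue, calling setDT(dtData2) after reopening the session would presumably be a lighter workaround than data.table(dtData2), since it avoids copying the table.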
Thanks so much!!!