I have a function that assigns a column by reference to an existing data.table. The function works when the data.table is created in the session and passed to the function, but not when the data.table has been saved to disk, re-loaded, and then passed to the same function. However, if we first call setDT() on the newly loaded data.table (or make an assignment on it) and then run the function, it works fine. What is going on here? Why do we need to 'reset' the data.table before using this function?
library(data.table)
file.path <- "H:/mydt.RDS"
make_col <- function(dt) {
  dt[ , z := 1]
}
# this works
mydt <- data.table(a = 1:3)
make_col(mydt)
# but if we save and load the saved copy...
mydt <- data.table(a = 1:3)
saveRDS(mydt, file.path)
# this doesn't work: mydt does not get the new column z
mydt <- readRDS(file.path)
make_col(mydt)
# but this works
mydt <- readRDS(file.path)
setDT(mydt)
make_col(mydt)
# and so does this
mydt <- readRDS(file.path)
mydt[ , b := 1]
make_col(mydt)
# and so does this
mydt <- readRDS(file.path)
mydt2 <- copy(mydt)
make_col(mydt2)
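
To help narrow this down, here is a minimal diagnostic sketch. It assumes the difference lies in the table's internal column over-allocation, which is a guess and not something established above. `rds_path` is a hypothetical temporary file standing in for the H:/ path, and `truelength()` is the helper exported by data.table that reports how many column slots are allocated.

```r
library(data.table)

# Hypothetical temporary file, used instead of the H:/ path in the question
rds_path <- tempfile(fileext = ".RDS")

make_col <- function(dt) {
  dt[ , z := 1]
}

fresh <- data.table(a = 1:3)
saveRDS(fresh, rds_path)
loaded <- readRDS(rds_path)

# Compare allocated column slots with columns in use, before any := call.
# Assumption: a mismatch here is what makes the loaded table behave differently.
print(c(fresh = truelength(fresh), loaded = truelength(loaded)))
print(c(fresh = ncol(fresh), loaded = ncol(loaded)))

# Without any "reset": check whether z reaches the caller's object
untouched <- readRDS(rds_path)
make_col(untouched)
print("z" %in% names(untouched))

# With setDT() first, as in the question's working case
setDT(loaded)
print(truelength(loaded))
make_col(loaded)
print("z" %in% names(loaded))
```

If the allocated-slot counts differ between `fresh` and `loaded`, and `setDT()` changes the count for `loaded`, that would point to the function operating on a re-allocated object local to `make_col()` rather than on the caller's table.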