I seem to have found a situation where data.table's update by reference does not work as expected, and not as described in "Understanding exactly when a data.table is a reference to (vs a copy of) another data.table".
If you load a data.table from an Rdata file and then update it by reference via :=,
the data.table is copied implicitly (i.e. its memory address changes). This goes unnoticed as long as you do the update in the same environment.
But if you do the update inside a function (e.g. f(dt)), the data.table dt
is not changed in the calling environment, because the copy was made inside the function.
Here is a small example:
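library(data.table)

# function definition: add columns C and D to dt by reference,
# taking their values from a join of dt with dt.j
f1 <- function(dt, dt.j) {
  dt[, c("C","D") := (dt[dt.j, nomatch=0][, list(C,D)])]
}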
# create, save, delete and then load from file
dt <- data.table(A=c("A","A","B"),B=1:3,key=c("A"))
dt.j <- data.table(A=c("A","B","C"),C=5:7,D=c("a","a","b"))
save(dt,file="~/test.Rdata")
rm(dt)
load(file="~/test.Rdata")
address(dt)
f1(dt,dt.j)
address(dt)
dt
The address of dt stays the same, and so does the data.table itself: columns C and D are not added.
There is nothing wrong with the code of the function. If I omit the function and just do the update directly, it works, but it changes the address of the data.table:
address(dt)
dt[,c("C","D"):=(dt[dt.j,nomatch=0][,list(C,D)])]
address(dt)
dt
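One way to look at what differs between a freshly created and a loaded data.table is to compare truelength(), which reports how many column slots are allocated. The following is only a sketch: I am assuming that the lost over-allocation is what forces the copy on :=, and the temporary file path is just for illustration.

```r
library(data.table)

dt <- data.table(A = c("A", "A", "B"), B = 1:3, key = "A")
tl_before <- truelength(dt)   # column slots allocated for a fresh data.table

path <- tempfile(fileext = ".Rdata")  # illustrative location, not the original path
save(dt, file = path)
rm(dt)
load(file = path)
tl_after <- truelength(dt)    # column slots allocated after loading from disk

# compare the two values; a smaller value after loading would mean
# that there is no spare room to add columns in place
print(c(before = tl_before, after = tl_after))
```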
I can cope with this by copying the data.table after loading.
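A minimal sketch of that workaround, reusing the objects from the example above. The temporary file path is only for illustration, and the function signature uses dt.j as the second argument name.

```r
library(data.table)

f1 <- function(dt, dt.j) {
  dt[, c("C", "D") := dt[dt.j, nomatch = 0][, list(C, D)]]
}

dt <- data.table(A = c("A", "A", "B"), B = 1:3, key = "A")
dt.j <- data.table(A = c("A", "B", "C"), C = 5:7, D = c("a", "a", "b"))

path <- tempfile(fileext = ".Rdata")  # illustrative location, not the original path
save(dt, file = path)
rm(dt)
load(file = path)

dt <- copy(dt)          # explicit copy right after loading
addr_before <- address(dt)
f1(dt, dt.j)            # the update inside the function should now be visible here
addr_after <- address(dt)

names(dt)
identical(addr_before, addr_after)
```

The explicit copy costs memory and time for large tables, so it is a workaround rather than an explanation of the behavior.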
What I'd like to know is whether there are other situations, apart from the one described above, in which data.table shows this behavior.
Here's the information about my R session (I could also replicate this behavior on a Windows machine):
R version 3.2.2 (2015-08-14)
Platform: x86_64-redhat-linux-gnu (64-bit)
Running under: Generic 22 (Generic)
locale:
[1] LC_CTYPE=de_DE.UTF-8 LC_NUMERIC=C LC_TIME=de_DE.utf8
[4] LC_COLLATE=de_DE.UTF-8 LC_MONETARY=de_DE.utf8 LC_MESSAGES=de_DE.UTF-8
[7] LC_PAPER=de_DE.utf8 LC_NAME=C LC_ADDRESS=C
[10] LC_TELEPHONE=C LC_MEASUREMENT=de_DE.utf8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] data.table_1.9.6
loaded via a namespace (and not attached):
[1] tools_3.2.2 chron_2.3-47