R's data.table package seems to apply its by-reference behavior when I store a vector of column names:
> library(data.table)
> dt <- data.table(x=1,y=2)
> vars <- names(dt)
> vars
[1] "x" "y"
> dt[, z:=3]
> vars
[1] "x" "y" "z"
I did not expect the object vars to "update" like this and pick up column names that were added to dt later. If I use vars <- copy(names(dt)) instead, it does not update, which suggests that storing the column names works by reference, much like assigning a whole data.table to a new name without copy(). Other functions, like nrow(), do not "update" like this.
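To make the copy() variant concrete, here is a minimal sketch of what I ran. The name vars_copy is just something I made up for illustration, and I am only assuming this is the intended way to take a snapshot of the names:

```r
library(data.table)

dt <- data.table(x = 1, y = 2)

# Take an explicit copy of the names vector instead of a plain assignment
vars_copy <- copy(names(dt))

# Add a column by reference
dt[, z := 3]

vars_copy   # still "x" "y"
names(dt)   # "x" "y" "z"
```

With the explicit copy, the stored vector stays at the two original names even though dt itself now has three columns.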
My question is: when do I need to use copy() and when do I not? I originally thought it was only needed for copying whole data.tables, but this makes me wonder where else it is needed.