When I create a data.table and save its column names in an object, the elements of that object change if I modify the data.table by reference with :=, for example by adding more variables. I thought that once an object is created in R it remains stable as long as it is not explicitly modified, but it seems that an object created from a data.table is also modified implicitly when the original data.table is modified explicitly. Is that correct? See my code below and the suggested solution.
I don't know if this is a bug, but if not, I would like to understand this behavior of data.table and find a better solution than the one suggested.
library(data.table)
# create data.table with two variables
DT <- data.table(x = 1, y = 2)
# store the variable names in an object
original_names <- names(DT)
# add one more variable
DT[, z := 3]
# new object with the names of the three variables
new_names <- names(DT)
# these two should NOT be identical, yet they are.
identical(original_names, new_names)
#> [1] TRUE
# solution
DT <- data.table(x = 1, y = 2)
# Create another data.frame with the minimum information
# necessary to save memory and still get the variable names.
# This is what I think is inefficient.
DF <- as.data.frame(DT[1,])
# store the variable names in an object
original_names <- names(DF)
# add one more variable
DT[, z := 3]
# new object with the names of the three variables
new_names <- names(DT)
# These two are not identical anymore.
identical(original_names, new_names)
#> [1] FALSE
Created on 2023-07-02 with reprex v2.0.2
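For comparison, here is a lighter variant of the workaround above, offered as a sketch rather than a definitive answer. It assumes that names(DT) hands back the data.table's own names vector instead of a fresh copy, and that data.table's copy() can be applied to that vector to detach it before := changes DT in place. No intermediate data.frame is needed.

```r
library(data.table)

# create data.table with two variables
DT <- data.table(x = 1, y = 2)

# copy() makes a deep copy of the names vector, so the stored
# object no longer shares memory with DT's names attribute
original_names <- copy(names(DT))

# add one more variable by reference
DT[, z := 3]

# names after the modification
new_names <- names(DT)

# the stored copy keeps the two original names
identical(original_names, new_names)
#> [1] FALSE
original_names
#> [1] "x" "y"
```

Only the character vector of names is copied here, not any rows of the table, so the extra memory is negligible even for a large DT.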