I'm trying to learn the data.table package, which seems fantastic.
One behaviour I'm unable to google my way to understanding is the following:
I want to use a function to create a variable by reference. If the data input is a data.frame, I also want to convert it to a data.table. The conversion works, but the variable is only created if the data input is already a data.table. This seems strange to me. Does anyone know the reason for this?
E.g. after running the bar function (which converts its input to a data.table if it isn't one already, and then defines a variable bar) on the data.frame X, X is a data.table but the variable bar has not been created. However, on Y (X coerced to a data.table), the variable bar is created. Why isn't the variable bar created in X?
library(data.table)
bar <- function(x) {
  if (!"data.table" %in% class(x)) setDT(x)
  x[, bar := 1:.N]
  invisible(NULL)
}
X <- data.frame(foo = letters[1:2])
Y <- as.data.table(X)
str(X)
## 'data.frame': 2 obs. of 1 variable:
## $ foo: chr "a" "b"
bar(X)
str(X)
## Classes 'data.table' and 'data.frame': 2 obs. of 1 variable:
## $ foo: chr "a" "b"
str(Y)
## Classes 'data.table' and 'data.frame': 2 obs. of 1 variable:
## $ foo: chr "a" "b"
## - attr(*, ".internal.selfref")=<externalptr>
bar(Y)
str(Y)
## Classes 'data.table' and 'data.frame': 2 obs. of 2 variables:
## $ foo: chr "a" "b"
## $ bar: int 1 2
## - attr(*, ".internal.selfref")=<externalptr>
I'm using R 4.2.0 and data.table 1.14.2.
UPDATE: this is essentially the same question as in
and the answer provided there clarifies the issue. Thanks to Waldi for pointing me to this.
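For anyone landing here, below is a minimal sketch of one workaround, assuming it is acceptable to reassign the result at the call site. My understanding (not verified against the data.table source) is that setDT() called inside the function rebinds only the function's local x, so := adds the column to an object that the caller's X no longer points to. The name bar2 is hypothetical and only used for illustration.

```r
library(data.table)

# Hypothetical variant of bar(): it returns the table instead of relying
# on the caller's object being modified in place.
bar2 <- function(x) {
  # as.data.table() makes a copy, so a data.frame passed in is left untouched
  if (!is.data.table(x)) x <- as.data.table(x)
  x[, bar := seq_len(.N)]  # add the column by reference to the local table
  invisible(x)             # hand the table back so the caller can reassign it
}

X <- data.frame(foo = letters[1:2])
X <- bar2(X)  # X is now a data.table that includes the bar column
str(X)
```

If the input is already a data.table, the column is still added by reference, so the reassignment is harmless in that case.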