I'm trying to learn the data.table package, which seems fantastic.

One behaviour I can't google my way to understanding is the following. I want to use a function to create a variable by reference. If the input is a data.frame, I also want the function to convert it to a data.table. The conversion works, but the new variable is only created if the input is already a data.table. This seems strange to me. Does anyone know the reason for this?

E.g. after running the bar function (which makes the input a data.table if it isn't already, and then defines a variable bar) on the data.frame X, X will be a data.table but the variable bar is not created. However, on Y (X coerced to a data.table), the variable bar is created. Why isn't the variable bar created in X?

library(data.table)
bar <- function(x){
    if(!"data.table" %in% class(x)) setDT(x)
    x[, bar := 1:.N]
    invisible(NULL)
}
X <- data.frame(foo = letters[1:2])
Y <- as.data.table(X)
str(X)
## 'data.frame':    2 obs. of  1 variable:
##  $ foo: chr  "a" "b"
bar(X)
str(X)
## Classes 'data.table' and 'data.frame':   2 obs. of  1 variable:
##  $ foo: chr  "a" "b"
str(Y)
## Classes 'data.table' and 'data.frame':   2 obs. of  1 variable:
##  $ foo: chr  "a" "b"
##  - attr(*, ".internal.selfref")=<externalptr> 
bar(Y)
str(Y)
## Classes 'data.table' and 'data.frame':   2 obs. of  2 variables:
##  $ foo: chr  "a" "b"
##  $ bar: int  1 2
##  - attr(*, ".internal.selfref")=<externalptr> 

I'm using R 4.2.0 and data.table 1.14.2

UPDATE: this is essentially the same question as "Using setDT inside a function", and the answer provided there clarifies the issue. Thanks to Waldi for pointing me to this.
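For anyone landing here, this is a minimal sketch of one possible workaround. It rests on my understanding of the linked answer: setDT() called inside the function rebinds only the function's local x, so the column added by := never reaches the caller's X. The name bar2 is made up for illustration; the idea is simply to return the table and reassign it at the call site.

```r
library(data.table)

# bar2 is a hypothetical variant of bar(): it returns the modified table
# (invisibly) instead of NULL, so the caller can rebind its own name to it.
bar2 <- function(x) {
    if (!is.data.table(x)) setDT(x)  # converts the local x if needed
    x[, bar := 1:.N]                 # adds the column to the local x by reference
    invisible(x)
}

X <- data.frame(foo = letters[1:2])
X <- bar2(X)  # reassign so X points at the table that actually holds bar
names(X)
## [1] "foo" "bar"
```

Calling setDT(X) at the top level before bar(X) should work as well, since the function then receives something that is already a data.table.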
