The FAQ states that the preferred way to add a new column to a data.table when programming is to use quote() and then eval(). But what if I want to add several columns at once? Playing around with this, I came up with the following solution:
library(data.table)
DT <- data.table(V1=1:1000,
                 V2=2001:3000)
col.names <- c("V3","V4")
col.specs <- vector("list",2)
col.specs[[1]] <- quote(V1^2)
col.specs[[2]] <- quote((V1+V2)/2)
DT[,c(col.names) := lapply(col.specs,eval,envir=DT)]
which yields the desired result:
> head(DT)
   V1   V2 V3   V4
1:  1 2001  1 1001
2:  2 2002  4 1002
3:  3 2003  9 1003
4:  4 2004 16 1004
5:  5 2005 25 1005
6:  6 2006 36 1006
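As a side note, the names and the expressions could presumably be kept together in one named list, so the two vectors cannot drift out of sync. This is only a minimal sketch of the same technique: the named-list layout is my own illustration rather than something from the FAQ, and it assumes a data.table version that accepts the `(names(...)) :=` form on the left-hand side.

```r
library(data.table)

DT <- data.table(V1 = 1:1000, V2 = 2001:3000)

# Illustrative layout (an assumption, not from the FAQ):
# each new column name sits next to its quoted expression
col.specs <- list(
  V3 = quote(V1^2),
  V4 = quote((V1 + V2) / 2)
)

# Same approach as above: evaluate every expression with the
# data.table's columns supplied explicitly through envir
DT[, (names(col.specs)) := lapply(col.specs, eval, envir = DT)]

head(DT)
```

The explicit envir=DT is still there, so this does not answer the question below; it only removes the separate col.names vector.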
My question is simply: is this the preferred method? Specifically, can someone think of a way to avoid specifying the environment in the lapply() call? If I leave it out I get:
> DT[,c(col.names) := lapply(col.specs,eval)]
Error in eval(expr, envir, enclos) : object 'V1' not found
It may be no big deal, but at least to me it feels a bit suspicious that the data.table does not recognise its own columns. Also, if I add the columns one by one, there is no need to specify the environment:
> DT <- data.table(V1=1:1000,
+ V2=2001:3000)
> col.names <- c("V3","V4")
> col.specs <- vector("list",2)
> col.specs[[1]] <- quote(V1^2)
> col.specs[[2]] <- quote((V1+V2)/2)
> for (i in seq_along(col.names)) {
+   DT[,col.names[i] := list(eval(col.specs[[i]]))]
+ }
> head(DT)
   V1   V2 V3   V4
1:  1 2001  1 1001
2:  2 2002  4 1002
3:  3 2003  9 1003
4:  4 2004 16 1004
5:  5 2005 25 1005
6:  6 2006 36 1006
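For what it is worth, the difference between the direct eval() call and the lapply() version can be reproduced in base R without data.table at all. The sketch below assumes the cause is eval()'s default argument envir = parent.frame(); the plain list `cols` and with() are only stand-ins for the data.table and its j evaluation, so this illustrates the suspected mechanism and says nothing definite about data.table internals.

```r
# Stand-in for the data.table's columns (plain list, no data.table needed)
cols <- list(V1 = 1:5, V2 = 2001:2005)
col.specs <- list(quote(V1^2), quote((V1 + V2) / 2))

# Direct call: eval()'s default envir = parent.frame() is the environment
# in which the call is evaluated, here the one built from 'cols'
direct <- with(cols, eval(col.specs[[1]]))

# Via lapply(): eval() is now called from inside lapply(), so
# parent.frame() is lapply's own frame, where V1 does not exist
via.lapply <- tryCatch(
  with(cols, lapply(col.specs, eval)),
  error = function(e) conditionMessage(e)
)

# Supplying envir explicitly makes the lookup independent of the caller
explicit <- lapply(col.specs, eval, envir = cols)

print(direct)
print(via.lapply)
print(explicit)
```

If that assumption holds, the explicit envir in the lapply() call is not a workaround for a data.table quirk but a consequence of where eval() is called from.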