Note: The precise problem I hit in this question does not apply to recent versions of data.table. If you want to do something like what is described in the title, check out the corresponding question in the package FAQ, 1.6 "OK, but I don’t know the expressions in advance. How do I programatically pass them in?".
I have seen an answer that illustrates how to construct an expression to be evaluated in `DT[,j=eval(expr)]`. I am using this with an assignment, `` `:=`(mycol=my_calculation) ``, and I'm wondering...
- How can I assign the name "mycol" dynamically?
- What is the correct way to let "my_calculation" take a dynamically-determined set of columns?
By "dynamically", I mean "determined after I write the code for my `expr`".
New example
EDIT: To better illustrate the issue, here is a different example. Look in the edit history to see the original.
require(data.table)
require(plyr)
options(datatable.verbose=TRUE)
DT <- CJ(a=0:1,b=0:1,y=2)
# setup:
expr <- as.quoted(paste(expression(get(col_in_one)+get(col_in_two))))[[1]]
# usage:
col_in_one <- 'a'
col_in_two <- 'b'
col_out <- 'bah'
DT[,(col_out):=eval(expr)] # fails, should take the form j=eval(expr)
I want to keep the setup and usage stages separate, so my code is easier to maintain. My real expression is messier than this example (which just adds two columns).
Questions
First question: How can I make the assigned-to column, "col_out", dynamic? I mean: I want to specify both "col_in_*" and "col_out" on the fly.
I have tried creating various expressions in "expr", but `as.quoted` throws an error about not putting certain stuff to the left of the `=` symbol.
Second question: How can I avoid the warnings against using `get`?
The warnings suggest using `.SDcols` to let `[.data.table` know which columns I am using. However, if I use the `.SDcols` argument, another warning says there's no point doing that unless `.SD` is being used.
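As a rough sketch of what the two warnings seem to be pointing at (assuming a recent data.table; `total` is a name made up for this illustration): `.SDcols` only has an effect when `j` actually refers to `.SD`, so the two go together, and the columns are then pulled out of `.SD` by name rather than through `get`.

```r
library(data.table)

DT <- CJ(a = 0:1, b = 0:1, y = 2)

col_in_one <- "a"
col_in_two <- "b"

# .SDcols restricts .SD to the two input columns; j reads them from .SD
# by name, so get() is not needed. `total` is a hypothetical name.
total <- DT[, .SD[[col_in_one]] + .SD[[col_in_two]],
            .SDcols = c(col_in_one, col_in_two)]
```

This only computes the sum and does not assign anything, so it leaves the first question untouched.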
Tentative solution
The solution I have so far is...
# Ricardo + eddi:
expr2 <- as.quoted(paste(expression(`:=`(
Vtmp=.SD[[col_in_one]]+.SD[[col_in_two]]))))[[1]]
# usage
col_in_one <- 'a'
col_in_two <- 'b'
col_out <- 'bah'
DT[,eval(expr2),.SDcols=c(col_in_one,col_in_two)]
setnames(DT,'Vtmp',col_out)
This still involves the minor annoyance of doing the operation in two steps and keeping track of "Vtmp", so the first question is still partly open.
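For comparison, here is a one-step sketch that avoids "Vtmp", assuming a recent data.table version (as the note at the top suggests). It swaps the quoted expression for an ordinary function, so it is a different technique from the `eval(expr)` approach above; `my_calculation` here is a hypothetical helper written for this illustration, not something from the package.

```r
library(data.table)

DT <- CJ(a = 0:1, b = 0:1, y = 2)

# setup: the calculation, written without knowing any column names
# (hypothetical helper, stands in for the quoted expression)
my_calculation <- function(x, y) x + y

# usage
col_in_one <- "a"
col_in_two <- "b"
col_out <- "bah"

# Wrapping the LHS in parentheses makes := use the value of col_out as the
# new column name; .SD holds only the columns listed in .SDcols.
DT[, (col_out) := my_calculation(.SD[[col_in_one]], .SD[[col_in_two]]),
   .SDcols = c(col_in_one, col_in_two)]
```

Setup and usage stay separate, and no `setnames` call is needed afterwards. Whether a function is an acceptable substitute for a quoted expression depends on how messy the real calculation is.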