
Say I have a data table and I want to calculate a new variable based on several conditions of the old variables like this:

library(data.table)
test <- data.table(a = c(1,1,0), b = c(0,1,0), c = c(1,1,1))

test[a==1 & b==1 & c==1,test2:=1]

But I actually have many more conditions (all combinations of the different variables), and they differ in length. I draw them from a list such as:

conditions<-list(c("a","b","c"), c("b","c"))

Then I want to loop through that list and build a character string like this for each element (I want to do something with it before discarding it and moving on to the next element of the list):

mystring <- paste0(paste0(conditions[[1]], collapse = "==1 & "), "==1")

But how can I use "mystring" inside the data.table? as.function() or get() or eval() don't seem to work. Something like:

test[mystring,test3:=1]

is what I'm looking for.

Uwe
Jakob
  • Not clear what "mychars" is. – Frank Feb 21 '17 at 22:18
  • mystring, sorry, was an earlier version – Jakob Feb 21 '17 at 22:31
  • OK, well, there's `test[eval(parse(text = mystring)), test3 := 1]`, but it's pretty strongly discouraged in data.table and R generally to fiddle with strings as code. Besides, data.table has better ways of handling this sort of filter-and-update stuff. You might want to look through its vignettes. – Frank Feb 21 '17 at 23:02
  • And also similar with bquote. Sadly, R's not lispy enough to write easy-to-understand macros, so it's generally discouraged unless you really need it. `myExp = parse(text=mystring); eval(bquote(test[.(myExp), test3 := 1]))` – Clayton Stanley Feb 22 '17 at 20:18
  • Perhaps Matt Dowle's `EVAL` approach [here](http://stackoverflow.com/a/42433456/3817004) is what you are looking for? It creates an expression to be evaluated, "similar to constructing a dynamic SQL statement to send to a server". – Uwe Feb 26 '17 at 00:21
  • @UweBlock interesting, yet it still manipulates strings to build an expression. What I'd really want is a defmacro for R to manipulate symbols to build an expression. The problem is that R's syntax (along with that of other curly-brace languages) is too complex to make that easy to build. – Clayton Stanley Mar 05 '17 at 01:06

1 Answer


For the given use case, you can use a join with on = to reach the desired goal without having to build and evaluate complex strings of conditions.

Instead of

test[a==1 & b==1 & c==1, test2 := 1][]

we can write

test[.(1, 1, 1), on = c("a", "b", "c"), test2 := 1][]
#   a b c test2
#1: 1 0 1    NA
#2: 1 1 1     1
#3: 0 0 1    NA
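
As a quick check that the two forms agree, the sketch below applies each one to its own copy of the example table and compares the resulting column. The names by_filter, by_join and same_result are made up for illustration and are not part of the question.

```r
library(data.table)

# Two independent copies of the example table (hypothetical names)
by_filter <- data.table(a = c(1, 1, 0), b = c(0, 1, 0), c = c(1, 1, 1))
by_join   <- copy(by_filter)

# Variant 1: logical filter in i
by_filter[a == 1 & b == 1 & c == 1, test2 := 1]

# Variant 2: update join; .(1, 1, 1) acts as a one-row lookup table
# whose values are matched against columns a, b and c
by_join[.(1, 1, 1), on = c("a", "b", "c"), test2 := 1]

# Compare only the new column, which is what both variants are meant to set
same_result <- identical(by_filter$test2, by_join$test2)
print(same_result)
```

The comparison is done on the test2 column alone so that any difference in table attributes between the two copies does not affect the check.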

Now, the OP asked to loop over a list of conditions "to do something". With lapply(), this can be achieved as follows:

# create list of conditions for subsetting
col = list(c("a","b","c"), c("b","c"))
val = list(c(1, 1, 1), c(0, 1))
# loop over conditions
lapply(seq_along(col), function(i) test[as.list(val[[i]]), on = col[[i]], test2 := i])
#[[1]]
#
#[[2]]
#   a b c test2
#1: 1 0 1     2
#2: 1 1 1     1
#3: 0 0 1     2

Note that the output of lapply() is not used because test has been modified in place:

test
#   a b c test2
#1: 1 0 1     2
#2: 1 1 1     1
#3: 0 0 1     2
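
If every condition is of the form "all listed columns equal 1", as in the question, the list of values does not have to be written out by hand. The following is a minimal sketch under that assumption: it starts from the conditions list given in the question and writes each result to its own column so that later conditions do not overwrite earlier ones. The column names flag_1 and flag_2 and the variables cols and newcol are hypothetical.

```r
library(data.table)

test <- data.table(a = c(1, 1, 0), b = c(0, 1, 0), c = c(1, 1, 1))
conditions <- list(c("a", "b", "c"), c("b", "c"))

for (i in seq_along(conditions)) {
  cols   <- conditions[[i]]       # columns that must all equal 1
  newcol <- paste0("flag_", i)    # hypothetical name of the result column
  # update join: one lookup value of 1 per column in cols;
  # rows matching on all of them get 1 in the new column, the rest stay NA
  test[as.list(rep(1, length(cols))), on = cols, (newcol) := 1]
}

print(test)
```

A plain for loop is used here because the table is updated by reference, so there is no return value to collect. Only the second row has a, b and c all equal to 1, and it is also the only row with b and c both equal to 1, so both new columns flag that row.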
Uwe