Updated with an example. I have a function as follows:

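myfun <- function(DT, var){
  # seq_along() is safe when var is empty, unlike 1:length(var)
  for(i in seq_along(var)){
    # build !(is.na(col) | is.nan(col)) for the i-th column name
    s = substitute(!(is.na(x) | is.nan(x)), list(x=as.symbol(var[i])))
    DT = DT[eval(s)]
  }
  return(DT)
}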

input:

> dt = data.table(id=c(1,2,3,4,5), x=c(1,2,NA,4,5), y=c(1,NA,3,4,NA))
> dt
   id  x  y
1:  1  1  1
2:  2  2 NA
3:  3 NA  3
4:  4  4  4
5:  5  5 NA

runs:

> myfun(dt, var=c("x", "y"))
   id x y
1:  1 1 1
2:  4 4 4
> myfun(dt, var=c("x"))
   id x  y
1:  1 1  1
2:  2 2 NA
3:  4 4  4
4:  5 5 NA

var is a character vector naming some columns in DT. The goal is to keep only the rows of DT that have no NA or NaN in any of the columns listed in var.

I do not want the for loop. I want to construct a single query s containing all the conditions and then evaluate that query against DT. For the first case I want:

s = !(is.na(x) | is.nan(x) | is.na(y) | is.nan(y))

and for the second case I want:

s = !(is.na(x) | is.nan(x))

How can I construct a dynamic query s and run it just once as an i (where) query in the data.table?

More generally, how can I create a dynamic expression based on input? Using expression(paste()) did not help me. Once I have the expression, I can use substitute.

user1971988
  • Maybe give us example input and its associated expected output? – Frank Sep 04 '13 at 13:56
  • @Frank: Updated with an example. – user1971988 Sep 04 '13 at 14:07
  • Never mind. I found it in: http://stackoverflow.com/questions/11677424/how-to-use-an-unknown-number-of-key-columns-in-a-data-table?rq=1
    str=paste0("is.na(",var,") |", " is.nan(",var,")", collapse="|")
    s = parse(text=paste("!(",str,")"))
    dt[eval(s)]
    – user1971988 Sep 04 '13 at 14:27
  • You should probably move your answer to the "Your Answer" box below. Answering your own question is actually encouraged: http://stackoverflow.com/help/self-answer – Frank Sep 04 '13 at 14:34
  • OK. I have added it. I need to understand more about parse. – user1971988 Sep 04 '13 at 14:41

1 Answer

Ans:

var = c("x","y")
str=paste0("is.na(",var,") |", " is.nan(",var,")", collapse="|")
s = parse(text=paste("!(",str,")"))
DT[eval(s)]

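Source: How to use an unknown number of key columns in a data.table (http://stackoverflow.com/questions/11677424/how-to-use-an-unknown-number-of-key-columns-in-a-data-table)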
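
For comparison, here is a minimal sketch of the same filter built from call objects instead of pasted text. The helper name build_na_filter is hypothetical (it is not from the linked question), and the sketch assumes the data.table package is installed. Note that is.na() already returns TRUE for NaN, so the is.nan() terms are kept only to match the query asked for in the question.

```r
library(data.table)

# Hypothetical helper: returns the unevaluated call
# !(is.na(a) | is.nan(a) | is.na(b) | is.nan(b) | ...)
build_na_filter <- function(vars) {
  # One pair of tests per column name: is.na(col), is.nan(col)
  tests <- unlist(lapply(vars, function(v) {
    sym <- as.name(v)
    list(call("is.na", sym), call("is.nan", sym))
  }), recursive = FALSE)
  # Chain all tests with `|`, wrap in parentheses, then negate
  combined <- Reduce(function(a, b) call("|", a, b), tests)
  call("!", call("(", combined))
}

dt <- data.table(id = c(1, 2, 3, 4, 5), x = c(1, 2, NA, 4, 5), y = c(1, NA, 3, 4, NA))

cond <- build_na_filter(c("x", "y"))
print(cond)

# Splice the condition into the whole dt[...] call and evaluate it once
res <- eval(bquote(dt[.(cond)]))
print(res)
```

Because the condition is spliced into the full dt[...] call before evaluation, no helper variable name appears inside i, so it should not matter whether the table happens to have a column called s or cond.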

user1971988
  • +1 This works as long as "s" isn't a column name. If "s" might ever be a column name in future, use ".s" instead of "s", or build the entire query (including the "DT[" bit) and eval that as a whole. – Matt Dowle Sep 04 '13 at 15:25
  • @MatthewDowle - Thanks! That is a very important point! Do you have any pointers about the detailed structure of an expression in R. There is some mention of it in the Writing R Extensions doc, but any detailed info will be appreciated. – user1971988 Sep 04 '13 at 17:34
  • NP. Not sure what you mean by detailed structure. `.Internal(inspect(expression(1+2+3)))` ? – Matt Dowle Sep 04 '13 at 17:40
  • In 5.11 Evaluating R expressions from C (http://cran.r-project.org/doc/manuals/R-exts.html#Evaluating-R-expressions-from-C), it says when constructing an expression: "There are three steps: the call is constructed as a pairlist of length 3, the list is filled in, and the expression represented by the pairlist is evaluated." The example then shows the list being filled in. It is kind of vague for beginners, but the .Internal with inspect gives me some hope :) Thanks! – user1971988 Sep 04 '13 at 17:48
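
On the structure question raised in the comments, the following is a rough base R sketch of how a call can be taken apart and modified from R itself, without going to C. The variable names (e, inner, p) are only illustrative.

```r
# A call is a list-like object: the function comes first, then its arguments
e <- quote(!(is.na(x) | is.nan(x)))
print(class(e))      # "call"
print(as.list(e))    # the function `!` followed by its single argument

# Drill down through the `!` and the parentheses to the `|` call
inner <- e[[2]][[2]]
print(inner[[1]])    # the function being called at this level
print(inner[[2]])    # left operand
print(inner[[3]])    # right operand

# Components can be replaced in place, like list elements
inner[[3]] <- quote(is.na(y))
print(inner)

# parse() returns an expression vector; each element is itself a call
p <- parse(text = "!(is.na(x) | is.nan(x))")
print(class(p))
print(identical(p[[1]], e))
```

Indexing with [[ ]] is what makes approaches such as substitute() and bquote() work: they return ordinary call objects with some components swapped out.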