5

This is an extremely simple question, but again I'm confounded by the data.table syntax.

If I have a string representing a column name -- such as `column <- "x"` -- how do I return just the rows that match a logical condition on that column?

In a data.frame, if I wanted to return all rows of the table where column x equaled 1, I'd write `df[df[,column] == 1,]`.

How do I write that efficiently in a data.table?

(Note: `dt[x == 1]` works fine, but not if you use a string like `column` to represent the name of that column.)

The answers here are close but do not seem to be enough to answer this question.

Community
canary_in_the_data_mine

2 Answers

4

`dt[get(column) == 1]` seems to work -- is that the most efficient method?
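For reference, a minimal sketch of the `get()` approach on a toy table. The table contents and the name `res` are made up for illustration; only `dt` and `column` come from the question.

```r
library(data.table)

# Toy data (hypothetical); `column` holds the column name as a string
dt <- data.table(x = c(1, 2, 1, 3), y = c("a", "b", "c", "d"))
column <- "x"

# get() looks up the object named by the string; inside i this resolves
# to the column dt$x, so the condition is effectively x == 1
res <- dt[get(column) == 1]
print(res)
```

One thing to watch for: if `dt` itself had a column literally named `column`, that column would be found first inside `[`, so a less collision-prone variable name is safer.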

canary_in_the_data_mine
  • 4
    +1 In terms of syntax that's nice, yes. Or `dt[ dt[[column]]==1 ]` not quite as elegant. For speed, maybe `setkeyv(dt,column); dt[.(1)]`. But on just a single column, the time to `setkey` isn't usually worth it since a single vector scan is pretty quick. Unless you're looking up many times in which case `setkey` could be worth it even on only one column. – Matt Dowle Jan 28 '14 at 01:45
  • 2
    This is generally the syntax I use as well. – Ricardo Saporta Jan 28 '14 at 02:02
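The two alternatives from Matt Dowle's comment can be sketched side by side as follows. The toy data and the names `scan_res`, `keyed`, and `key_res` are illustrative assumptions, and the copy is only there so that the original table is not reordered by `setkeyv`.

```r
library(data.table)

# Toy data (hypothetical); `column` holds the column name as a string
dt <- data.table(x = c(1, 2, 1, 3), y = c("a", "b", "c", "d"))
column <- "x"

# Vector scan: extract the column by name with [[ ]] and compare
scan_res <- dt[dt[[column]] == 1]

# Keyed lookup: setkeyv() takes the key column name(s) as a character
# vector and sorts by reference, so work on a copy here
keyed <- copy(dt)
setkeyv(keyed, column)
key_res <- keyed[.(1)]  # join on the key column for the value 1
```

As the comment says, setting the key has a one-off cost, so it is likely to pay off only when the same column is looked up many times.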
2

One way of doing this:

dt[eval(as.name(column)) == 1, ]

See section 1.6 of the FAQ on how to create expressions and evaluate them within the frame of dt. Although the FAQ explains this in the context of `j`, constructing expressions and evaluating them is also valid in `i`, as shown above.
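A small sketch of this on toy data follows. The table contents and the names `res1`, `cond`, and `res2` are assumptions made for illustration; the second form, which builds the whole condition as a call before evaluating it, is a variation on the line above rather than something taken from the FAQ.

```r
library(data.table)

# Toy data (hypothetical); `column` holds the column name as a string
dt <- data.table(x = c(1, 2, 1, 3), y = c("a", "b", "c", "d"))
column <- "x"

# as.name() turns the string into the symbol x; eval() then resolves
# that symbol to the column when i is evaluated
res1 <- dt[eval(as.name(column)) == 1, ]

# Variation: construct the whole condition x == 1 as an unevaluated call
cond <- call("==", as.name(column), 1)
res2 <- dt[eval(cond)]
```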

Arun
BrodieG
  • @Arun, They do, but isn't it equally applicable to `i`? What I did here works, and follows the same patterns. – BrodieG Jan 28 '14 at 02:03