4

Let's say I have a data table

library(data.table)
DT <- data.table(x=c(1,1,0,0),y=c(0,1,2,3))
column_name <- "x"

   x y
1: 1 0
2: 1 1
3: 0 2
4: 0 3

And I want to access all the rows where x = 1, but by using column_name.

The desired output should behave like this:

DT[x==1,]
   x y
1: 1 0
2: 1 1

but with x replaced by column_name in the input.

Note that this problem is similar to but not quite the same as Select subset of columns in data.table R, and the solution there (using with=FALSE) doesn't work here.

Here are all the things I've tried. None of them work.

DT[column_name ==1,]
DT[.column_name ==1,]
DT[.(column_name) ==1,]
DT[..column_name ==1,]
DT[."column_name" ==1,]
DT[,column_name ==1,]
DT[,column_name ==1,with=TRUE]
DT[,column_name ==1,with=FALSE]
DT[,.column_name ==1,with=TRUE]
DT[,.column_name ==1,with=FALSE]
DT[,..column_name ==1,with=TRUE]
DT[,..column_name ==1,with=FALSE]
DT[,."column_name" ==1,with=TRUE]
DT[,.column_name ==1,with=FALSE]
DT[column_name ==1,with=TRUE]
DT[column_name ==1,with=FALSE]
DT[[column_name==1,]]
subset(DT,column_name==1)

I also have options(datatable.WhenJisSymbolThenCallingScope=TRUE) enabled

There's obviously some kind of lexical trick I'm missing. I've spent several hours looking through vignettes and SO questions to no avail.

Ingolifs
  • 290
  • 2
  • 14
  • Maybe a duplicate of [Assigning/Referencing a column name in data.table dynamically (in i, j and by)](https://stackoverflow.com/questions/60818053/)? – Ian Campbell Jul 09 '20 at 04:18

4 Answers4

5

I can imagine this was very frustrating for you. I applaud the number of things you tried before posting. Here's one approach:

DT[get(column_name) == 1,]
   x y
1: 1 0
2: 1 1

If you need to use column_name in J, you can use get(..column_name):

DT[,get(..column_name)]
[1] 1 1 0 0

The .. instructs evaluation to occur in the parent environment.

Another approach for using a string in either I or J is with eval(as.name(column_name)):

DT[eval(as.name(column_name)) == 1]
   x y
1: 1 0
2: 1 1

DT[,eval(as.name(column_name))]
[1] 1 1 0 0
Ian Campbell
  • 23,484
  • 14
  • 36
  • 57
  • While I was trying several different attempts (including the `..` notation but in `i` not `j`), I know this works well-enough but find its necessity somewhat frustrating. – r2evans Jul 09 '20 at 04:09
  • 1
    I've never been fond of `eval` solutions, though I understand they can bring value in otherwise difficult situations. I'm just waiting to see if R-injection is ever a concern, and [*Little `R`obbie Tables*](https://xkcd.com/327/) is the next thing (though `eval(parse(...))` is the real concern there). – r2evans Jul 09 '20 at 04:17
  • Thanks! I had so many tries before posting on SO, because, like I suspect many others find, it can be a bit daunting and you feel like you need to have an airtight question. The `get` function is weird. I feel like it somehow oughtn't to exist. In 3 years of R I haven't come across it before. – Ingolifs Jul 09 '20 at 04:58
  • 1
    Or with `DT[DT[, ..column_name][[1]]==1]` – akrun Jul 09 '20 at 21:19
2

You can subset the column by name and then select rows.

library(data.table)
DT[DT[[column_name]] == 1]

#   x y
#1: 1 0
#2: 1 1
Ronak Shah
  • 377,200
  • 20
  • 156
  • 213
1

A little caveat, using get() directly with paste0() doesn't work. You have to assign the paste to a variable first, like:

# Doesn't work:
dt[get(paste0(column_name, 'some_string')) == 1]

# Does work:
this_col_name = paste0(column_name, 'some_string')
dt[get(this_col_name) == 1]
Kaleb Coberly
  • 420
  • 1
  • 4
  • 19
0

An additional answer I just discovered: If there are multiple columns named this way and you want to return all of them, don't use get, use mget.

Example:

df <- data.table(x=1:4,y=1:4,z=1:4,w=1:4)    # here's my data table
desired_columns <- c("y","z","w")            # I want to return only columns Y, Z and W

if I try:

> df[,get(desired_columns)]
Error in get(desired_columns) : first argument has length > 1

Instead:

> df[,mget(desired_columns)]
   y z w
1: 1 1 1
2: 2 2 2
3: 3 3 3
4: 4 4 4
Ingolifs
  • 290
  • 2
  • 14