(R, Data.Tables): Subset rows based on logical values in columns with dynamically assigned column names

Question

I have a data table with two columns whose names are built from variables. I'm fairly new to the quirks of the data.table package, but I've gotten something like the following code to work so far...

varNames <- c("Subtype", ...)

for (i in seq_along(varNames)) {  # seq_along() visits every index; length() alone gives only the last

  nm1 <- (paste0(varNames[i],"1"))
  nm2 <- (paste0(varNames[i],"2"))
  
  DT[,(nm1):= x1]
  DT[,(nm2):= x2]
  
  #A BUNCH OF OTHER CODE GOES HERE...

}

I want to single out the rows where the column named by nm1 and the column named by nm2 match, but I know I can't just do this...

nmMatch <- (paste0(varNames[i],"Match"))
DT[, (nmMatch) := F ]
DT[(nm1)==(nm2), (nmMatch) := T] #Returns empty data table :^(

I think this is either because there are no columns actually named "nm1" or "nm2" or because the variable named nm1 does not equal the variable named nm2.
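To illustrate the second guess, here is a minimal sketch on made-up toy data (the column values are assumptions, not the real data). It suggests that wrapping a variable in parentheses does not make data.table treat it as a column name, so the expression compares the two strings themselves:

```r
library(data.table)

# Toy data; names and values are made up for illustration.
DT <- data.table(Subtype1 = c("A", "B"), Subtype2 = c("A", "X"))
nm1 <- "Subtype1"
nm2 <- "Subtype2"

# The parentheses leave nm1 and nm2 as plain character strings,
# so this is the comparison "Subtype1" == "Subtype2".
cmp <- (nm1) == (nm2)
print(cmp)

# A single FALSE in i selects no rows, hence the empty result.
n_selected <- nrow(DT[(nm1) == (nm2)])
print(n_selected)
```

Because the condition is one FALSE rather than one value per row, no row is ever selected, whatever the columns contain.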

If I didn't need to assign these based on a vector of character values, I would write this to get what I'm looking for...

DT[, "SubtypeMatch" := F]    
DT[(Subtype1) == (Subtype2), SubtypeMatch := T]

How do I get a subset of rows based on column values if I need to reference those column names through variables? Is there a way to do that for data tables? These end up being huge structures (> 1,000,000 rows), so any workarounds using sapply() end up being prohibitively slow.

I recognize that there may be ways that I could fundamentally restructure my code so that I never really need to do this, and I'm happy to hear those, but I'm also interested in any "proper" way to accomplish this subsetting task with data.table.

Does this answer your question? [How can one work fully generically in data.table in R with column names in variables](https://stackoverflow.com/questions/24833247/how-can-one-work-fully-generically-in-data-table-in-r-with-column-names-in-varia) — jangorecki, Dec 08 '20 at 11:35

score 0 · Accepted Answer · answered Dec 08 '20 at 02:39


Use get() to look up the columns whose names are stored in nm1 and nm2:

library(data.table)
DT[, (nmMatch) := FALSE]
DT[get(nm1) == get(nm2), (nmMatch) := TRUE]
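For a version that can be run end to end, here is a small sketch of the same get() approach inside the loop from the question. The toy data and the single-element varNames are assumptions made for illustration, since the original vector is elided:

```r
library(data.table)

# Toy data; column values are made up for illustration.
DT <- data.table(
  Subtype1 = c("A", "B", "C", NA),
  Subtype2 = c("A", "X", "C", "D")
)

varNames <- c("Subtype")  # hypothetical; the original vector is elided

for (i in seq_along(varNames)) {
  nm1     <- paste0(varNames[i], "1")
  nm2     <- paste0(varNames[i], "2")
  nmMatch <- paste0(varNames[i], "Match")

  # Start with FALSE everywhere, then flag rows where the two columns agree.
  DT[, (nmMatch) := FALSE]
  # get() resolves the string held in nm1 / nm2 to the column of that name.
  DT[get(nm1) == get(nm2), (nmMatch) := TRUE]
}

print(DT)
```

Rows where either value is NA keep the FALSE default, because data.table treats an NA in a logical i as "not selected".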

Ronak Shah

Thanks for this! I've never really needed to use get, so I wasn't thinking of it. That works – cchato Dec 08 '20 at 12:37
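As a possible alternative that avoids get(), the comparison can be computed on plain vectors pulled out with [[ ]] and written back with set(). This is only a sketch on made-up toy data, not something proposed in the thread, and it has not been benchmarked against the get() version:

```r
library(data.table)

# Toy data; column values are made up for illustration.
DT <- data.table(
  Subtype1 = c("A", "B", "C", NA),
  Subtype2 = c("A", "X", "C", "D")
)
nm1     <- "Subtype1"
nm2     <- "Subtype2"
nmMatch <- "SubtypeMatch"

# DT[[name]] extracts a column as a plain vector by its string name.
eq <- DT[[nm1]] == DT[[nm2]]

# Comparisons involving NA yield NA; treat those as "no match",
# mirroring the FALSE default of the get() version.
eq[is.na(eq)] <- FALSE

# set() adds the column by reference, without copying the table.
set(DT, j = nmMatch, value = eq)

print(DT)
```

The explicit NA step matters here: without it the new column would contain NA where the get() version leaves FALSE.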
