
I need to select specific columns of a data.table with a vector of column names or positions.

library(data.table)
DT <- data.table(cbind(A=rnorm(50),B=rnorm(50),C=rnorm(50),D=rnorm(50)))

Indexing columns "A" and "C" works well this way:

DT[,c("A","C")]

But if I store the column names in a variable and try to index with it, it fails:

mycols <- c("A","C")
DT[,mycols]

I am forced to use with=FALSE, but I don't want to, because with=FALSE treats DT like a data.frame and I lose all the performance advantages (speed) of data.table.

My questions are: why does data.table accept a character vector the former way but not the latter? Is there a solution that preserves the performance advantages of data.table?

Thanks

IceCreamToucan

1 Answer


One option is to prefix the variable name with double dots (..):

DT[, ..mycols]
#          A           C
#1:  0.1188208 -0.17328827
#2: -0.5622505  0.84231231
#3:  0.8111072 -1.59802306
#4:  0.7968823  2.08468489
# ...

Or pass it to .SDcols and select .SD:

DT[, .SD, .SDcols = mycols]

Or else use with = FALSE, as the OP mentioned in the post.
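To compare the three approaches side by side, here is a minimal sketch. The small DT, the seed, and the names by_dots, by_sdcols, by_with and mypos are made up for illustration and are not from the question. It also assumes the `..` prefix accepts column positions as well as names, which the question asks about.

```r
library(data.table)

set.seed(1)  # illustrative seed so the example is reproducible
DT <- data.table(A = rnorm(5), B = rnorm(5), C = rnorm(5), D = rnorm(5))
mycols <- c("A", "C")

# Three ways to select the columns named in a variable
by_dots   <- DT[, ..mycols]                # look mycols up in the calling scope
by_sdcols <- DT[, .SD, .SDcols = mycols]   # restrict .SD to the named columns
by_with   <- DT[, mycols, with = FALSE]    # treat j as a vector of column names

# All three results are data.tables holding the same two columns
stopifnot(is.data.table(by_dots), is.data.table(by_sdcols), is.data.table(by_with))
stopifnot(isTRUE(all.equal(by_dots, by_sdcols)))
stopifnot(isTRUE(all.equal(by_dots, by_with)))

# Selecting by position (assumed to work the same way as names)
mypos <- c(1, 3)
by_pos <- DT[, ..mypos]
stopifnot(identical(names(by_pos), c("A", "C")))
```

Each call returns a data.table rather than a data.frame, so the result can be chained into further data.table operations.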

akrun
    That worked! Will accept when I am able. Any insight into the first question as to why this happens? – JustGettinStarted Nov 07 '19 at 19:09
    @JustGettinStarted I guess it is a design decision: data.table searches for the column name 'mycols' within the data.table's own environment and does not look for objects outside that env, i.e. in the global env. Using `..` is like `!!` in the tidyverse, evaluating the name outside that env – akrun Nov 07 '19 at 19:11
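The scoping described in the comment above can be sketched as follows. This is only an illustration: the toy DT and the names col_a, bare and selected are hypothetical, and what a bare `mycols` in j does is assumed to vary between data.table versions, so that call is wrapped in tryCatch rather than relied upon.

```r
library(data.table)

DT <- data.table(A = 1:3, B = 4:6, C = 7:9)
mycols <- c("A", "C")

# j is evaluated with the columns of DT in scope, so a bare symbol is
# read as a column name: this returns column A as a plain vector.
col_a <- DT[, A]

# A bare `mycols` is not a column of DT. Depending on the data.table
# version this may raise an error or return something other than the
# selected columns, so capture whichever happens instead of assuming.
bare <- tryCatch(DT[, mycols], error = function(e) conditionMessage(e))

# The `..` prefix asks data.table to look `mycols` up in the calling
# scope and use its value as the column selection.
selected <- DT[, ..mycols]
print(names(selected))
```

The literal form DT[, c("A","C")] from the question needs no prefix because the column names are written directly in j, not held in a variable.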