To summarize the answers to this question, here are the available options, checked for being
a) negation-friendly (so that you can also select columns by excluding them),
b) pipeline-friendly (so that you can use them in a pipeline with the %>% operator), and
c) usable with both column numbers and column names:
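library(data.table)
library(magrittr)  # provides the %>% pipe
select1 <- function(dt, range) dt[, range, with = FALSE]
select2 <- function(dt, range) dt[, ..range]
select3 <- function(dt, range) dt[, .SD, .SDcols = range]
dt <- as.data.table(ggplot2::diamonds)  # diamonds ships as a tibble, so convert it first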
range <- 1:3 # or
range <- dt %>% names %>% .[1:3]
dt %>% select1(range)
dt %>% select2(range)
dt %>% select3(range)
# Negation with unary minus needs column numbers (range <- 1:3); -range is an error for names
dt %>% select1(-range)
dt %>% select2(-range)
dt %>% select3(-range) # DOES NOT WORK
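Also note that the .. prefix only works in front of a variable name, not in front of an arbitrary expression, so this fails:
dt %>% .[, ..(names(dt)[1:3])] # DOES NOT WORK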
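A minimal sketch of two ways around this, using a small made-up table (toy, cols, res1 and res2 are names chosen only for illustration): either store the names in a variable and put the .. prefix on that variable, or pass the expression directly together with with = FALSE.

```r
library(data.table)

# Hypothetical toy table, used only for illustration
toy <- data.table(a = 1:3, b = letters[1:3], c = c(TRUE, FALSE, TRUE), d = 4:6)

# Option 1: put the names in a variable, then use the .. prefix on that variable
cols <- names(toy)[1:3]
res1 <- toy[, ..cols]

# Option 2: with = FALSE lets j be any expression that yields names or numbers
res2 <- toy[, names(toy)[1:3], with = FALSE]

# Both results should contain the same columns
identical(names(res1), names(res2))
```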
Therefore, the best (most universal and fastest) way to select multiple columns in a data.table
is the following:
# Columns are selected using column numbers:
range <- 1:3
dt %>% select1(range)
dt %>% .[, range, with = FALSE]
# The same works if columns are selected using column names:
range <- names(dt)[1:3]
dt %>% select1(range)
dt %>% .[, range, with = FALSE]
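To exclude columns by name rather than by number, one option is to compute the complement with base R's setdiff() and select that. A minimal sketch, assuming a small made-up table (toy, excluded, keep and res are hypothetical names):

```r
library(data.table)

# Hypothetical toy table, used only for illustration
toy <- data.table(a = 1:3, b = letters[1:3], c = c(TRUE, FALSE, TRUE), d = 4:6)

excluded <- c("a", "b")                  # columns to leave out, given by name
keep <- setdiff(names(toy), excluded)    # all remaining columns, in their original order
res <- toy[, keep, with = FALSE]

names(res)
```

This keeps the selection step identical to the one above; only the vector of names changes.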
PS.
If, instead of selecting multiple columns, you want to efficiently delete multiple columns from a data.table by reference (i.e. without copying the entire data.table), you can use data.table's := operator and assign NULL.
To do it for multiple columns in one line, put a character vector of column names on the left-hand side, e.g. dt[, c("x", "y") := NULL].
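A minimal sketch of removing several columns by reference in a single call, assuming a small made-up table (toy and cols are hypothetical names):

```r
library(data.table)

# Hypothetical toy table, used only for illustration
toy <- data.table(a = 1:3, b = letters[1:3], c = c(TRUE, FALSE, TRUE), d = 4:6)

# Names of the columns to delete. The parentheses around cols below tell
# data.table to use the variable's value rather than create a column called "cols".
cols <- c("b", "c")
toy[, (cols) := NULL]

names(toy)
```

Because := modifies the table in place, there is no need to assign the result back to toy.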