
How do I extract a column from a data.table as a vector by its position? Below are some code snippets I have tried:

DT<-data.table(x=c(1,2),y=c(3,4),z=c(5,6))
DT
#   x y z
#1: 1 3 5
#2: 2 4 6

I want to get this output, but using the column position instead of the column name:

DT$y 
#[1] 3 4
is.vector(DT$y)
#[1] TRUE

Another way to get the same output, which I would also like to reproduce using the column position:

DT[,y] 
#[1] 3 4
is.vector(DT[,y])
#[1] TRUE

This doesn't give a vector:

DT[,2,with=FALSE]
#   y
#1: 3
#2: 4
is.vector(DT[,2,with=FALSE])
#[1] FALSE

These two don't work:

DT$noquote(names(DT)[2]) # Doesn't work
#Error: attempt to apply non-function

DT[,noquote(names(DT)[2])] # Doesn't work
#[1] y

And this doesn't give a vector:

DT[,noquote(names(DT)[2]),with=FALSE] # Not a vector
#   y
#1: 3
#2: 4
is.vector(DT[,noquote(names(DT)[2]),with=FALSE])
#[1] FALSE
demongolem
Wet Feet

2 Answers


A data.table inherits from class data.frame. Therefore it is a list (of column vectors) internally and can be treated as such.

is.list(DT)
#[1] TRUE

Fortunately, list subsetting, i.e. [[, is very fast and, in contrast to [, package data.table doesn't define a method for it. Thus, you can simply use [[ to extract by an index:

DT[[2]]
#[1] 3 4
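
As a minimal sketch of the same idea, the position can also be held in a variable and passed to [[. The name colNb below is only an illustrative assumption, and the example assumes the data.table package is installed:

```r
library(data.table)

DT <- data.table(x = c(1, 2), y = c(3, 4), z = c(5, 6))

# Position of the wanted column (colNb is a hypothetical variable name).
colNb <- 2L

# [[ extracts a single list element, here the second column, as a plain vector.
col <- DT[[colNb]]
col
#[1] 3 4
is.vector(col)
#[1] TRUE

# Same values as extracting the column by name.
identical(col, DT$y)
#[1] TRUE
```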
Roland
  • Is it possible to maintain the data.table structure rather than convert to a vector? Does this work for multiple columns? – mindlessgreen Jan 08 '16 at 21:23
  • ...and if you wish to subset the data to a specific number of rows alongside a particular column (e.g. in this instance column 2), you add an additional set of square brackets at the front of the query. That is, if you want the first 10 rows of column 2, then use DT[1:10][[2]]. Thanks, this has made my code much faster! – Ben G Small Dec 05 '19 at 16:57

DT[,get(names(DT)[colNb])]

where colNb can be an integer (the desired column number) or a variable containing the column number.
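
A short sketch of how this might look on the table from the question, assuming the data.table package is installed. names(DT)[colNb] turns the position into the column name, and get() then looks that name up among the columns inside j:

```r
library(data.table)

DT <- data.table(x = c(1, 2), y = c(3, 4), z = c(5, 6))

colNb <- 2L  # desired column position

# names(DT)[colNb] is "y"; get("y") evaluated in j returns that column.
v <- DT[, get(names(DT)[colNb])]
v
#[1] 3 4
is.vector(v)
#[1] TRUE
```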

lokxs