I have a data.table with 55993 rows and 2923 columns; a subset looks like this:

            Name Description GTEX-N7MS-0007-SM-2D7W1 GTEX-N7MS-0008-SM-4E3JI GTEX-N7MS-0011-R10A-SM-2HMJK
 ENSG00000223972     DDX11L1                       0                       0                            0
 ENSG00000227232      WASH7P                     158                     166                          209
 ENSG00000243485  MIR1302-11                       0                       0                            4
 ENSG00000237613     FAM138A                       0                       0                            0
 ENSG00000268020      OR4G4P                       0                       0                            0
 ENSG00000240361     OR4G11P                       0                       0                            0

The Name column is unique so it can be used as the key:

setkey(dat,Name)

I have a list of 175 column names that I want to extract, for example:

col.list <- c('GTEX-N7MS-0011-R10A-SM-2HMJK','GTEX-N7MS-0008-SM-4E3JI','GTEX-N7MS-0826-SM-2AXU2')

However, it is possible that the table does not contain all of the columns in col.list.

How do I extract all the rows from the data.table, keeping only the existing columns that match those in col.list? I was thinking of something along the lines of:

dat[,.(col.list)] 

but it doesn't work.

Mark Bertenshaw
Komal Rathi
  • Try `dat[, col.list, with=FALSE]`. – AntoniosK Sep 02 '15 at 16:26
  • Thank you! Can you move this to an answer so that I can accept it? – Komal Rathi Sep 02 '15 at 16:28
  • Also similar http://stackoverflow.com/q/11940605/1191259 & http://stackoverflow.com/q/15007979/1191259 – Frank Sep 02 '15 at 16:36
  • If your vector has names that aren't colnames, drop 'em by taking the intersection: `dat[, intersect(names(dat), col.list), with=FALSE]`. Generally, not cool to change your question after you have an answer. – Frank Sep 02 '15 at 16:39
  • I did not know that the answer would throw an error until I tried it on the full dataset. – Komal Rathi Sep 02 '15 at 16:41

1 Answer

Try dat[, ..col.list].

The .. prefix signals to data.table to look up col.list in the calling scope (i.e. the environment from which dat[...] is called) rather than among the columns of dat itself.
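
As a rough illustration of the two spellings mentioned in this thread, here is a minimal sketch on a toy table built from the excerpt in the question. It assumes a reasonably recent data.table in which the `..` prefix is available; the names `via_dots` and `via_with` are made up for the example.

```r
library(data.table)

# Toy stand-in for the real table, using values from the question's excerpt
dat <- data.table(
  Name = c("ENSG00000223972", "ENSG00000227232", "ENSG00000243485"),
  Description = c("DDX11L1", "WASH7P", "MIR1302-11"),
  `GTEX-N7MS-0007-SM-2D7W1` = c(0, 158, 0),
  `GTEX-N7MS-0008-SM-4E3JI` = c(0, 166, 0),
  `GTEX-N7MS-0011-R10A-SM-2HMJK` = c(0, 209, 4)
)
setkey(dat, Name)

# Only columns that exist in dat, so neither call errors
col.list <- c("GTEX-N7MS-0011-R10A-SM-2HMJK", "GTEX-N7MS-0008-SM-4E3JI")

# `..` prefix: col.list is taken from the calling scope as a vector of column names
via_dots <- dat[, ..col.list]

# Older spelling from the comments: with = FALSE treats j as column names
via_with <- dat[, col.list, with = FALSE]

names(via_dots)
names(via_with)
```

Both calls should return all rows with just the requested columns, in the order given in col.list. Either form will still raise an error if col.list contains a name that is not a column of dat.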

MichaelChirico
AntoniosK
  • There are some columns that it cannot find and is throwing an error. – Komal Rathi Sep 02 '15 at 16:32
  • Maybe those columns include some symbols that break the code? If that's the case you can think of changing the names somehow before creating the dataset, or have a process that changes the column names after the dataset creation. – AntoniosK Sep 02 '15 at 16:35
  • I have edited my question, the col.list has a new column name that is not present, and it throws an error. There is no special character in that. – Komal Rathi Sep 02 '15 at 16:37
  • I used your and Frank's suggestion and it worked. Thank you for your help. – Komal Rathi Sep 02 '15 at 16:43
  • Glad I helped. Good that @Frank spotted the duplicate. You might find some more info in the answers there for future reference. – AntoniosK Sep 02 '15 at 16:45