I have a large data frame with column headers and row names, i.e.

CountTable = read.table( Data, header=TRUE, row.names=1 )

head(CountTable)

        S1    S2    S3
Row1     9     8     2 
Row2   268   193   282
Row3   635   631   568
Row4     0     2     0
Row5    15     8    10
Row6   416   321   350
... etc

I would like to retrieve rows from it by name. If I only had a few to retrieve, I would use square-bracket indexing, e.g.

CountTable[c("Row1", "Row3", "Row6"), ]

        S1    S2    S3
Row1     9     8     2 
Row3   635   631   568
Row6   416   321   350

But my data frame has >20,000 rows, from which I would like to retrieve ~2000 by name, so typing the names out isn't practical. My best thought was to import the ~2000 names from another file (for example, names.txt or names.csv) and create an index vector, e.g.

[1] Row1 Row3 Row6 ... Row2000

Could such a vector be used to specify which rows to retrieve when creating a subset of my data?

Any solution would be greatly appreciated!

Community
Lippy

1 Answer

If you're subsetting based on rownames, you should be doing something along the lines of

CountTable[rownames(CountTable) %in% c("row1", "row2", "row3"), ]
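
As a rough illustration of how the `%in%` form differs from indexing by name directly, here is a minimal sketch on a toy table. The values are copied from the question; the `wanted` vector and the absent name "Row99" are made up for the example.

```r
# Toy data mirroring the question's table (values copied from the question).
CountTable <- data.frame(
  S1 = c(9, 268, 635, 0, 15, 416),
  S2 = c(8, 193, 631, 2, 8, 321),
  S3 = c(2, 282, 568, 0, 10, 350),
  row.names = paste0("Row", 1:6)
)

# Hypothetical request; "Row99" is deliberately not in the table.
wanted <- c("Row6", "Row1", "Row99")

# %in% builds a logical mask: rows keep the table's order and
# names that are not present are silently ignored.
by_mask <- CountTable[rownames(CountTable) %in% wanted, ]

# Indexing by name: rows come back in the order of `wanted`,
# and a name without a match produces a row of NAs.
by_name <- CountTable[wanted, ]
```

So the mask is the safer choice when some requested names may be missing, while name indexing is handy when the order of the request matters.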

To construct a vector of rownames, you can use

paste0("row", 1:10)
[1] "row1"  "row2"  "row3"  "row4"  "row5"  "row6"  "row7"  "row8"  "row9" 
[10] "row10"
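
If the names live in a separate file, as the question suggests, something along these lines might work. This is only a sketch: the file layout (one row name per line, no header) and the names `names_file`, `wanted` and `subset_table` are assumptions, and a temporary file stands in for the real names.txt.

```r
# Toy table standing in for the real CountTable.
CountTable <- data.frame(
  S1 = c(9, 268, 635), S2 = c(8, 193, 631), S3 = c(2, 282, 568),
  row.names = c("Row1", "Row2", "Row3")
)

# Hypothetical names file: one row name per line, no header.
names_file <- tempfile(fileext = ".txt")
writeLines(c("Row1", "Row3"), names_file)

# readLines() returns a character vector, so there are no factors to undo.
wanted <- readLines(names_file)

# Keep only the names that actually exist in the table, then index by name.
wanted <- intersect(wanted, rownames(CountTable))
subset_table <- CountTable[wanted, ]
```

For a real file, replace `names_file` with its path; the `intersect()` step is optional and only guards against names that are not in the table.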
eddi
Roman Luštrik
  • No need for `%in%`. By default rownames are acceptable for a selection vector so after `rows <- paste0(...)` one can just type: `CountTable[rows, ]` – IRTFM Jul 19 '13 at 23:48
  • Solved with `TESTV <- unlist(read.table(TEST, stringsAsFactors=FALSE))`, `is.character(TESTV)`, `paste0(TESTV, 1:10)`, `CountTable[TESTV, ]`. Thanks very much! – Lippy Jul 20 '13 at 04:44
  • There goes my attempt to be explicit. :) – Roman Luštrik Jul 20 '13 at 06:35