When subsetting a data.frame and asking for only one variable, we get a vector. This is what we asked for, so it is not strange. However, in other situations (if we ask for more than one column), we get a data.frame object. Example:

> data <- data.frame(a=1:10, b=letters[1:10])
> str(data)
'data.frame':   10 obs. of  2 variables:
 $ a: int  1 2 3 4 5 6 7 8 9 10
 $ b: Factor w/ 10 levels "a","b","c","d",..: 1 2 3 4 5 6 7 8 9 10
> data <- data[, "b"]
> str(data)
 Factor w/ 10 levels "a","b","c","d",..: 1 2 3 4 5 6 7 8 9 10

If I need my data object not to change its type from data.frame, no matter whether it has only one variable, what do I have to do? The only thing that comes to mind is:

data <- data[, "a"]
data <- as.data.frame(data)

...but this seems terribly redundant. Is there a better way, i.e. a way of saying "stay a data.frame, just give me a certain column"?

The problem is that I need:

  • to subset using vectors of variable names of different lengths
  • to get data.frames with names unchanged as output each time.
Tim
    See e.g. [**here**](http://stackoverflow.com/questions/21025609/how-do-i-extract-a-single-column-from-a-data-frame-as-a-data-frame/21025639#21025639). – Henrik Jan 05 '15 at 21:23

1 Answer

The best approach is list-style subsetting (single brackets, no comma). All of these return a data.frame:

data['a']

data[c('a')]

data[c('a', 'b')]
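
As a minimal sketch of the use case in the question (name vectors of different lengths), the following rebuilds the example data and checks the results. The variable names `one_var`, `two_vars`, `sub1` and `sub2` are made up for illustration.

```r
# Example data, mirroring the data.frame from the question.
data <- data.frame(a = 1:10, b = letters[1:10])

# Name vectors of different lengths (illustrative names, not from the question).
one_var  <- "a"
two_vars <- c("a", "b")

# List-style subsetting: single bracket, no comma.
sub1 <- data[one_var]
sub2 <- data[two_vars]

is.data.frame(sub1)  # TRUE
names(sub1)          # "a"
is.data.frame(sub2)  # TRUE
names(sub2)          # "a" "b"
```

Because the result is a data.frame in both cases, the same code path should work regardless of how many names the vector holds.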

Using matrix subsetting, you would have to add drop = FALSE:

data[, 'a', drop = FALSE]
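
A short sketch of the difference `drop = FALSE` makes, again on the question's example data. The names `dropped`, `kept` and `first_rows` are illustrative only. The matrix form is mainly useful when rows need to be selected at the same time.

```r
# Example data, mirroring the data.frame from the question.
data <- data.frame(a = 1:10, b = letters[1:10])

dropped <- data[, "a"]                # default: a single column is simplified to a vector
kept    <- data[, "a", drop = FALSE]  # the single column stays a data.frame

is.data.frame(dropped)  # FALSE
is.data.frame(kept)     # TRUE
names(kept)             # "a"

# Rows and columns can be selected together while keeping the data.frame.
first_rows <- data[1:3, "a", drop = FALSE]
dim(first_rows)         # 3 1
```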
flodel