As a new R user, I was trying to figure out how to subset a range of columns. I found an approach that works, but when I tried what I thought would be a more concise version, I got an error. Here's what worked:

df <- data.frame(a = 1:10, b = 10:1, c = 1:10, d = 1:10, e = 1:10)
subset(df, select = colnames(df)[1])
    a
1   1
2   2
3   3
4   4
5   5
6   6
7   7
8   8
9   9
10 10
df[c(colnames(df)[1])] # Same result

Since the above worked fine I thought I could do the following:

subset(df, select = colnames(df)[1]:colnames(df)[3])
df[c(colnames(df)[1]:colnames(df)[3])]

They both throw the following error:

Error in colnames(df)[1]:colnames(df)[3] : NA/NaN argument
In addition: Warning messages:
1: In eval(substitute(select), nl, parent.frame()) :
  NAs introduced by coercion
2: In eval(substitute(select), nl, parent.frame()) :
  NAs introduced by coercion

Why doesn't R allow me to subset over a range of column names using their index?

otteheng
  • If you are trying to select by column index, use `subset(df, select = 1:3)`. If you want to select by name, use `subset(df, select = a:c)`. – MrFlick Dec 03 '19 at 16:43
  • That works. Don't know why I didn't think of that. You can get the same result using `df[,1:3]`. Is there any advantage to using one or the other? – otteheng Dec 03 '19 at 16:46
  • Well, you can't use `df[, a:c]`. Normally R doesn't recognize named ranges, but `subset()` does some extra work on its `select` argument to make that possible. Most people would probably prefer `df[,1:3]` if you are using indexes. – MrFlick Dec 03 '19 at 16:48
  • That makes sense. Thanks! – otteheng Dec 03 '19 at 16:52
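
To tie the comments together, here is a minimal sketch of why the original attempt fails and what the suggested alternatives look like side by side. It assumes the same `df` as in the question; the variable names `idx`, `by_position`, `by_name`, and `by_subset` are made up for illustration. The key point is that `colnames(df)[1]` is the string `"a"`, and `:` tries to coerce its operands to numbers, which is where the `NA/NaN argument` error comes from.

```r
df <- data.frame(a = 1:10, b = 10:1, c = 1:10, d = 1:10, e = 1:10)

# `:` needs numbers. colnames(df)[1] is the string "a", and coercing
# "a" to numeric gives NA (with a warning), hence "NA/NaN argument".
coerced <- suppressWarnings(as.numeric(colnames(df)[1]))
is.na(coerced)

# Option 1: select the range by position.
by_position <- df[, 1:3]

# Option 2: start from names, look up their positions, then use `:`.
idx <- match(c("a", "c"), colnames(df))
by_name <- df[, idx[1]:idx[2]]

# Option 3: subset() evaluates `select` with column names standing in
# for their positions, so a:c behaves like 1:3 inside that argument.
by_subset <- subset(df, select = a:c)

names(by_position)
names(by_name)
names(by_subset)
```

All three options should return columns `a`, `b`, and `c`. Option 2 is probably the closest to the original intent of going from names to a range without relying on the special evaluation that `subset()` does.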
