Select a consecutive range of data.frame columns using names of beginning and end columns

Question

I am trying to subset the columns of a data.frame by giving the names of the first and last columns of a range.

For instance, the data.frame A:

A
ID1 ID2 ID3
1   5   01901
2   5   01902

For example, I want to create a variable b from a range of columns of A:

b=A[,"ID2":"ID3"]

Error in "ID1":"ID3" : NA/NaN argument
In addition: Warning messages:
1: In [.data.frame(A, , "ID1":"ID3") : NAs introduced by coercion
2: In [.data.frame(A, , "ID1":"ID3") : NAs introduced by coercion

When I use the column indexes, it works. But when I use the column names, as above, it does not work. What I want is a way to select the range by name.

I think the difference here is in the column range bit. The @Sotos, which is also the linked duplicate, is a bit incorrect if the intent is to span from `"IDx":"IDy"` — coatless, Jun 05 '16 at 21:51
I agree with @Coatless that this is not a duplicate, and I've edited the question to make that clearer. Voting to reopen. — Sam Firke, Oct 16 '16 at 18:50

score 9 · Answer 1 · answered Jun 05 '16 at 21:32

Two approaches in base R's data.frame:

Named vector column subset
Interval approach

Named vector column subset

First, subset by known name:

b = A[, c('ID2', 'ID3')]

Interval approach

Second, subset by an interval when the names of the first and last columns are known and the columns in between are contiguous:

# Column names of A
colvars = names(A)

# Position of the first column in the range
start_loc = match("ID1", colvars)

# Position of the last column in the range
end_loc = match("ID3", colvars)

# Subset the range by position
b = A[, start_loc:end_loc]
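
The interval approach can be wrapped in a small function. This is only a sketch: `cols_between` is a hypothetical helper name, not something from base R, and the data frame is rebuilt from the question's example as an assumption. It adds a check for names that are not found and uses `drop = FALSE` so a one-column range still comes back as a data.frame.

```r
# Hypothetical helper (not part of base R): keep the columns from `from` to `to`, inclusive
cols_between <- function(df, from, to) {
  pos <- match(c(from, to), names(df))
  if (anyNA(pos)) {
    stop("column not found: ", paste(c(from, to)[is.na(pos)], collapse = ", "))
  }
  # drop = FALSE keeps a data.frame even when the range is a single column
  df[, pos[1]:pos[2], drop = FALSE]
}

# Example data assumed from the question
A <- data.frame(ID1 = c(1, 2), ID2 = c(5, 5), ID3 = c("01901", "01902"))
b <- cols_between(A, "ID2", "ID3")
```

If `to` comes before `from` in the data frame, the `:` range runs backwards and the columns are returned in reverse order.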

score 3 · Answer 2 · answered Jun 05 '16 at 21:32

If you are not restricted to data.frame, you can convert it to a data.table (setDT converts A by reference), and then the name:name range from your question will work:

data.table::setDT(A)[, ID2:ID3, with = FALSE]

   ID2  ID3
1:   5 1901
2:   5 1902

edited Jun 05 '16 at 21:48

answered Jun 05 '16 at 21:32

Psidom


score 1 · Answer 3 · answered May 07 '20 at 20:49

You want to use column names instead of numbers to select a column interval, right? Why not:

> b <- A[,c((which(colnames(A)=="ID2")):(which(colnames(A)=="ID3")))]
> b
# ID2 ID3
# 1 5 1901
# 2 5 1902
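
For context on why the names have to be turned into positions first: the `:` operator needs numeric operands, so it tries to coerce the strings to numbers, gets NA, and fails. The following is a minimal sketch of that behaviour, using example data assumed from the question.

```r
# Example data assumed from the question
A <- data.frame(ID1 = c(1, 2), ID2 = c(5, 5), ID3 = c("01901", "01902"))

# ":" coerces character operands to numbers; "ID2" becomes NA, so this errors.
# The error message is captured instead of stopping the script.
res <- tryCatch(suppressWarnings("ID2":"ID3"),
                error = function(e) conditionMessage(e))

# Strings that look like numbers do coerce, which shows the coercion at work
ok <- "2":"3"

# Translate the names to positions first, then build the range
pos <- match(c("ID2", "ID3"), names(A))
b <- A[, pos[1]:pos[2]]
```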

HariSeldon


score 0 · Accepted Answer · answered Jun 06 '16 at 07:08

Use the c() function to list the columns; then selection by column name works:

> A <- data.frame(ID1=c(1,1),ID2=c(5,5),ID3=c(01901,01902))
> A
#   ID1 ID2  ID3
# 1   1   5 1901
# 2   1   5 1902

> b <- A[,c(2:3)]
> b
#   ID2  ID3
# 1   5 1901
# 2   5 1902

> b1 <- A[,c("ID2","ID3")]
> b1
#   ID2  ID3
# 1   5 1901
# 2   5 1902

> b2 <- A[,2:3]
> b2
#   ID2  ID3
# 1   5 1901
# 2   5 1902
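
The examples above list the columns one by one or use numeric positions. As a possible alternative for a true name-to-name range, base R's `subset()` evaluates its `select` argument with column names standing in for their positions, so `ID2:ID3` works there. A minimal sketch, reusing the data frame from this answer:

```r
# Same example data as above
A <- data.frame(ID1 = c(1, 1), ID2 = c(5, 5), ID3 = c(01901, 01902))

# Inside select, ID2 and ID3 stand for column positions 2 and 3,
# so ID2:ID3 expands to 2:3
b3 <- subset(A, select = ID2:ID3)
```

Because `subset()` relies on non-standard evaluation, it is mainly suited to interactive use; inside functions, the `match()` or `which()` approaches from the other answers are more predictable.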

score 0 · Answer 5 · answered Jun 22 '21 at 14:50

If we want to use dplyr:

# load dplyr for select() and the %>% pipe
library(dplyr)

# create data frame A
A <- data.frame(ID1 = c("1", "2"),
                ID2 = c("5", "5"),
                ID3 = c("01901", "01902"))

# print A
A

# get data frame B
B <- A %>% select(ID2:ID3)

# print B
B

Emma

