Get column index from label in a data frame

Question

Say we have the following data frame:

We can select column 'B' from its index:

> df[,2]
[1] 2 5 8

Is there a way to get the index (2) from the column label ('B')?

See @matthewdowle's answer here for the best solution: http://stackoverflow.com/a/9277935/636656 — Ari B. Friedman, Jul 14 '14 at 16:20

Henrik · Accepted Answer · 2010-12-13T09:47:26.453

136

You can get the index via grep and colnames:

grep("B", colnames(df))
[1] 2

or use

grep("^B$", colnames(df))
[1] 2

to get only the column named exactly "B", excluding names that merely contain a B, e.g. "ABC".
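A minimal sketch of the difference between the two patterns. The data frame here is hypothetical (it is not the one from the question) and includes a column named "ABC" so that the partial match is visible:

```r
# Hypothetical data frame with a column whose name merely contains "B"
df <- data.frame(A = 1:3, B = 4:6, ABC = 7:9)

partial <- grep("B", colnames(df))    # unanchored: matches "B" and "ABC"
exact   <- grep("^B$", colnames(df))  # anchored: matches only "B"

print(partial)
print(exact)
```

With a name like "ABC" present, the unanchored pattern returns more than one index, which is why the anchored form is the safer choice when a single column is wanted.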

edited Dec 13 '10 at 09:47

answered Dec 13 '10 at 09:35

Henrik


Your original example's advantages could be demonstrated in code if you showed its use in something like df[ , grep("^B", colnames(df)) ], i.e., returning the dataframe columns starting with "B". Feel free to use in a further edit if you agree. – IRTFM Dec 13 '10 at 14:56
Or even df[ , grep("^[BC]", colnames(df)) ], i.e., the columns that start with either B or C. – IRTFM Dec 13 '10 at 15:03
@Dwin: As @aix already said, the asker wants the *index*. But I also usually use `grep` the way you describe it. – Henrik Dec 13 '10 at 16:05
@Henrik. Thank you so much. This must be the single most useful command to work with dplyr and variables! – user989762 Feb 11 '16 at 09:39

NPE · Answer 2 · 2012-04-27T09:59:34.753

108

The following will do it:

which(colnames(df)=="B")
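As a quick check of how this behaves, here is a small sketch. The values in columns A and C are assumptions; only column B (2, 5, 8) appears in the question:

```r
# Values in A and C are assumed; only B = 2, 5, 8 is shown in the question
df <- data.frame(A = c(1, 4, 7), B = c(2, 5, 8), C = c(3, 6, 9))

idx <- which(colnames(df) == "B")          # exact string comparison
missing_idx <- which(colnames(df) == "Z")  # no column is called "Z"

print(idx)
print(length(missing_idx))
print(df[, idx])  # same values as df[, 2]
```

Because `==` compares whole strings, there is no partial matching to worry about; a name that does not exist simply gives a zero-length result.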

edited Apr 27 '12 at 09:59

answered Dec 13 '10 at 09:40

NPE


The problem with `grep` is also the advantage, namely that it uses regular expressions (so you can search for any pattern in your colnames). To just get the colnames "B" use `"^B$"` as the pattern in grep. ^ is the metacharacter for the beginning and $ for the end of a string. – Henrik Dec 13 '10 at 09:44
You don't even need `which`. You can directly use `df[names(df)=="B"]` – nico Dec 13 '10 at 09:56
@nico The question is to get the *index* of the column. – NPE Dec 13 '10 at 10:33
"Which" worked for me in every case. I couldn't get a column with the name "fBodyAcc-meanFreq()-Z" using grep. – Panos Kalatzantonakis Mar 07 '13 at 22:44
@Kabamaru: Grep will work as long as you escape the metacharacters. For the example you gave, this will work: `grep("^fBodyAcc-meanFreq\\(\\)-Z$",colnames(df))`. – Steve May 22 '13 at 04:51
@NPE Can you guide me how can I pass the name of a column to a variable. For example with following line of code I want to pass colname of column 4 to variable a: `a <- colnames(df[,4])` but it is not working. – Newbie Aug 04 '16 at 14:08
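Two points from the comments above can be tried out in a short sketch. The data frame and its column names are hypothetical, apart from "fBodyAcc-meanFreq()-Z", which is taken from the comment. `fixed = TRUE` is a standard `grep` argument that turns off regular expressions; it is offered here as an alternative to escaping, not as what the commenters used:

```r
# Hypothetical four-column data frame; the third column gets the awkward name
df <- data.frame(a = 1:2, b = 3:4, c = 5:6, d = 7:8)
names(df)[3] <- "fBodyAcc-meanFreq()-Z"

# Escape "(" and ")" so the regex treats them literally ...
escaped <- grep("^fBodyAcc-meanFreq\\(\\)-Z$", colnames(df))
# ... or switch off regular expressions altogether
literal <- grep("fBodyAcc-meanFreq()-Z", colnames(df), fixed = TRUE)

# df[, 4] drops to a plain vector, so colnames(df[, 4]) is NULL;
# index the vector of column names instead
a <- colnames(df)[4]

print(escaped)
print(literal)
print(a)
```

Note that `fixed = TRUE` still matches substrings, so it is exact only when no other column name contains the same text.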

score 8 · Answer 3 · answered Aug 24 '17 at 19:51

I wanted to see all the indices for the colnames because I needed to do a complicated column rearrangement, so I printed the colnames as a dataframe. The rownames are the indices.

as.data.frame(colnames(df))

1 A
2 B
3 C
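If the goal is to look indices up by name rather than read them off a printout, one possible variation is a named integer vector. This is a sketch, not part of the original answer, and the values in A and C are assumed:

```r
# Values in A and C are assumed; only B = 2, 5, 8 is shown in the question
df <- data.frame(A = c(1, 4, 7), B = c(2, 5, 8), C = c(3, 6, 9))

# Positions 1..ncol(df), labelled with the column names
idx <- setNames(seq_along(df), names(df))

print(idx)
print(idx[["B"]])
```

The same vector can then be reordered or subset by name when planning a column rearrangement.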

chimeric

A more concise way to do this is `cbind(names(df))`. – lillemets Feb 12 '18 at 11:35
@lillemets if brevity is your goal, `t(t(names(df)))` saves you 2 characters ;) – Gregor Thomas Oct 16 '21 at 00:28

Grant Shannon · Answer 4 · 2019-09-27T13:18:33.490

6

Following on from chimeric's answer above:

To get ALL the column indices in the df, I used:

which(!names(df)%in%c())

or, to store them (the result is an integer vector rather than a list):

indexLst<-which(!names(df)%in%c())
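The empty `c()` is where the generalization comes in: any names placed inside it are excluded from the result. A small sketch, with assumed values in A and C:

```r
# Values in A and C are assumed; only B = 2, 5, 8 is shown in the question
df <- data.frame(A = c(1, 4, 7), B = c(2, 5, 8), C = c(3, 6, 9))

all_idx  <- which(!names(df) %in% c())     # nothing excluded: every index
keep_idx <- which(!names(df) %in% c("B"))  # every index except that of "B"

print(all_idx)
print(keep_idx)
```

When nothing needs to be excluded, `seq_along(df)` gives the same indices more directly.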

edited Sep 27 '19 at 13:18

answered Jun 29 '18 at 08:52

Grant Shannon


i think this is the best answer because it can be generalized – Dimitrios Zacharatos Oct 16 '19 at 09:48

score 3 · Answer 5 · answered Jun 01 '18 at 20:53

This seems to be an efficient way to list vars with column number:

cbind(names(df))

Output:

     [,1]
[1,] "A" 
[2,] "B" 
[3,] "C"

Sometimes I like to copy variables with position into my code so I use this function:

varnums <- function(x) {
  # One row per column: the row name is "# <column name>", the value its position
  w <- data.frame(seq_along(x), row.names = paste0("# ", colnames(x)))
  names(w) <- "# Var/Pos"
  w
}
varnums(df)

Output:

# Var/Pos
# A         1
# B         2
# C         3

score 2 · Answer 6 · edited Mar 03 '20 at 12:36

match("B", names(df))

This also works if you have a vector of names.
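A short sketch of the vector case, including what happens for a name that is not a column. The values in A and C and the name "Z" are assumptions for illustration:

```r
# Values in A and C are assumed; only B = 2, 5, 8 is shown in the question
df <- data.frame(A = c(1, 4, 7), B = c(2, 5, 8), C = c(3, 6, 9))

# One result per requested name; "Z" is not a column, so its entry is NA
idx <- match(c("A", "C", "Z"), names(df))

print(idx)
```

The `NA` makes a missing column easy to detect with `anyNA(idx)` before the indices are used for subsetting.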

edited Mar 03 '20 at 12:36

Vesanen


answered Dec 09 '19 at 23:14

James Holland


BlueCrustacean5 · Answer 7 · 2021-10-16T00:16:34.763

1

To generalize @NPE's answer slightly:

which(colnames(dat) %in% var)

where var is of the form

c("colname1","colname2",...,"colnamen")

returns the indices of whichever column names one needs.
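One detail worth knowing is the order of the result: `which(... %in% ...)` reports indices in column order, whereas `match` follows the order of the requested names. A sketch with hypothetical data:

```r
# Hypothetical data frame and request; the names are chosen for illustration
dat <- data.frame(A = c(1, 4, 7), B = c(2, 5, 8), C = c(3, 6, 9))
var <- c("C", "A")

by_position <- which(colnames(dat) %in% var)  # follows the column order
by_request  <- match(var, colnames(dat))      # follows the order of var

print(by_position)
print(by_request)
```

Which one fits depends on whether the indices are meant to line up with `var` or with the data frame.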

edited Oct 16 '21 at 00:16

answered Oct 15 '21 at 23:54

BlueCrustacean5


score 0 · Answer 8 · answered Nov 28 '18 at 13:42

Use the t function:

t(colnames(df))

     [,1]   [,2]   [,3]   [,4]   [,5]   [,6]  
[1,] "var1" "var2" "var3" "var4" "var5" "var6"

neves

score 0 · Answer 9 · answered Jan 11 '21 at 20:48

Here is an answer that will generalize Henrik's answer.

df <- data.frame(A = rnorm(100), B = rnorm(100), C = rnorm(100))
numeric_columns <- c('A', 'B', 'C')
numeric_index <- sapply(seq_along(numeric_columns), function(i)
  grep(numeric_columns[i], colnames(df)))

Jimmy TwoCents

That `sapply` is a long way to write `match(numeric_columns, names(df))` --- unless you really need the regex power rather than exact string matching. – Gregor Thomas Oct 16 '21 at 00:31
Thanks @GregorThomas... not super familiar with match. In this case it is quite a bit shorter, but I like the sapply because it's a little more explicit about what is going on... to each their own, I guess (haven't benchmarked any performance differences). – Jimmy TwoCents Nov 17 '21 at 23:19

score 0 · Answer 10 · edited Apr 08 '22 at 23:10

I wanted the column index instead of the column name. This line of code worked for me:

which(data.frame(colnames(datE)) == colnames(datE[c(1:15)]), arr.ind = T)[,1]

with datE being a regular dataframe with 15 columns (variables):

data.frame(colnames(datE))
#>    colnames.datE.
#> 1              Ce
#> 2              Eu
#> 3              La
#> 4              Pr
#> 5              Nd
#> 6              Sm
#> 7              Gd
#> 8              Tb
#> 9              Dy
#> 10             Ho
#> 11             Er
#> 12              Y
#> 13             Tm
#> 14             Yb
#> 15             Lu

which(data.frame(colnames(datE))==colnames(datE[c(1:15)]),arr.ind=T)[,1]
#> [1]  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15
