Extract the column names for each row which meets a condition

Question

d <- structure(
  list(
    Cl = c(0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L), 
    SaCl = c(0, 1, 0, 0,0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0), 
    SiCl = c(0L,0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L,0L, 0L, 0L), 
    ClLo = c(0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L,0L, 0L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L), 
    SiClLo = c(0L, 0L, 0L,0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L), 
    SaClLo = c(1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 0, 1, 0, 1, 1, 1, 1, 1), 
    SaLo = c(0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L,0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L), 
    SaSiLo = c(0L, 0L,0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L), 
    SiLo = c(0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L,0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L), 
    LoSa = c(0L, 0L, 0L, 0L,0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L), 
    Sa = c(0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L,0L, 0L, 0L, 0L, 0L, 0L, 0L)
  ), 
  row.names = c(NA, 20L),
  class = "data.frame"
)

Each row contains exactly one 1. For each row, I want to extract the name of the column that holds the 1, so that my dataframe looks like this:

row.id | names
-------+-------
     1 | SaClLo
     2 | SaCl
     3 | SaClLo
     4 | SaClLo

I tried applying a function to each row:

apply(d, 1, function(x) colnames(x)[x == 1])

This is giving me NULL.

score 4 · Accepted Answer · answered May 22 '18 at 21:59

Use max.col to find the positions of the 1s and use this vector to select the respective column names.

data.frame(row.id = 1:nrow(d),
           names = names(d)[max.col(d)])
#   row.id  names
#1       1 SaClLo
#2       2   SaCl
#3       3 SaClLo
#4       4 SaClLo
#...
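
One caveat worth sketching: max.col breaks ties at random by default, and it always returns some column even when a row has no 1 at all. With exactly one 1 per row, as in the question, neither matters, but passing ties.method = "first" keeps the result reproducible if the data ever changes. The toy data frame below is a made-up stand-in for d, not the original data.

```r
# Toy stand-in for d (hypothetical data): exactly one 1 per row.
toy <- data.frame(Cl = c(0, 0, 1), SaCl = c(0, 1, 0), SaClLo = c(1, 0, 0))

# ties.method = "first" makes the pick deterministic if a row ever ties.
idx <- max.col(toy, ties.method = "first")

# Map the column positions back to column names.
result <- data.frame(row.id = seq_len(nrow(toy)), names = names(toy)[idx])
print(result)
```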

divibisan · Answer 2 · 2018-05-22T21:38:17.007

For each row, we find which column has a value of 1 and select the matching entry of colnames(d). Then we wrap the result in a data.frame.

data.frame(names = apply(d, 1, function(x) colnames(d)[which(x == 1)]))

    names
1  SaClLo
2    SaCl
3  SaClLo
4  SaClLo
...
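
As for why the attempt in the question returned NULL: apply hands each row to the function as a plain named vector, and colnames() of a vector is NULL. A minimal sketch on made-up toy data (not the original d) shows the difference between colnames(x) and names(x) inside the function.

```r
# Toy stand-in for d (hypothetical data): exactly one 1 per row.
toy <- data.frame(Cl = c(0, 0, 1), SaCl = c(0, 1, 0), SaClLo = c(1, 0, 0))

# Inside apply(), x is a named numeric vector, so colnames(x) is NULL for every row.
is_null <- apply(toy, 1, function(x) is.null(colnames(x)))

# names(x) does carry the column names, so this variant works.
res <- apply(toy, 1, function(x) names(x)[x == 1])
print(res)
```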

Optionally, you can run it through tibble::rownames_to_column() to turn the row names into a regular column.

library(magrittr)  # provides the %>% pipe

data.frame(names = apply(d, 1, function(x) colnames(d)[which(x == 1)])) %>%
    tibble::rownames_to_column()

   rowname  names
1        1 SaClLo
2        2   SaCl
3        3 SaClLo
4        4 SaClLo
...

score 1 · Answer 3 · answered May 22 '18 at 22:24

A little-known feature of which is your friend:

> which(d==1, arr.ind=TRUE)
   row col
2    2   2
11  11   2
15  15   2
13  13   4
...

The second column is the information you need:

> arr_indices <- which(d == 1, arr.ind = TRUE)
> colnames(d)[ arr_indices[, 2] ]
 [1] "SaCl"   "SaCl"   "SaCl"   "ClLo"   "SaClLo" "SaClLo" "SaClLo" "SaClLo"
 [9] "SaClLo" "SaClLo" "SaClLo" "SaClLo" "SaClLo" "SaClLo" "SaClLo" "SaClLo"
[17] "SaClLo" "SaClLo" "SaClLo" "SaClLo"

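And you can put this into a data frame or whatever. Note that which() walks the matrix column by column, so the names above come out in column order (rows 2, 11, 15, 13, ...), not row order; sort arr_indices by its row column first if you need one name per row in row order. I like this answer because it is relatively easy-to-read code.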
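
A minimal sketch of getting the which(..., arr.ind = TRUE) result into row order, again on a made-up toy data frame rather than the original d. The names hits and by_row are just illustrative.

```r
# Toy stand-in for d (hypothetical data): exactly one 1 per row.
toy <- data.frame(Cl = c(0, 0, 1), SaCl = c(0, 1, 0), SaClLo = c(1, 0, 0))

# Matrix of (row, col) positions of the 1s, listed column by column.
hits <- which(toy == 1, arr.ind = TRUE)

# Reorder the positions by row index so the names line up with the rows.
hits <- hits[order(hits[, "row"]), , drop = FALSE]

by_row <- data.frame(row.id = unname(hits[, "row"]),
                     names = colnames(toy)[hits[, "col"]])
print(by_row)
```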
