Use dplyr to add a new column of based on max row value?

Question

I've got a large database that has a series of columns with numerical. I would like to use dplyr to add a new column, mutate, which has as its values the names of the column that has the maximum value. So, for the example below

set.seed(123)
data_frame(
  bob = rnorm(10),
  sam = rnorm(10),
  dick = rnorm(10)
    )
# A tibble: 5 x 3
      bob    sam   dick
    <dbl>  <dbl>  <dbl>
1 -0.560   1.72   1.22 
2 -0.230   0.461  0.360
3  1.56   -1.27   0.401
4  0.0705 -0.687  0.111
5  0.129  -0.446 -0.556

the new column would be equal to c('sam', 'sam', 'bob', 'dick', 'bob') because they have the maximum values of the columns in the dataset. Any thought?

Related: [Get names of column with max value for each row](https://stackoverflow.com/questions/19087496/get-names-of-column-with-max-value-for-each-row) — Henrik, Sep 29 '18 at 19:54
[For each row return the column name of the largest value](https://stackoverflow.com/questions/17735859/for-each-row-return-the-column-name-of-the-largest-value) — Henrik, Sep 29 '18 at 20:00

score 6 · Accepted Answer · answered Sep 29 '18 at 19:41

6

This will work fine:

df$result = names(df)[apply(df, 1, which.max)]

answered Sep 29 '18 at 19:41

Gregor Thomas

136,190
20
167
294

1

Thanks! Perfect, for mutate I have `mutate(max = names(.)[apply(., 1, which.max)])` – elliot Sep 29 '18 at 19:46
3

Or `names(d)[max.col(d)]` – Henrik Sep 29 '18 at 19:51

Nettle · Answer 2 · 2018-10-01T00:10:28.297

More verbose, but tidyverse-friendly:

df %>% 
  #tidying
    mutate(id = row_number()) %>% 
    gather(name, amount, -id) %>% 
    group_by(id) %>%  arrange(id, desc(amount)) %>% 
  #workhorse
    mutate(top.value = head(name, 1) ) %>% 
  #Pivot
    spread(name, amount)

   # A tibble: 10 x 5
# Groups:   id [10]
      id top.value     bob   dick    sam
   <int> <chr>       <dbl>  <dbl>  <dbl>
 1     1 sam       -0.560  -1.07   1.22 
 2     2 sam       -0.230  -0.218  0.360
 3     3 bob        1.56   -1.03   0.401
 4     4 sam        0.0705 -0.729  0.111
 5     5 bob        0.129  -0.625 -0.556
 6     6 sam        1.72   -1.69   1.79 
 7     7 dick       0.461   0.838  0.498
 8     8 dick      -1.27    0.153 -1.97 
 9     9 sam       -0.687  -1.14   0.701
10    10 dick      -0.446   1.25  -0.473

If you don't feel like using tidy data, try:

df %>% 
  mutate(max.name = names(.)[max.col(.)]  )

Where's the love people? I don't think this deserves a negative. If so, why? — Nettle, Oct 09 '18 at 18:12

PavoDive · Answer 3 · 2018-10-09T18:39:09.087

2

a data.table version for those that will land in this question looking for a data.table alternative:

require(data.table)
setDT(df)
df[, m := names(df)[apply(.SD, 1, which.max)]]

edited Oct 09 '18 at 18:39

answered Oct 01 '18 at 04:16

PavoDive

6,322
2
29
55

Use dplyr to add a new column of based on max row value?

3 Answers3

Linked