How to get the largest column value from each row using dplyr

Question

Given the following data:

df <- data.frame(
  a = c(1,2,3,5),
  b = c(7,9,52,4),
  c = c(53, 11,22,1),
  d = c("something","string","another", "here")
)

Which looks like:

  a  b  c         d
1 1  7 53 something
2 2  9 11    string
3 3 52 22   another
4 5  4  1      here

I would like to create a column "max" using dplyr, where max is the name of the column that holds the largest value in each row.

So for the above I would have:

  a  b  c         d  max
1 1  7 53 something   c
2 2  9 11    string   c
3 3 52 22   another   b
4 5  4  1      here   a

Thanks, Still the last value for `max` is not matching as `b` — akrun, Dec 06 '19 at 23:52

akrun · Accepted Answer · 2019-12-07T00:02:14.657

We can use max.col to find the column index of the maximum value in each row, use that index to get the column name, and assign it as the 'max' column:

df['max'] <- names(df)[1:3][max.col(df[1:3], "first")]
df
#  a  b  c         d max
#1 1  7 53 something   c
#2 2  9 11    string   c
#3 3 52 22   another   b
#4 5  4  1      here   a
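
The "first" argument in the call above is the ties.method of max.col, whose default is "random". A minimal sketch of how it behaves when a row has tied maxima, using a made-up data frame (`tied` is hypothetical and not part of the question):

```r
# Hypothetical data with ties, chosen only to show how ties.method behaves.
tied <- data.frame(a = c(1, 5), b = c(7, 5), c = c(7, 1))

# "first" picks the leftmost maximum in each row, "last" the rightmost.
idx_first <- max.col(tied, ties.method = "first")
idx_last  <- max.col(tied, ties.method = "last")

first_names <- names(tied)[idx_first]  # "b" "a"
last_names  <- names(tied)[idx_last]   # "c" "b"
print(first_names)
print(last_names)
```

Pinning ties.method to "first" or "last" keeps the result reproducible; with the default "random", rows with ties can give different answers from run to run.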

With the tidyverse, another approach is to reshape the data into 'long' format and then find the maximum within each original row:

library(dplyr)
library(tidyr)
df %>%
   mutate(ind = row_number()) %>%
   select(-d) %>%
   pivot_longer(cols = a:c) %>%
   group_by(ind) %>%
   slice(which.max(value)) %>%
   select(-value) %>%
   pull(name) %>%
   mutate(df, max = .)

Or with pmap

library(purrr)
df %>% 
   mutate(max = pmap_chr(select(., a:c), ~ c(...) %>% 
                                   which.max %>% 
                                   names ))
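
To make the lambda in the pmap version less opaque: for each row, `c(...)` collects the values of a, b and c into a named vector, which.max returns the position of the largest one (keeping its name), and names extracts that name. The sketch below reproduces the idea in base R, with mapply standing in for pmap_chr; that substitution is an assumption made for illustration, not part of the answer.

```r
df <- data.frame(a = c(1, 2, 3, 5), b = c(7, 9, 52, 4), c = c(53, 11, 22, 1))

# For the first row, c(...) amounts to a named numeric vector like this one.
row1 <- c(a = 1, b = 7, c = 53)
row1_max <- names(which.max(row1))  # "c"

# mapply() walks the columns in parallel, one row at a time; the argument
# names a, b, c become the names of c(...) inside the function.
max_name <- mapply(function(...) names(which.max(c(...))),
                   a = df$a, b = df$b, c = df$c)
print(max_name)  # "c" "c" "b" "a"
```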

thank you for the different approaches, purrr looks interesting, if a little confusing (I have never used it). It seems that the other solution using dplyr ([here](https://stackoverflow.com/a/59221774/3130747)) is a fair bit shorter, I'm just curious as to whether there were any particular reasons for the two approaches. Is one considered more "dplyr-ish" than the other? — baxx, Dec 07 '19 at 01:06

score 2 · Answer 2 · answered Dec 06 '19 at 23:50


apply(df, 2, max)  # assuming your data frame is named df

Jorge Lopez

Now, if you want the largest value by row (excluding the last column), use apply(df[1:(ncol(df) - 1)], 1, max) – Jorge Lopez Dec 06 '19 at 23:53
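
The row-wise apply call returns the largest value itself, while the question asks for the name of the column that holds it. A small sketch of how the same approach could be extended, assuming the numeric columns are everything except the last one (`num_cols`, `row_max` and `row_max_name` are names introduced here for illustration):

```r
df <- data.frame(
  a = c(1, 2, 3, 5),
  b = c(7, 9, 52, 4),
  c = c(53, 11, 22, 1),
  d = c("something", "string", "another", "here")
)

# Drop the last column with negative indexing. This sidesteps a precedence
# trap: in R, `:` binds tighter than `-`, so 1:n - 1 means (1:n) - 1.
num_cols <- df[-ncol(df)]

# Largest value in each row.
row_max <- apply(num_cols, 1, max)

# Name of the column holding that value: which.max gives the position.
row_max_name <- names(num_cols)[apply(num_cols, 1, which.max)]
print(row_max_name)  # "c" "c" "b" "a"
```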

score 1 · Answer 3 · answered Dec 06 '19 at 23:55

library(dplyr)

df %>%
    group_by(ind = row_number()) %>%
    mutate(max = c("a", "b", "c")[which.max(c(a, b, c))]) %>%
    ungroup() %>%
    select(-ind)
# A tibble: 4 x 5
#      a     b     c d         max  
#  <dbl> <dbl> <dbl> <fct>     <chr>
#1     1     7    53 something c    
#2     2     9    11 string    c    
#3     3    52    22 another   b    
#4     5     4     1 here      a    
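
Grouping by row_number() is one way to make mutate work one row at a time. Assuming dplyr 1.0.0 or later is installed (an assumption, since the thread predates that release), rowwise() with c_across() expresses the same idea without the helper column. A sketch:

```r
library(dplyr)

df <- data.frame(
  a = c(1, 2, 3, 5),
  b = c(7, 9, 52, 4),
  c = c(53, 11, 22, 1),
  d = c("something", "string", "another", "here")
)

result <- df %>%
  rowwise() %>%                      # treat each row as its own group
  mutate(max = c("a", "b", "c")[which.max(c_across(a:c))]) %>%
  ungroup()                          # drop the row-wise grouping again

print(result$max)  # "c" "c" "b" "a"
```

Like the grouped version, this evaluates which.max once per row, so on large data the vectorised max.col approach is likely to be faster.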
