Keep most frequent value per group

Question

I need to summarise some data.frame containing both numeric and character values. I need to sum the numerical features and to keep the most frequent character value per group at the same time. Ideally, it should be with dplyr but that's not mandatory.

Here is a reproducible example :

have <- data.frame("id" = c(1,1,1,2,2,2),"a" = c("dog","dog","cat","duck","duck","horse"),"value" = c(1,2,3,4,5,6))
need <- data.frame("id" = c(1,2), "a" = c("dog","duck"), "value" = c(6,15))

Any idea ?

While @Quiten's answer is totally valid and probably the way to go here, you might wanna check this question: https://stackoverflow.com/q/2547402/13460602 where they discuss how to find the statistical mode (i.e. the most frequent value). — Maël, Mar 10 '23 at 09:54

score 4 · Answer 1 · answered Mar 10 '23 at 09:59

Another option is to use collapse::fmode which is probably the fastest way of finding the statistical mode:

library(dplyr) 
have %>%
  group_by(id) %>%
  summarise(a = collapse::fmode(a),
            value = sum(value)) %>%
  ungroup()

#> # A tibble: 2 × 3
#>      id a     value
#>   <dbl> <chr> <dbl>
#> 1     1 dog       6
#> 2     2 duck     15

score 3 · Accepted Answer · answered Mar 10 '23 at 09:47

You could group_by and sum for total value and use table with which.max and names to select the character values that frequent the most like this:

library(dplyr)
have %>%
  group_by(id) %>%
  summarise(a = names(which.max(table(a))),
            value = sum(value))
#> # A tibble: 2 × 3
#>      id a     value
#>   <dbl> <chr> <dbl>
#> 1     1 dog       6
#> 2     2 duck     15

^{Created on 2023-03-10 with reprex v2.0.2}

score 3 · Answer 3 · answered Mar 10 '23 at 09:55

Here we first count and then apply the max a to all other and finally summarise:

library(dplyr) # dplyr >= 1.1.0

have %>% 
  add_count(id, a) %>% 
  mutate(a = first(a), .by=id) %>% 
  summarise(value = sum(value), .by=c(id, a))
 
  id    a value
1  1  dog     6
2  2 duck    15

Keep most frequent value per group

3 Answers3