I have a data frame with an id and a yes/no decision, for example:

a <- data.frame(id = c(1,1,1,1,1,2,2,2,3,3,3,3), val = c("y","y","y","n","n","n","n","y","n","n","y","y"))

I want to output the most frequent value for each unique id. For this data frame, the result would be

data.frame(id = c(1,2,3), val = c("y","n","y"))

or, since "y" and "n" are tied for id 3,

data.frame(id = c(1,2,3), val = c("y","n","n"))

Any help is appreciated. Thanks in advance!

camille
  • [This](https://stackoverflow.com/q/29255473/5325862) and its linked posts might help you get started – camille Mar 09 '20 at 18:10
  • Also see [here](https://stackoverflow.com/q/32684931/5325862) and [here](https://stackoverflow.com/q/37944044/5325862) – camille Mar 09 '20 at 18:17
  • Does this answer your question? [Most frequent value (mode) by group](https://stackoverflow.com/questions/29255473/most-frequent-value-mode-by-group) – AMC Mar 09 '20 at 18:52

2 Answers

We can group by 'id' and apply a Mode function to 'val':

library(dplyr)

# Most frequent value of x (the statistical mode)
Mode <- function(x) {
  ux <- unique(x)
  ux[which.max(tabulate(match(x, ux)))]
}

a %>%
  group_by(id) %>%
  summarise(val = Mode(val))
# A tibble: 3 x 2
#     id val  
#  <dbl> <fct>
#1     1 y    
#2     2 n    
#3     3 n    

Or, using only base R:

aggregate(val ~ id, a, Mode)
#   id val
#1  1   y
#2  2   n
#3  3   n
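
Both results above give "n" for id 3, where "y" and "n" are tied. As a minimal sketch of why (base R only; the sample data is rebuilt with `stringsAsFactors = FALSE` so it behaves the same across R versions, and `by_id` is just an illustrative name): `which.max` returns the position of the first maximum, so `Mode` breaks ties in favour of the value that appears first in the group.

```r
# Same Mode function as above
Mode <- function(x) {
  ux <- unique(x)
  ux[which.max(tabulate(match(x, ux)))]
}

# Ties go to whichever value is seen first
Mode(c("n", "n", "y", "y"))  # "n"
Mode(c("y", "y", "n", "n"))  # "y"

# Sample data from the question, kept as character rather than factor
a <- data.frame(
  id  = c(1, 1, 1, 1, 1, 2, 2, 2, 3, 3, 3, 3),
  val = c("y", "y", "y", "n", "n", "n", "n", "y", "n", "n", "y", "y"),
  stringsAsFactors = FALSE
)

# Per-id mode as a named vector (names are the ids)
by_id <- tapply(a$val, a$id, Mode)
```

If a different tie-breaking rule is needed, the group would have to be reordered before calling `Mode`, or the ties handled explicitly.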
akrun

Another option is to count each (id, val) pair and keep the most frequent one per id:

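library(dplyr)

# `a` is the data frame from the question
a %>%
  count(id, val) %>%   # one row per (id, val) pair, with its count in n
  group_by(id) %>%
  top_n(1, n) %>%      # keep the row(s) with the highest count per id
  select(-n)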
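
Note that `top_n()` keeps every row tied for the top count, so an id with a tie (id 3 in the sample data) comes back with two rows. A hedged variant, assuming dplyr 1.0.0 or later where `slice_max()` is available, asks for exactly one row per id with `with_ties = FALSE` (`res` is just an illustrative name):

```r
library(dplyr)

# Sample data from the question, kept as character rather than factor
a <- data.frame(
  id  = c(1, 1, 1, 1, 1, 2, 2, 2, 3, 3, 3, 3),
  val = c("y", "y", "y", "n", "n", "n", "n", "y", "n", "n", "y", "y"),
  stringsAsFactors = FALSE
)

res <- a %>%
  count(id, val) %>%                           # count of each (id, val) pair
  group_by(id) %>%
  slice_max(n, n = 1, with_ties = FALSE) %>%   # exactly one top row per id
  ungroup() %>%
  select(-n)

res
```

Which of the tied values survives for id 3 depends on row order after `count()`, so sort explicitly beforehand if a specific tie-break matters.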
Yuriy Saraykin