I have a dataset in R with an ID column and an Occurrences column:

ID    Occurrences
1001   A
1001   A
1001   B
1002   C
1002   A
1002   C

I would like one row per unique ID, with the most frequent Occurrences value (the mode) for that ID, like this:

ID     Occurrences
1001   A
1002   C

How can I do this in R? I tried using table(), but I could not get the result I want.

user213544
  • Why isn't B a part of your answer for 1001? And A part of 1002? Do you just want the first example of each ID? – G5W Feb 10 '19 at 18:28
  • Your question seems to be very similar to [this one](https://stackoverflow.com/questions/32684931/how-to-aggregate-data-in-r-with-mode-most-common-value-for-each-row). Maybe the accepted answer there will help you? – camerondm9 Feb 10 '19 at 19:39

3 Answers

After grouping by 'ID', get the mode of 'Occurrences':

library(dplyr)
df1 %>%
   group_by(ID) %>%
   summarise(Occurrences = Mode(Occurrences))
# A tibble: 2 x 2
#    ID Occurrences
#  <int> <chr>      
#1  1001 A          
#2  1002 C      

where Mode is

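Mode <- function(x) {
   ux <- unique(x)                        # distinct values, in order of first appearance
   ux[which.max(tabulate(match(x, ux)))]  # value with the highest count; ties go to the first seen
}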
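As a quick illustration of how this helper treats ties, here is a minimal sketch in base R. The input vectors are made up for the example and are not from the question. Because which.max returns the position of the first maximum, a tie should go to whichever value appears first in x.

```r
Mode <- function(x) {
  ux <- unique(x)
  ux[which.max(tabulate(match(x, ux)))]
}

# Hypothetical inputs, not taken from the question
clear_winner <- Mode(c("A", "A", "B"))       # "A" occurs twice, "B" once
tie_a_first  <- Mode(c("A", "B", "B", "A"))  # two each; "A" appears first
tie_b_first  <- Mode(c("B", "A", "A", "B"))  # two each; "B" appears first

print(c(clear_winner, tie_a_first, tie_b_first))
```

If ties need a different rule (for example, returning every tied value), the function would have to be adapted.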

Or using base R

aggregate(Occurrences ~ ID, df1, FUN = Mode)

data

df1 <- structure(list(ID = c(1001L, 1001L, 1001L, 1002L, 1002L, 1002L
 ), Occurrences = c("A", "A", "B", "C", "A", "C")),
 class = "data.frame", row.names = c(NA, -6L))
akrun

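A base R answer without any extra functions or packages. Note that it keeps the rows where both the ID and the Occurrences value appear for the first time; this reproduces the expected output for the sample data, but it does not compute a per-ID mode in general.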

df[!duplicated(df$ID) & !duplicated(df$Occurrences),]
    ID Occurrences
1 1001           A
4 1002           C
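To see how far this filter generalizes, the sketch below runs the same expression on a small made-up data frame (hypothetical values, not the question's data) in which "A" is the most frequent value for both IDs. The expression only tests whether an ID or an Occurrences value has been seen before anywhere in the column, so it is not counting within groups.

```r
# Hypothetical data: the per-ID mode is "A" for both 1001 and 1002
df <- data.frame(
  ID = c(1001L, 1001L, 1001L, 1002L, 1002L, 1002L),
  Occurrences = c("B", "A", "A", "A", "A", "C"),
  stringsAsFactors = FALSE
)

# Keeps a row only if its ID and its Occurrences value are both first appearances
res <- df[!duplicated(df$ID) & !duplicated(df$Occurrences), ]
print(res)
```

On this input only the first row (1001, B) survives: "B" is not the mode for 1001, and 1002 disappears because its first row repeats a value already seen. A group-wise approach such as aggregate with a mode function avoids this.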
Ben373

Using base R aggregate

aggregate(df1, by = list(df1$ID), FUN = function(x) names(sort(-table(x)))[1])[, names(df1)]
    ID Occurrences
1 1001           A
2 1002           C
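One detail worth checking: the function is applied to every column, ID included, and names() returns character strings, so the ID column in the result appears to come back as character rather than integer. The sketch below compares that call with the formula interface, where ID is only used for grouping. The helper name most_common is an assumption made for this example.

```r
df1 <- data.frame(
  ID = c(1001L, 1001L, 1001L, 1002L, 1002L, 1002L),
  Occurrences = c("A", "A", "B", "C", "A", "C"),
  stringsAsFactors = FALSE
)

# Hypothetical helper name wrapping the expression used above
most_common <- function(x) names(sort(-table(x)))[1]

# Function applied to all columns, as in the call above
res_all <- aggregate(df1, by = list(df1$ID), FUN = most_common)[, names(df1)]

# Formula interface: only Occurrences is summarised, ID is the grouping variable
res_formula <- aggregate(Occurrences ~ ID, df1, FUN = most_common)

str(res_all)
str(res_formula)
```

If the ID type matters downstream, the formula variant (or an as.integer conversion afterwards) may be the safer choice.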
BENY