I am trying to calculate the mode of exam grades for each combination of ethnicity and qualification taken, which I also raised in a previous question. The problem is that my mode function is not behaving as expected: the new column produced by the summarise function, called "mode", simply repeats the OutGrade column. I want one mode for each combination of Ethnicity and Qualification Title. My mode function and a snippet of my R code are below.

I have tried various mode functions and changed the order of my code, but with no success.

Mode <- function(x) {
  uni <- unique(x)
  uni[which.max(tabulate(match(x, uni)))]
}

# Ethnic data
eth.data2 <- data.comb %>%
  group_by(Ethnicity, `Qualification Title`, OutGrade) %>%
  summarise(n = n(), mode = Mode(OutGrade))
333
    Could you add sample data and expected output? Use `dput` for the data. See [this question](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) for further guidance. – NelsonGon Aug 05 '19 at 14:51

1 Answer


The issue is most likely that plyr was also loaded, so its summarise masks the one from dplyr and does not respect the grouping. To avoid the masking, either run the code in a fresh session with only dplyr loaded, or use :: to call the function from dplyr explicitly:

data.comb%>%
   group_by(Ethnicity, `Qualification Title`, OutGrade)%>%
   dplyr::summarise(n=n(), mode=Mode(OutGrade))
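
One way to confirm the masking is to check which package an unqualified name resolves to in the current session. The following is a minimal sketch; `which_package` is a hypothetical helper introduced here for illustration, not a function from dplyr or plyr.

```r
# Report which package an unqualified function name currently resolves to.
# `which_package` is a hypothetical helper, not part of dplyr or plyr.
which_package <- function(fname) {
  f <- get(fname, mode = "function")
  environmentName(environment(f))
}

which_package("tabulate")  # a base R function, so this reports "base"

# With dplyr attached and nothing masking it, this should report "dplyr";
# "plyr" would mean plyr's summarise is the one being called.
if (requireNamespace("dplyr", quietly = TRUE)) {
  library(dplyr)
  print(which_package("summarise"))
}
```

If the result is "plyr", qualifying the call as `dplyr::summarise` sidesteps the conflict without restarting the session.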
akrun
  • Great, thanks for this; it now works. A follow-up question: how do I get R to give me more than one mode when two or more values tie for the highest frequency? At the moment R appears to print just one of the modes and ignore the others, even when they have equal frequency – 333 Aug 07 '19 at 08:57
  • @awz1. The `which.max` in `Mode` returns the first index when there are ties for the maximum. You can change it to `tbl <- tabulate(match(x, uni)); uni[tbl == max(tbl)]` and then, in `summarise`, wrap the call in a `list`, i.e. `mode = list(Mode(OutGrade))`, because `summarise` returns only a length-1 value for each group – akrun Aug 07 '19 at 13:06
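
The tie-handling change described in the last comment can be sketched as follows. `ModeAll` is a hypothetical name, and the toy data frame is invented purely for illustration; only the column names mirror the question. The sketch groups by Ethnicity alone, since a group that also includes OutGrade contains a single grade by construction.

```r
# Return every value tied for the highest frequency.
# `ModeAll` is a hypothetical name for the modified function.
ModeAll <- function(x) {
  uni <- unique(x)
  tbl <- tabulate(match(x, uni))
  uni[tbl == max(tbl)]
}

# Toy data, invented for illustration only.
toy <- data.frame(
  Ethnicity = c("A", "A", "A", "A", "B", "B", "B"),
  OutGrade  = c("C", "C", "D", "D", "A", "B", "B"),
  stringsAsFactors = FALSE
)

# Wrapping the result in list() gives one list element per group,
# so a group with tied modes can hold more than one value.
if (requireNamespace("dplyr", quietly = TRUE)) {
  res <- dplyr::summarise(
    dplyr::group_by(toy, Ethnicity),
    n = dplyr::n(),
    mode = list(ModeAll(OutGrade))
  )
  print(res$mode)
}
```

In this toy data, group "A" has two grades tied at two occurrences each, so its list element holds both, while group "B" holds a single mode.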