I have some code written using the dplyr package, and I want to calculate the mode. Currently the result has a column that says "character" all the way down. The mode should be the most frequently occurring value, which in my case could be a letter, a number, or a symbol.

eth.data<-data.comb %>%
  group_by(Ethnicity, `Qualification Title`, `Qualification Number`, `OutGrade`)%>%
   summarise(`Number of Learners`=n(), `Mode` = mode(`OutGrade`)) %>%
  group_by(`Qualification Number`)%>%
  mutate(`Total Number of Learners`= sum(`Number of Learners`)) %>%
  arrange(`Total Number of Learners`)
Andrew Marshall
  • Welcome to Stack Overflow. Data is required to make your problem reproducible, so that people can analyze it and check their answers against it. Could you provide some data and the expected result? – Carles Jul 25 '19 at 12:06
  • Take a look at `?mode`. `mode` tells you the storage mode of an object (e.g. "character" for character vectors). If you want the statistical mode, write your own function, see [this question](https://stackoverflow.com/questions/17374651/finding-the-most-common-elements-in-a-vector-in-r). Also, if you `group_by` OutGrade, then you will have precisely 1 unique OutGrade in the `summarise` function, so don't do that. – January Jul 25 '19 at 12:09

1 Answer

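Take a look at ?mode. mode tells you the storage mode of an object (e.g. "character" for character vectors), not its most frequent value. If you want the statistical mode, write your own function; see this question: https://stackoverflow.com/questions/17374651/finding-the-most-common-elements-in-a-vector-in-r

Also, if you group_by OutGrade, then every group contains exactly one unique OutGrade value by the time summarise runs, so leave OutGrade out of group_by.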
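
To make the difference concrete, here is a minimal sketch in base R. The vector `grades` is invented for illustration and is not taken from the question's data.

```r
# Invented example vector, not from the question's data
grades <- c("A", "B", "B", "C")

# Base R's mode() reports how the object is stored, not its most common value
storage <- mode(grades)                        # "character"

# The statistical mode: tabulate the values and pick the largest count
most_common <- names(which.max(table(grades))) # "B"

print(storage)
print(most_common)
```

This is why the `Mode` column in the question reads "character" on every row: `OutGrade` is a character column, and `mode` describes its storage type.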

Let us set up an example (which you should do when you are asking a question!).

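library(dplyr)

# Example data: 5 groups of 20 randomly drawn grades
df <- data.frame(group = rep(LETTERS[1:5], each = 20),
                 grade = sample(letters[1:15], 100, replace = TRUE))

# Statistical mode: the most frequent value (the first one if there is a tie)
mymode <- function(x) {
  counts <- table(x)
  names(counts)[which.max(counts)]
}

df %>% group_by(group) %>% summarise(mode = mymode(grade))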

The result is what you want (the exact letters will differ between runs, because sample draws the grades at random):

# A tibble: 5 x 2
  group mode 
  <chr> <chr>
1 A     l    
2 B     f    
3 C     g    
4 D     g    
5 E     c  
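
One caveat: `which.max` returns only the first maximum, so when two grades are tied for the highest count, only the one that sorts first is reported. If ties matter, a variant along these lines could return all of them. This is a sketch only; `all_modes` is a hypothetical helper name, not part of dplyr or base R.

```r
# Hypothetical helper: return every value tied for the highest count
all_modes <- function(x) {
  counts <- table(x)
  names(counts)[counts == max(counts)]
}

tied <- all_modes(c("a", "b", "b", "a", "c"))
print(tied)  # "a" "b"
```

Because this can return more than one value, it does not drop straight into `summarise` as a single-value column; collapsing the result with `paste(..., collapse = ", ")` would be one way around that.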

Note that if you did group_by(group, grade), the summarise function would be called once for each combination of group and grade. Each of those subsets contains a single grade value, so the "mode" is trivially that grade and the results would have been very different:

# A tibble: 55 x 3
# Groups:   group [5]
   group grade mode 
   <chr> <chr> <chr>
 1 A     a     a    
 2 A     b     b    
 3 A     f     f    
 4 A     h     h    
 5 A     i     i    
 6 A     k     k    
 7 A     l     l    
 8 A     m     m    
 9 A     n     n    
10 B     a     a    
# … with 45 more rows
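
Applied to the pipeline in the question, the change would be to drop `OutGrade` from the first `group_by` and to call a mode helper instead of `mode`. The following is a sketch under some assumptions: the toy `data.comb` is invented, the column names are copied from the question, `stat_mode` is a hypothetical helper name, and the `.groups` argument requires dplyr 1.0.0 or later. It also assumes that one row per ethnicity and qualification is what is wanted, which changes the meaning of `Number of Learners` compared with the original code.

```r
library(dplyr)

# Invented stand-in for data.comb; only the column names come from the question
data.comb <- data.frame(
  Ethnicity = c("X", "X", "X", "Y", "Y"),
  `Qualification Title` = "Maths",
  `Qualification Number` = "Q1",
  OutGrade = c("A", "A", "B", "C", "C"),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Hypothetical helper: most frequent value (first one on ties)
stat_mode <- function(x) {
  counts <- table(x)
  names(counts)[which.max(counts)]
}

eth.data <- data.comb %>%
  # OutGrade is deliberately NOT a grouping variable here
  group_by(Ethnicity, `Qualification Title`, `Qualification Number`) %>%
  summarise(`Number of Learners` = n(),
            Mode = stat_mode(OutGrade),
            .groups = "drop") %>%
  group_by(`Qualification Number`) %>%
  mutate(`Total Number of Learners` = sum(`Number of Learners`)) %>%
  arrange(`Total Number of Learners`)

print(eth.data)
```

With the toy data, ethnicity X gets mode "A" from 3 learners and Y gets mode "C" from 2 learners, with a total of 5 for the qualification.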
January