Finding a single mode for the data

Question

I want to find a single mode for each column of the data set below:

  Name             Food Decor Service Price
107 West            16   13       16    26
2nd Street cafe     14   13       15    21
44 & Hell's kitchen 22   19       19    42
55 wall             21   22       21    54
55 wall street      21   22       21    54
92 sub              15   15       15    43
Angelica kitchen    20   14       15    22
Angelo's            21   11       14    22
Avenue              18   14       14    36

I am trying to use the which.max function but am unable to get the desired output. Could you please help?

For example, the mode for Food would be 21.

Thanks

What is the desired output? Not clear what you want to calculate. — neilfws, Sep 26 '17 at 00:11
the output for price should be 21. The data presented here is a part of a larger data — Rikin, Sep 26 '17 at 00:22
I'm still struggling to see from the data how 21 is anything to do with `mode` or `max` for `Price`. — neilfws, Sep 26 '17 at 00:25

neilfws · Answer 1 · 2017-09-26T01:07:12.183

Here's your data in reproducible format, with the duplicate row (55 wall street) omitted:

data1 <- structure(list(Name = c("107 West", "2nd Street cafe", "44 & Hell's kitchen", 
                                 "55 wall street", "92 sub", "Angelica kitchen", 
                                 "Angelos", "Avenue"), 
                        Food = c(16L, 14L, 22L, 21L, 15L, 20L, 21L, 18L), 
                        Decor = c(13L, 13L, 19L, 22L, 15L, 14L, 11L, 14L), 
                        Service = c(16L, 15L, 19L, 21L, 15L, 15L, 14L, 14L), 
                        Price = c(26L, 21L, 42L, 54L, 43L, 22L, 22L, 36L)), 
                        .Names = c("Name", "Food", "Decor", "Service", "Price"), 
                        class = "data.frame", row.names = c(NA, -8L))

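We can use tidyr::gather to reshape the data to long format, then dplyr to count how often each value occurs in each column, and filter for the values with the highest count. Ties are kept, so a column can return more than one mode.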

library(dplyr)
library(tidyr)

data1 %>% 
  gather(key, value, -Name) %>% 
  group_by(key) %>% 
  count(value) %>% 
  filter(n == max(n)) %>%
  ungroup()

      key value     n
    <chr> <int> <int>
1   Decor    13     2
2   Decor    14     2
3    Food    21     2
4   Price    22     2
5 Service    15     3
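
In tidyr 1.0.0 and later, gather() is superseded by pivot_longer(). The following is a minimal sketch of the same pipeline under that assumption. The names modes_long and single_modes are illustrative, and data1 is rebuilt here only so the snippet runs on its own.

```r
library(dplyr)
library(tidyr)

# Same eight rows as data1 above, rebuilt so the snippet is self-contained
data1 <- data.frame(
  Name    = c("107 West", "2nd Street cafe", "44 & Hell's kitchen", "55 wall street",
              "92 sub", "Angelica kitchen", "Angelos", "Avenue"),
  Food    = c(16L, 14L, 22L, 21L, 15L, 20L, 21L, 18L),
  Decor   = c(13L, 13L, 19L, 22L, 15L, 14L, 11L, 14L),
  Service = c(16L, 15L, 19L, 21L, 15L, 15L, 14L, 14L),
  Price   = c(26L, 21L, 42L, 54L, 43L, 22L, 22L, 36L)
)

# Long format, count each (column, value) pair, keep the most frequent per column
modes_long <- data1 %>%
  pivot_longer(-Name, names_to = "key", values_to = "value") %>%
  count(key, value) %>%
  group_by(key) %>%
  filter(n == max(n)) %>%
  ungroup()

# If exactly one mode per column is wanted, keep the first row of each group
# (count() sorts by value, so ties resolve to the smallest value)
single_modes <- modes_long %>%
  group_by(key) %>%
  slice(1) %>%
  ungroup()
```

modes_long keeps both tied values for Decor, while single_modes has one row per column.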

If you insist on ugly base R solutions, here's one:

apply(data1[, 2:5], 2, function(x) names(table(x))[which(table(x) == max(table(x)))])

$Food
[1] "21"

$Decor
[1] "13" "14"

$Service
[1] "15"

$Price
[1] "22"
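
A possibly tidier base R variant, offered as a sketch rather than the definitive approach: lapply() iterates over the columns of a data frame directly, whereas apply() first coerces it to a matrix, and the table only needs to be built once per column. The names scores and all_modes are hypothetical. scores holds just the four numeric columns of data1.

```r
# The four numeric columns of data1, rebuilt so the snippet runs on its own
scores <- data.frame(
  Food    = c(16L, 14L, 22L, 21L, 15L, 20L, 21L, 18L),
  Decor   = c(13L, 13L, 19L, 22L, 15L, 14L, 11L, 14L),
  Service = c(16L, 15L, 19L, 21L, 15L, 15L, 14L, 14L),
  Price   = c(26L, 21L, 42L, 54L, 43L, 22L, 22L, 36L)
)

# Return every value that attains the highest count (all ties are kept)
all_modes <- function(x) {
  counts <- table(x)                                  # tabulate once
  as.numeric(names(counts)[counts == max(counts)])    # back to numeric
}

modes <- lapply(scores, all_modes)
modes$Decor
# [1] 13 14
```

The result is a named list, so columns with ties (here Decor) can hold more than one value.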

You could. But why would you go ugly when you could go elegant :) I'll add a base R solution. — neilfws, Sep 26 '17 at 01:05
It doesn't mean you have to make the base solution as bad as possible. `apply(data, 2, ...` is just `lapply` — thelatemail, Sep 26 '17 at 03:55

score 1 · Accepted Answer · answered Sep 26 '17 at 01:07

You can get this without extra libraries. For each variable, you can make a table of the values and apply which.max to find which value occurs most frequently. In the event of a tie, I take the first one; because table sorts its values, that is the smallest of the tied values.

Using the data as provided by @neilfws:

as.numeric(sapply(data1[,2:5], function(x) names(which.max(table(x)))[1]))
[1] 21 13 15 22

It might be nice to label these:

Modes = as.numeric(sapply(data1[,2:5], 
    function(x) names(which.max(table(x)))[1]))
names(Modes) = colnames(data1[2:5])
Modes
   Food   Decor Service   Price 
     21      13      15      22
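
The labelling step can arguably be folded into the call itself. sapply() and vapply() keep the column names, and it is the outer as.numeric() that drops them. Below is a minimal sketch that converts inside the function instead. The names scores, single_mode, and modes_named are hypothetical, and scores is just the numeric part of data1.

```r
# The four numeric columns of data1, rebuilt so the snippet runs on its own
scores <- data.frame(
  Food    = c(16L, 14L, 22L, 21L, 15L, 20L, 21L, 18L),
  Decor   = c(13L, 13L, 19L, 22L, 15L, 14L, 11L, 14L),
  Service = c(16L, 15L, 19L, 21L, 15L, 15L, 14L, 14L),
  Price   = c(26L, 21L, 42L, 54L, 43L, 22L, 22L, 36L)
)

# Most frequent value of x; which.max() returns the first maximum,
# so a tie resolves to the smallest tied value
single_mode <- function(x) {
  counts <- table(x)
  as.numeric(names(counts)[which.max(counts)])
}

# vapply() checks that each result is a single number and keeps the names
modes_named <- vapply(scores, single_mode, numeric(1))
modes_named
#    Food   Decor Service   Price 
#      21      13      15      22 
```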
