I want to calculate a mode for my dataset, split by other factors so that each group has its own mode (I'll probably do this with dplyr, if that makes a difference). I have found this question, which discusses a function for finding the modal value.

The problem is that the code linked above returns only the first modal value it encounters, so the result depends on the order of the dataset. Instead, I want two functions: one that returns the lowest mode and one that returns the highest mode of a multi-modal distribution.

For example, in the vector

x <- c(4.0, 1.0, 2.2, 2.2, 2.2, 4.0, 0.3, 4.0)

I would want minmode(x) to return 2.2, and maxmode(x) to return 4.0. Can anyone explain to me how to adapt the code linked above (or create a new function) to do so?

Margaret

2 Answers

Another approach in base R by modifying the code you linked to (digEmAll's suggestion on the accepted answer):

Mode <- function(x) {
  ux <- unique(x)
  tab <- tabulate(match(x, ux))
  ux[tab == max(tab)]
}

This returns all the modes; take min() or max() of the result to get the equivalent of minmode(x) or maxmode(x):

x <- c(4.0, 1.0, 2.2, 2.2, 2.2, 4.0, 0.3, 4.0)
min(Mode(x))
# [1] 2.2
max(Mode(x))
# [1] 4
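
If separate functions are preferred, the two calls above can be wrapped. This is only a minimal sketch: the names minmode() and maxmode() come from the question, not from any package, and it assumes x is a non-empty vector without NA values.

```r
# Returns every value tied for the highest count (same helper as above)
Mode <- function(x) {
  ux <- unique(x)
  tab <- tabulate(match(x, ux))
  ux[tab == max(tab)]
}

# Hypothetical wrappers, named after the functions asked for in the question
minmode <- function(x) min(Mode(x))
maxmode <- function(x) max(Mode(x))

x <- c(4.0, 1.0, 2.2, 2.2, 2.2, 4.0, 0.3, 4.0)
minmode(x)
# [1] 2.2
maxmode(x)
# [1] 4
minmode(rev(x))  # the result no longer depends on the order of the data
# [1] 2.2
```

Because Mode() collects all tied values before min() or max() is applied, reordering the input does not change the answer.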
Callum Webb

It seems you want the range of the most frequent values. Using a tidy approach, I would tackle it this way.

library(dplyr)

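mode_range <- function(df, x) {

  # enquo() captures the column the caller passes in;
  # quo(x) would always refer to a column literally named x
  var <- enquo(x)

  val <- df %>%
    group_by(!!var) %>%
    summarise(n = n()) %>%   # count occurrences of each distinct value
    filter(n == max(n)) %>%  # keep every value tied for the top count
    pull(!!var)

  range(val)  # c(lowest mode, highest mode)

}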

df <- tibble(x = c(4.0, 1.0, 2.2, 2.2, 2.2, 4.0, 0.3, 4.0))

mode_range(df, x)[1] # min value
# [1] 2.2

mode_range(df, x)[2] # max value
# [1] 4
Kevin Arseneau
  • I am not strictly after the range. My end goal is a table with the count, mean, trimmed mean, median, low mode, high mode, etc. for each group. I think it will be easier to add these to the table as separate calls rather than one call whose result I then have to split up somehow... – Margaret Dec 19 '17 at 00:22
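
For the per-group table described in this comment, one possible sketch is to call min() and max() on a mode helper as separate entries inside summarise(). The data frame, the column names (group, value) and the 10% trim are made up for illustration, and Mode() here is the base R helper from the other answer.

```r
library(dplyr)

# Returns every value tied for the highest count
Mode <- function(x) {
  ux <- unique(x)
  tab <- tabulate(match(x, ux))
  ux[tab == max(tab)]
}

# Illustrative data: group "a" is bimodal (2.2 and 4.0), group "b" has one mode (0.3)
df <- tibble(
  group = c("a", "a", "a", "a", "a", "b", "b", "b"),
  value = c(4.0, 2.2, 2.2, 4.0, 1.0, 0.3, 0.3, 5.0)
)

stats <- df %>%
  group_by(group) %>%
  summarise(
    count        = n(),
    mean_value   = mean(value),
    trimmed_mean = mean(value, trim = 0.1),  # trim fraction is an arbitrary choice
    median_value = median(value),
    low_mode     = min(Mode(value)),
    high_mode    = max(Mode(value))
  )

stats
# group "a": low_mode 2.2, high_mode 4
# group "b": low_mode 0.3, high_mode 0.3
```

Each statistic is its own column, so nothing has to be split out afterwards.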