
I have a dataset like the table below, with multiple rows per month. Each row gives a region and the number of sightings of a certain bird in that region (n).

MONTH REGION n
1 North 12
1 South 45
2 West 34
2 South 23
2 East 32
3 North 11
(and so on)

What I want is a separate dataset containing, for each month, the region with the most sightings, so the goal looks like this:

MONTH REGION n
1 South 45
2 West 34
3 North 11

So far I have tried different combinations of piping, such as df %>% group_by(MONTH) %>% max(n), but none has given the desired result.

zephryl
Jjohn

1 Answer


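You could use slice() together with which.max(), which keeps the row with the largest n in each group (the first such row if there is a tie):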

library(dplyr)
# your original data
df <- data.frame(
  MONTH = c(1, 1, 2, 2, 2, 3),
  REGION = c("North", "South", "West", "South", "East", "North"),
  n = c(12, 45, 34, 23, 32, 11)
)


# extract the region with the most sightings in each month
df_max <- df %>%
  group_by(MONTH) %>%
  slice(which.max(n))

# result
df_max
# A tibble: 3 × 3
# Groups:   MONTH [3]
  MONTH REGION     n
  <dbl> <chr>  <dbl>
1     1 South     45
2     2 West      34
3     3 North     11
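
If your dplyr version is 1.0.0 or later, slice_max() may be a slightly more direct way to express the same thing. The sketch below is only an alternative under that assumption; it rebuilds the sample data from the question, and with_ties = FALSE is chosen here so that each month yields exactly one row (set it to TRUE if you would rather keep every tied region).

```r
library(dplyr)

# sample data from the question
df <- data.frame(
  MONTH = c(1, 1, 2, 2, 2, 3),
  REGION = c("North", "South", "West", "South", "East", "North"),
  n = c(12, 45, 34, 23, 32, 11)
)

# keep the single row with the largest n in each month;
# the first argument is the column to order by, n = 1 is the number of rows to keep
df_max <- df %>%
  group_by(MONTH) %>%
  slice_max(n, n = 1, with_ties = FALSE) %>%
  ungroup()  # drop the grouping so later steps work on the whole table

df_max
```

The ungroup() at the end is optional; it just avoids carrying the MONTH grouping into later operations.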
S-SHAAF