I have a dataframe like so:

state_id person_id salary
1        1         100000
1        2         100000
1        3         34560
2        4         67800
2        5         54670
3        6         67000
3        7         55000
3        8         55000
4        9         123450

For each state, I want the single person with the highest salary. If several people tie for the highest salary in a state, any one of them will do. How can I do this in R with dplyr?

Expected output:

state_id person_id salary
1        2         100000
2        4         67800
3        6         67000
4        9         123450
benson23
Eisen

1 Answer

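We can use slice_max() on the data grouped by state_id. Setting with_ties = FALSE keeps a single row per group even when the highest salary is tied.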

library(dplyr)

# Keep one top-salary row per state; FALSE is spelled out because F can be reassigned
df %>% group_by(state_id) %>% slice_max(salary, n = 1, with_ties = FALSE)

# A tibble: 4 × 3
# Groups:   state_id [4]
  state_id person_id salary
     <int>     <int>  <int>
1        1         1 100000
2        2         4  67800
3        3         6  67000
4        4         9 123450
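
For anyone who wants to reproduce this, here is a minimal self-contained sketch. The data frame is rebuilt by hand from the question's table, and the names `top_paid` and `top_paid_alt` are only illustrative. The second pipeline is an alternative that was not part of the original answer: it relies on `which.max()` returning the position of the first maximum, so it also yields exactly one row per group.

```r
library(dplyr)

# Example data rebuilt from the table in the question
df <- data.frame(
  state_id  = c(1L, 1L, 1L, 2L, 2L, 3L, 3L, 3L, 4L),
  person_id = 1:9,
  salary    = c(100000L, 100000L, 34560L, 67800L, 54670L,
                67000L, 55000L, 55000L, 123450L)
)

# One row per state: the highest salary, with ties cut down to a single row
top_paid <- df %>%
  group_by(state_id) %>%
  slice_max(salary, n = 1, with_ties = FALSE) %>%
  ungroup()  # drop the grouping so later steps work on the whole table

# Alternative sketch: which.max() gives the index of the first maximum per group
top_paid_alt <- df %>%
  group_by(state_id) %>%
  slice(which.max(salary)) %>%
  ungroup()

print(top_paid)
print(top_paid_alt)
```

Which of the tied people in state 1 is returned is an implementation detail. Since the question accepts any one of them, either person_id 1 or 2 is a valid result for that state.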
benson23