1

I have a df that takes this general form:

ID    votes
1      65
1      85
2      100
2      20
2      95
3      50
3      60

I want to create a new df that takes the two highest values in votes for each ID and shows their difference. The new df should look like this:

ID    margin
1      20
2      5
3      10

Is there a way to use dplyr for this?

user3385922
  • 179
  • 1
  • 1
  • 9
  • Possible duplicate. https://stackoverflow.com/questions/22589851/build-difference-between-groups-with-dplyr-in-r – bbiasi May 26 '19 at 01:37

1 Answers1

0

An option would be to be arrange by 'ID', 'votes' (either in descending or ascending), grouped by 'ID' and get the diff of the first two 'votes'

library(dplyr)
df1 %>%
    arrange(ID, desc(votes)) %>%
    group_by(ID) %>%
    summarise(margin = abs(diff(votes[1:2])))
# A tibble: 3 x 2
#     ID margin
#  <int>  <int>
#1     1     20
#2     2      5
#3     3     10

Or another option is

df1 %>%
  group_by(ID) %>% 
  summarise(margin = max(votes) - max(votes[-which.max(votes)]))

Or with slice and diff

df1 %>% 
   group_by(ID) %>% 
   slice(row_number(votes)[1:2]) %>% 
   summarise(margin = diff(votes))

data

df1 <- structure(list(ID = c(1L, 1L, 2L, 2L, 2L, 3L, 3L), votes = c(65L, 
85L, 100L, 20L, 95L, 50L, 60L)), class = "data.frame", row.names = c(NA, 
-7L))
akrun
  • 874,273
  • 37
  • 540
  • 662