I have data with columns "ID" and "value", in which an ID may be repeated. For every ID that appears more than once, I would like to keep only the row with the highest value.
mydf <- data.frame(ID = c(1, 2, 2, 3, 4), value = c(5, 8, 20, 18, 15))
I am working with dplyr. So far I can find the duplicates:
library(dplyr)
# Return the rows whose `var` value occurs more than once, sorted by `var`
find_dup <- function(dataset, var) {
  dataset %>% group_by({{ var }}) %>% filter(n() > 1) %>%
    ungroup() %>% arrange({{ var }})
}
find_dup(mydf, ID)
But I am having trouble with the replace step: I am not sure how to "point to" the larger value. I am hoping to stay with a tidyverse solution for now if possible. Any thoughts welcome, thanks!
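For reference, here is a minimal sketch of one possible way to do the "keep the larger value" step, assuming dplyr 1.0.0 or later (where `slice_max()` is available). The helper name `keep_max` and its arguments `id_var` and `value_var` are made up for illustration and are not part of dplyr. The idea is to group by the ID column and keep a single row per group, the one with the largest value.

```r
library(dplyr)

mydf <- data.frame(ID = c(1, 2, 2, 3, 4), value = c(5, 8, 20, 18, 15))

# keep_max is a hypothetical helper: for each distinct value of `id_var`,
# keep only the row with the largest `value_var`.
keep_max <- function(dataset, id_var, value_var) {
  dataset %>%
    group_by({{ id_var }}) %>%
    # with_ties = FALSE returns exactly one row per group, even when
    # several rows share the maximum value
    slice_max({{ value_var }}, n = 1, with_ties = FALSE) %>%
    ungroup()
}

result <- keep_max(mydf, ID, value)
result
```

With the example data this should leave one row per ID, with ID 2 keeping the row whose value is 20. IDs that were never duplicated pass through unchanged, so there is no need to split the duplicates out first and merge them back in.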