ID Amount Previous 
1  10     15
1  10     13
2  20     18
2  20     24
3  5      7
3  5      6

I want to remove duplicate rows from the data frame above, where rows count as duplicates if ID and Amount match. The values in the Previous column differ between the duplicates. When deciding which row to keep, I'd like to keep the one with the higher Previous value.

This would look like:

ID Amount Previous 
1  10     15
2  20     24
3  5      7
Alice Wang

1 Answer


One option is to use distinct() on the columns 'ID' and 'Amount' after arranging the dataset so that the highest 'Previous' comes first within each ID/Amount combination. Specifying .keep_all = TRUE keeps all the other columns from the first row of each distinct combination.

library(dplyr)
df1 %>% 
    arrange(ID, Amount, desc(Previous)) %>%
    distinct(ID, Amount, .keep_all = TRUE)
#   ID Amount Previous
#1  1     10       15
#2  2     20       24
#3  3      5        7
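
A variation on the same idea, as a minimal sketch: group by 'ID' and 'Amount' and keep the row with the largest 'Previous' in each group using slice_max(). This assumes dplyr 1.0.0 or later, where slice_max() is available. The name res_slice is only illustrative, and with_ties = FALSE is an assumption that exactly one row per group is wanted even when two rows share the maximum.

```r
library(dplyr)

# Same sample data as in the "data" section of this answer
df1 <- data.frame(
  ID       = c(1L, 1L, 2L, 2L, 3L, 3L),
  Amount   = c(10L, 10L, 20L, 20L, 5L, 5L),
  Previous = c(15L, 13L, 18L, 24L, 7L, 6L)
)

res_slice <- df1 %>%
  group_by(ID, Amount) %>%                          # one group per ID/Amount pair
  slice_max(Previous, n = 1, with_ties = FALSE) %>% # keep the row with the highest Previous
  ungroup()                                         # drop the grouping again

res_slice
```

Unlike the arrange() + distinct() version, this does not depend on the row order, at the cost of a grouped operation.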

Or use duplicated() from base R on the 'ID' and 'Amount' columns of the ordered dataset to create a logical vector, and use that vector to subset the rows

df2 <- df1[with(df1, order(ID, Amount, -Previous)),]
df2[!duplicated(df2[c('ID', 'Amount')]),]
#  ID Amount Previous
#1  1     10       15
#3  2     20       24
#5  3      5        7
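
Another base R possibility, sketched here without sorting: compute the maximum 'Previous' per ID/Amount combination with ave() and keep the rows that equal it. The names grp, keep and res_ave are illustrative only. Note that, unlike the duplicated() approach, this keeps every tied row if two rows in a group share the same maximum.

```r
# Same sample data as in the "data" section of this answer
df1 <- data.frame(
  ID       = c(1L, 1L, 2L, 2L, 3L, 3L),
  Amount   = c(10L, 10L, 20L, 20L, 5L, 5L),
  Previous = c(15L, 13L, 18L, 24L, 7L, 6L)
)

# One grouping factor per ID/Amount pair; drop = TRUE omits unused combinations
grp <- with(df1, interaction(ID, Amount, drop = TRUE))

# TRUE where a row's Previous equals the maximum of its group
keep <- df1$Previous == ave(df1$Previous, grp, FUN = max)

res_ave <- df1[keep, ]
res_ave
```

The original row order and row names are preserved, so the selected rows can be traced back to df1.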

data

df1 <- structure(list(ID = c(1L, 1L, 2L, 2L, 3L, 3L), Amount = c(10L, 
10L, 20L, 20L, 5L, 5L), Previous = c(15L, 13L, 18L, 24L, 7L, 
6L)), class = "data.frame", row.names = c(NA, -6L))
akrun