I have a 3-column data.frame (variables: ID.A, ID.B, DISTANCE). I would like to remove the duplicates under a condition: keeping the row with the smallest value in column 3.
It is the same problem as here: R, conditionally remove duplicate rows (a similar one: Remove duplicates based on 2nd column condition)
But in my situation there is a second problem: I have to remove rows when the combination (ID.A, ID.B, DISTANCE) is duplicated, and not only when ID.A is duplicated.
I tried several things, such as:
df <- ddply(df, 1:3, function(df) return(df[df$DISTANCE==min(df$DISTANCE),]))
but it didn't work.
Example:
This dataset:
  id.a id.b dist
1    1    1   12
2    1    1   10
3    1    1    8
4    2    1   20
5    1    1   15
6    3    1   16
Should become:
  id.a id.b dist
3    1    1    8
4    2    1   20
6    3    1   16
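
For reference, here is a minimal base R sketch that reproduces the expected output above. It assumes, as the example suggests, that a duplicate means the same (id.a, id.b) pair and that the row with the smallest dist should be kept. The names sorted and result are only illustrative, and this is one possible approach rather than a fix of the ddply attempt.

```r
# Example data from the question (column names as in the printed output)
df <- data.frame(
  id.a = c(1, 1, 1, 2, 1, 3),
  id.b = c(1, 1, 1, 1, 1, 1),
  dist = c(12, 10, 8, 20, 15, 16)
)

# Sort by distance so the smallest value comes first within each pair
sorted <- df[order(df$dist), ]

# Keep only the first (smallest-distance) row of each (id.a, id.b) pair
result <- sorted[!duplicated(sorted[, c("id.a", "id.b")]), ]

# Restore the original row order using the preserved row names
result <- result[order(as.integer(rownames(result))), ]
print(result)
```

If two rows of the same pair tie on the smallest distance, this sketch keeps only one of them (the one that appears first in the original data).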