This may be a failure to know the right keywords to search, but I'm looking for a way to remove duplicates based on an order reversal between two non-numeric columns. Here is a very small subset of my data:
ANIMAL1<-c("20074674_K.v1","20085105_K.v1","20085638_K.v1","20085646_K.v1")
ANIMAL2<-c("20085105_K.v1","20074674_K.v1","20074674_K.v1","20074674_K.v1")
exclusions<-c(13,13,5,10)
data<-data.frame(ANIMAL1,ANIMAL2,exclusions)
ANIMAL1 ANIMAL2 exclusions
1 20074674_K.v1 20085105_K.v1 13
2 20085105_K.v1 20074674_K.v1 13
3 20085638_K.v1 20074674_K.v1 5
4 20085646_K.v1 20074674_K.v1 10
The first and second rows are duplicate comparisons; the order of the animals is just reversed between the first two columns. It doesn't matter which one is deleted, but I want to delete one of the duplicates, along with every other duplicate that fits this logic in my larger data frame. I'm used to subsetting according to the logic in these questions: Remove duplicate column pairs, sort rows based on 2 columns and the other posts that come up when searching "remove duplicates based on 2 columns", but I haven't yet found anything that approximates my use case. Here is what I would like my data to look like after the duplicate removal:
ANIMAL1 ANIMAL2 exclusions
1 20085105_K.v1 20074674_K.v1 13
2 20085638_K.v1 20074674_K.v1 5
3 20085646_K.v1 20074674_K.v1 10
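To make the logic concrete, here is a minimal sketch of one way it could be expressed in base R. This is an illustration under assumptions, not a tested solution for the larger data frame. The names `first`, `second`, `pair_key` and `deduped` are made up for this example, and it assumes the animal IDs are stored as character strings and never contain the `|` separator. The idea is to build a key that is the same regardless of column order, then drop rows whose key repeats.

```r
# Rebuild the example data; stringsAsFactors = FALSE keeps the IDs as
# character vectors on older R versions (pmin/pmax do not work on factors).
ANIMAL1 <- c("20074674_K.v1", "20085105_K.v1", "20085638_K.v1", "20085646_K.v1")
ANIMAL2 <- c("20085105_K.v1", "20074674_K.v1", "20074674_K.v1", "20074674_K.v1")
exclusions <- c(13, 13, 5, 10)
data <- data.frame(ANIMAL1, ANIMAL2, exclusions, stringsAsFactors = FALSE)

# Order-independent key: within each row, the alphabetically smaller ID
# goes first, so (A, B) and (B, A) produce the same string.
first  <- pmin(data$ANIMAL1, data$ANIMAL2)
second <- pmax(data$ANIMAL1, data$ANIMAL2)
pair_key <- paste(first, second, sep = "|")  # assumes "|" never occurs in an ID

# fromLast = TRUE marks the earlier copy as the duplicate, so the later
# row of each reversed pair is the one that is kept.
deduped <- data[!duplicated(pair_key, fromLast = TRUE), ]
rownames(deduped) <- NULL  # renumber rows 1..n

print(deduped)
```

Dropping `fromLast = TRUE` would keep the first row of each pair instead, which would also be fine since it doesn't matter which copy is deleted. Note that this keys on the animal pair only, so two rows with the same pair but different `exclusions` values would still be collapsed into one.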
Thanks much!