Using a logical expression containing %in% with the duplicated function in R

Asked May 21 '18 at 13:45

Active May 21 '18 at 13:45

Viewed 27 times

I am working with a data frame that looks something like this:

      IndexA IndexB
    1      A      B
    2      B      A
    3      A      C

I would like to remove rows that duplicate another row with the two columns swapped (for example, row 2 is just row 1 with IndexA and IndexB reversed), so that the resulting dataset looks like this:

      IndexA IndexB
    1      A      B
    2      A      C

I have managed to obtain a list of all of the duplicates using subsetting, e.g.:

    duplicates <- df[df$IndexA %in% df$IndexB & df$IndexB %in% df$IndexA, ]

However, I want to retain only one row from each pair of duplicates, i.e. one of the rows returned for each case where `df$IndexA %in% df$IndexB & df$IndexB %in% df$IndexA`.
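
One thing worth checking about the filter above: `%in%` tests each value against the whole of the other column, not against the matching row, so it can flag rows that have no reversed partner at all. A minimal sketch, using made-up data (the three rows below are hypothetical, not from my real dataset) and a pasted key as one possible pairwise check:

```r
# Hypothetical data: no row is the reverse of another row
df <- data.frame(IndexA = c("A", "B", "C"),
                 IndexB = c("B", "C", "A"),
                 stringsAsFactors = FALSE)

# The filter from the question: membership is tested column-wide
flagged <- df[df$IndexA %in% df$IndexB & df$IndexB %in% df$IndexA, ]
n_flagged <- nrow(flagged)  # every row passes, although none is a swapped pair

# A pairwise check instead: does the reversed (IndexB, IndexA) pair
# occur as an (IndexA, IndexB) pair somewhere in the data?
# Assumes the values themselves contain no "|" character.
has_reverse <- paste(df$IndexB, df$IndexA, sep = "|") %in%
  paste(df$IndexA, df$IndexB, sep = "|")
```

On this data `has_reverse` is `FALSE` for every row, while the column-wide filter flags all three, so the original subset may be returning more than the true swapped pairs.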

asked May 21 '18 at 13:45

user183974


Sort rowwise, then remove duplicates? – zx8754 May 21 '18 at 14:00
You mean manually remove each duplicate? I was after something a bit more automated since the list of duplicates I have is over 200 cases – user183974 May 21 '18 at 14:01
Something like: `df1[ !duplicated(t(apply(df1, 1, sort))), ]` ? – zx8754 May 21 '18 at 14:02
See also the "Linked" questions in the dupe for other alternatives. – Henrik May 21 '18 at 14:03
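
For reference, the sort-then-deduplicate suggestion from the comments can be sketched on the example data from the question. The object names `sorted_pairs` and `result` are just illustrative, and this assumes both columns are character (or coercible to it) with no missing values, since `sort()` drops `NA` by default:

```r
# Example data from the question
df1 <- data.frame(IndexA = c("A", "B", "A"),
                  IndexB = c("B", "A", "C"),
                  stringsAsFactors = FALSE)

# Sort the two values within each row so that (B, A) becomes (A, B);
# apply() returns the rows as columns, hence the transpose
sorted_pairs <- t(apply(df1, 1, sort))

# duplicated() on a matrix compares whole rows; keep the first row
# of each unordered pair from the original data frame
result <- df1[!duplicated(sorted_pairs), ]

# Optional: renumber the rows 1, 2, ... as in the desired output
rownames(result) <- NULL
```

This keeps the first occurrence of each pair in its original column order, so rows 1 and 3 of the example survive and nothing has to be removed by hand.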
