Consider the following:

> df <- data.frame(x = c(1, 2, 1, 2, 3, 3), y = c(2, 1, 2, 3, 3, 3))
> df
  x y
1 1 2
2 2 1
3 1 2
4 2 3
5 3 3
6 3 3
> df[duplicated(df),]
  x y
3 1 2
6 3 3

As we can see above, the second occurrences of the rows with (x, y) = (1, 2) and (x, y) = (3, 3) have been marked as duplicates.

Using only base R (i.e., no additional packages are to be loaded), is there a way to get all rows in df that have an (x, y) pair identical to an (x, y) pair for a row in df[duplicated(df),]? In this case, the desired output is

  x y
1 1 2
3 1 2
5 3 3
6 3 3

I am particularly looking for a solution that would be ideal for explaining to someone who is new to R.

Clarinetist
  • I think this is roughly what you want: `df[duplicated(df) | duplicated(df, fromLast = TRUE), ]`. – lmo Dec 09 '16 at 13:49
  • @lmo Well, I learned something today! Please feel free to post that as an answer, so that I can award you some points. – Clarinetist Dec 09 '16 at 13:53
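
For explaining this to someone new to R, the one-liner from the comment above can be unpacked into steps. This is only a sketch using the `df` from the question; the names `dup_forward`, `dup_backward`, `in_dup_group`, and `result` are made up for illustration.

```r
# Data frame from the question
df <- data.frame(x = c(1, 2, 1, 2, 3, 3), y = c(2, 1, 2, 3, 3, 3))

# TRUE for every row that repeats an EARLIER row (scanning top to bottom)
dup_forward <- duplicated(df)

# TRUE for every row that repeats a LATER row (scanning bottom to top)
dup_backward <- duplicated(df, fromLast = TRUE)

# A row belongs to a duplicated group if either scan flags it
in_dup_group <- dup_forward | dup_backward

# Keep those rows; the original row names are preserved
result <- df[in_dup_group, ]
print(result)
```

`duplicated()` on its own never flags the first occurrence of a repeated row, which is why rows 1 and 5 are missing from `df[duplicated(df), ]`. Scanning a second time with `fromLast = TRUE` picks up those first occurrences, and combining the two logical vectors with `|` selects every member of each duplicated group.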
