Subset dataframe by removing all unique rows

Question

I have a dataframe object ("ed") like this:

        C1         C2   C3     C4      C5 C6         C7 C8 C9
1  5432750 11/05/2007 2007 354140 2045249  A 11/07/1951  F  M
2  6040226 07/01/2008 2008 354140 2755130  B 25/05/1969  M  N
3  6019750 05/05/2008 2008 354140 2755130  C 29/01/1999  M  O
4  6082148 09/05/2008 2008 354220 2751143  D 22/06/1990  F  P
5  6082149 10/05/2008 2008 354220 2751143  D 22/06/1990  F  P
6  6082150 11/05/2008 2008 354220 2751143  D 22/06/1990  F  P
7  5613588 10/05/2009 2009 354140 2755130  F 06/11/1933  F  Q
8  7291153 07/07/2010 2010 354140 2755130  H 29/09/1943  F  R
9  5663206 05/11/2010 2010 354140 2755130  I 31/08/1939  M  S
10 7240738 05/10/2011 2011 354140 2755130  J 03/10/1977  F  T
11 7798961 08/02/2012 2012 354140 2755130  K 02/10/1963  M  U
12 7798962 09/02/2012 2012 354140 2755130  K 02/10/1963  M  U

I need to subset this dataframe by removing all unique rows and keeping all repeated ones, where rows count as repeated when they share the same values in C6, C7, C8 and C9. The result should include the "original" row, that is, the first appearance, and not only its later repetitions.

I came close to getting the desired dataframe object by using:

ed[duplicated(ed[,c('C6','C7','C8','C9')]),]

However, it omits the first appearance. That makes sense, since the first appearance is not itself a duplicate and so is not flagged by the duplicated function.
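The behaviour described above can be seen on a plain character vector. This is only a minimal sketch: the vector `x` is a made-up stand-in for the C6 column, not part of the original data.

```r
# Hypothetical stand-in for column C6: "D" appears three times, "K" twice.
x <- c("A", "D", "D", "D", "K", "K")

# duplicated() flags an element only if an identical one occurred earlier,
# so the first "D" and the first "K" are reported as FALSE.
first_pass <- duplicated(x)
print(first_pass)
# FALSE FALSE  TRUE  TRUE FALSE  TRUE
```

Subsetting with this logical vector therefore drops the first occurrence of every repeated value.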

I also tried:

ed[!unique(ed[,c('C6','C7','C8','C9')]),]

But it does not work either.

Are you looking for `ed[duplicated(ed[,c('C6','C7','C8','C9')]) | duplicated(ed[,c('C6','C7','C8','C9')], fromLast= TRUE),]` ? — David Arenburg, Mar 05 '15 at 21:06
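The suggestion in the comment can be tried on a reduced copy of the data. The sketch below assumes only C1 and the four key columns are needed for illustration; the helper names `key`, `keep` and `result` are introduced here and do not appear in the question.

```r
# Reduced version of the question's data: C1 plus the four key columns.
ed <- data.frame(
  C1 = c(5432750, 6040226, 6019750, 6082148, 6082149, 6082150,
         5613588, 7291153, 5663206, 7240738, 7798961, 7798962),
  C6 = c("A", "B", "C", "D", "D", "D", "F", "H", "I", "J", "K", "K"),
  C7 = c("11/07/1951", "25/05/1969", "29/01/1999", "22/06/1990",
         "22/06/1990", "22/06/1990", "06/11/1933", "29/09/1943",
         "31/08/1939", "03/10/1977", "02/10/1963", "02/10/1963"),
  C8 = c("F", "M", "M", "F", "F", "F", "F", "F", "M", "F", "M", "M"),
  C9 = c("M", "N", "O", "P", "P", "P", "Q", "R", "S", "T", "U", "U"),
  stringsAsFactors = FALSE
)

# Columns that decide whether two rows count as repeats of each other.
key <- ed[, c("C6", "C7", "C8", "C9")]

# Scanning forwards flags every repeat except the first occurrence;
# scanning backwards (fromLast = TRUE) flags every repeat except the last.
# Combining both with | marks every row that belongs to a repeated group.
keep <- duplicated(key) | duplicated(key, fromLast = TRUE)

result <- ed[keep, ]
print(result)
# Rows 4, 5, 6, 11 and 12 are kept.
```

Storing the key columns in `key` avoids repeating the column selection twice, which is the only difference from the one-liner in the comment.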
