Removing duplicates according to one column

Question

My dummy data looks like this:

> head(dummy)
            C1          C2
[1,]         1           1
[2,]         1           2
[3,]         1           3
[4,]         2           3
[5,]         2           4
[6,]         2           5

The value 3 is duplicated in C2, but the rows themselves are unique in the data frame. I want to remove all duplicates according to C2 and keep only the first or last occurrence according to C1.

Example of what I want:

> remove duplicates leave first in C1
            C1          C2
[1,]         1           1
[2,]         1           2
[3,]         1           3
[5,]         2           4
[6,]         2           5
# filtered    [4,]   2    3

Or

> remove duplicates leave last in C1
            C1          C2
[1,]         1           1
[2,]         1           2
[4,]         2           3
[5,]         2           4
[6,]         2           5
# filtered   [3,]   1    3

score 1 · Accepted Answer · answered Jun 13 '14 at 08:54

If dat is the dataset, this keeps the first occurrence of each C2 value:

dat[with(dat, !duplicated(C2)),]
  C1 C2
1  1  1
2  1  2
3  1  3
5  2  4
6  2  5

To keep the last occurrence of each C2 value instead:

dat[with(dat, !duplicated(C2,fromLast=TRUE)),]
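
For anyone who wants to try this end to end, here is a minimal sketch that rebuilds the dummy data and applies both variants. It assumes the data is a data frame with columns C1 and C2. The names keep_first, keep_last, mat and mat_first are only illustrative. The sort by C1 is an added step, not part of the answer above: it makes "first" and "last" follow C1 when the rows are not already in that order. The matrix variant is there because the `[1,]` row labels in the question suggest the real object may be a matrix, where `with()` does not apply.

```r
# Rebuild the dummy data from the question (assumed to be a data frame).
dat <- data.frame(C1 = c(1, 1, 1, 2, 2, 2),
                  C2 = c(1, 2, 3, 3, 4, 5))

# Sort by C1 so that "first"/"last" is decided by C1 rather than by row order.
# The question's data is already sorted, so this changes nothing here.
dat <- dat[order(dat$C1), ]

# Keep the first occurrence of each C2 value (drops row 4).
keep_first <- dat[!duplicated(dat$C2), ]

# Keep the last occurrence of each C2 value (drops row 3).
keep_last <- dat[!duplicated(dat$C2, fromLast = TRUE), ]

# Matrix variant: index the column by name instead of using with().
mat <- as.matrix(dat)
mat_first <- mat[!duplicated(mat[, "C2"]), , drop = FALSE]
```

`duplicated()` marks every repeat of a value after its first appearance (or before its last, with `fromLast = TRUE`), so negating it keeps exactly one row per C2 value.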

akrun
