How to get rid of duplicated values in dataframe column

Question

My question is very similar to "Subset with unique cases, based on multiple columns". The only difference is that I don't want any of the duplicated rows to show up in the final data frame, not even the first occurrence. Original data frame:

df
  v1 v2 v3  v4 v5
1  7  1  A 100 98
2  7  2  A  98 97
3  8  1  C  NA 80
4  8  1  C  78 75
5  8  1  C  50 62
6  9  3  C  75 75

Using df[!duplicated(df[1:3]),] gets me

  v1 v2 v3  v4 v5
1  7  1  A 100 98
2  7  2  A  98 97
3  8  1  C  NA 80
6  9  3  C  75 75

But what I would like is

  v1 v2 v3  v4 v5
1  7  1  A 100 98
2  7  2  A  98 97

6  9  3  C  75 75

I tried using unique, but it seems to keep only the columns I am analyzing. Any help would be greatly appreciated!

score 1 · Answer 1 · answered Oct 18 '17 at 10:32


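duplicated() only flags the second and later occurrences of each combination, so the first occurrence survives. We also need to flag duplicates from the other end with fromLast = TRUE and drop every row flagged in either direction: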

df[!(duplicated(df[1:3])|duplicated(df[1:3], fromLast = TRUE)),]
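To see why both directions are needed, here is a minimal reproducible sketch. The data frame is rebuilt from the values in the question; the intermediate names key, dup_forward, dup_backward and result are only illustrative and not part of the original answer.

```r
# Rebuild the data frame from the question (values copied from the post).
df <- data.frame(
  v1 = c(7, 7, 8, 8, 8, 9),
  v2 = c(1, 2, 1, 1, 1, 3),
  v3 = c("A", "A", "C", "C", "C", "C"),
  v4 = c(100, 98, NA, 78, 50, 75),
  v5 = c(98, 97, 80, 75, 62, 75)
)

key <- df[1:3]  # the columns that define a duplicate

# Flags the 2nd and later occurrences of each key combination.
dup_forward <- duplicated(key)

# Flags every occurrence except the last one.
dup_backward <- duplicated(key, fromLast = TRUE)

# A row belongs to a duplicated group if either flag is set,
# so negating the union keeps only rows whose key occurs once.
result <- df[!(dup_forward | dup_backward), ]
print(result)
```

Splitting the expression this way should behave the same as the one-liner above; it just makes it easier to inspect which rows each call to duplicated() flags.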


akrun

