Remove all rows based on multiple columns

Question

Consider the following basic example:

v1 <- c("a","b","c","a","b")
v2 <- c(1,2,3,1,1)
v3 <- rnorm(5,5) 

dat <- data.frame(cbind(v1,v2,v3))

I want to remove every row whose combination of v1 and v2 also occurs in another row, including the first occurrence.

To remove duplicated rows I can use

dat[!duplicated(dat[,c("v1","v2")]),]

  v1 v2               v3
1  a  1 6.48929449801677
2  b  2 4.89050807004701
3  c  3 5.57089903349316
5  b  1 4.08152834124853

But I also want to remove the first row, because its v1 and v2 values are repeated in row 4.

Does anyone have a simple solution? Perhaps there is an option in duplicated that I have missed.

See [this post](http://stackoverflow.com/questions/12495345/find-indices-of-duplicated-rows) — alexis_laz, Mar 22 '16 at 11:30

akrun · Accepted Answer · 2016-03-22T11:27:12.230

3

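By default, duplicated flags only the second and later occurrences of a row. Calling it again with fromLast=TRUE searches in the reverse direction, so the earlier occurrences are flagged instead. Combining the two results with | marks every row that has a duplicate, and negating that logical index leaves only the rows whose (v1, v2) combination is unique, which we then use to subset dat.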

dat[!(duplicated(dat[,c("v1","v2")])|
     duplicated(dat[,c("v1", "v2")], fromLast=TRUE)),]
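To make the two passes easier to follow, here is a minimal sketch that stores each logical vector separately before combining them. The names key, fwd, bwd, and res are illustrative only, and the v3 values are fixed placeholders rather than the rnorm draws from the question, so the example is reproducible.

```r
# Reproducible copy of the question's data; the v3 values are made-up placeholders
dat <- data.frame(
  v1 = c("a", "b", "c", "a", "b"),
  v2 = c(1, 2, 3, 1, 1),
  v3 = c(6.5, 4.9, 5.6, 5.1, 4.1)
)

# Columns that define what counts as a duplicate
key <- dat[, c("v1", "v2")]

# Forward pass: flags later occurrences (row 4 here)
fwd <- duplicated(key)

# Reverse pass: flags earlier occurrences (row 1 here)
bwd <- duplicated(key, fromLast = TRUE)

# Keep only rows that were flagged by neither pass
res <- dat[!(fwd | bwd), ]
print(res)
```

With this data, rows 1 and 4 share v1 = "a" and v2 = 1, so both are dropped and rows 2, 3, and 5 remain.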

edited Mar 22 '16 at 11:27

answered Mar 22 '16 at 11:25

akrun
