Remove duplicated rows in dataframe in R

Question

I have the following problem:

My dataframe has a lot of columns. I would like to remove rows that have the same values in columns X, Y and Z.

See my dataframe:

A B C X Y Z
1 2 3 4 5 6
2 5 4 4 5 6

In the dataframe above I would like to delete the first row, because X, Y and Z are the same in both rows.

I tried this, but it returned something different:

newtable <- df[!duplicated(df$X, df$Z, df$Z), ]

Thanks a lot!

How about `df[!duplicated(df[-(1:3)]),]`? – ThomasIsCoding, Dec 10 '19 at 16:32

score 1 · Answer 1 · answered Dec 10 '19 at 16:31

According to ?duplicated, the usage is

duplicated(x, incomparables = FALSE, ...)

where

x: a vector or a data frame or an array or NULL.

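i.e. it takes a single object as 'x'. Extra vectors passed positionally are matched to the other arguments (the second one to 'incomparables'), so they are not compared as additional columns. One option is to subset the columns of interest and pass that data frame as 'x':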

df[!duplicated(df[c("X", "Y", "Z")]), ]
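As a minimal sketch, assuming the two-row data frame from the question, this shows what the subsetting approach keeps. Note that `duplicated()` flags the later occurrence by default, so the first row survives; if the first row is the one that should go, the `fromLast = TRUE` argument may be what you want. The names `dup`, `keep_first` and `keep_last` are only illustrative.

```r
# Rebuild the example data frame from the question.
df <- data.frame(
  A = c(1, 2), B = c(2, 5), C = c(3, 4),
  X = c(4, 4), Y = c(5, 5), Z = c(6, 6)
)

# TRUE marks rows whose (X, Y, Z) combination already appeared in an earlier row.
dup <- duplicated(df[c("X", "Y", "Z")])

# Keep the first occurrence of each (X, Y, Z) combination (drops the second row here).
keep_first <- df[!dup, ]

# Scan from the end instead, so the last occurrence is kept (drops the first row here).
keep_last <- df[!duplicated(df[c("X", "Y", "Z")], fromLast = TRUE), ]
```

Selecting the columns by name rather than by position, as in `df[-(1:3)]`, should keep working if the column order changes.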
