I have the following problem:

My data frame has a lot of columns. I would like to remove rows that have the same values in columns X, Y and Z.

See my dataframe:

A B C X Y Z
1 2 3 4 5 6
2 5 4 4 5 6

In the dataframe above I would like to delete the first row, because X, Y and Z are the same in both rows.

I tried this, but it returned something different:

newtable <- df[!duplicated(df$X, df$Z, df$Z), ]

Thanks a lot!

rama27

1 Answer


According to ?duplicated, the usage is

duplicated(x, incomparables = FALSE, ...)

where

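x: a vector or a data frame or an array or NULL.

That is, only the first argument is treated as 'x'. In duplicated(df$X, df$Z, df$Z) only df$X is checked for duplicates, and the other two vectors are not used as part of the comparison key. One option is to subset the relevant columns of the data frame and pass that subset as 'x':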

df[!duplicated(df[c("X", "Y", "Z")]), ]
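Note that !duplicated() keeps the first row of each X/Y/Z group, so on the example data it drops the second row, not the first. The sketch below rebuilds the example data from the question and shows two variants using the fromLast argument of duplicated(): one that keeps the last row of each group, and one that drops every row whose combination occurs more than once. The names key, keep_first, keep_last and drop_all are only illustrative.

```r
# Example data from the question
df <- data.frame(A = c(1, 2), B = c(2, 5), C = c(3, 4),
                 X = c(4, 4), Y = c(5, 5), Z = c(6, 6))

# Columns that define a "duplicate"
key <- df[c("X", "Y", "Z")]

# Keep the first row of each X/Y/Z group (drops the second row here)
keep_first <- df[!duplicated(key), ]

# Keep the last row of each group instead (drops the first row here)
keep_last <- df[!duplicated(key, fromLast = TRUE), ]

# Drop every row whose X/Y/Z combination occurs more than once
drop_all <- df[!(duplicated(key) | duplicated(key, fromLast = TRUE)), ]
```

If the goal is to delete the first row, as described in the question, the fromLast = TRUE variant is probably the one to use.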
akrun