Here is an example data frame with duplicated values in multiple columns.

> df
  a  b c    d
1 1  2 A 1001
2 2  4 B 1002
3 3  6 B 1002
4 4  8 C 1003
5 5 10 D 1004
6 6 12 D 1004
7 7 13 E 1005
8 8 14 E 1006

I want to keep only the rows where the combination of columns c and d is duplicated, including the first occurrence of each duplicated pair:

> df
  a  b c    d
1 2  4 B 1002
2 3  6 B 1002
3 5 10 D 1004
4 6 12 D 1004

I tried the following code, but it returned only the second row of each duplicated pair, not the first:

> df[duplicated(df[c("c","d")]), ]
  a  b c    d
3 3  6 B 1002
6 6 12 D 1004
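
One possible direction, offered as a sketch rather than a confirmed solution: `duplicated()` flags only the later occurrences of a repeated row, so combining it with a second call that scans from the end (`fromLast = TRUE`) should flag the first occurrences as well. The names `key`, `dup_any`, and `res` are illustrative only, and the data frame is rebuilt with `data.frame()` so the block runs on its own.

```r
# Rebuild the example data (same values as the dput output in this post)
df <- data.frame(
  a = c(1, 2, 3, 4, 5, 6, 7, 8),
  b = c(2, 4, 6, 8, 10, 12, 13, 14),
  c = factor(c("A", "B", "B", "C", "D", "D", "E", "E")),
  d = c(1001, 1002, 1002, 1003, 1004, 1004, 1005, 1006)
)

# Columns that together define a duplicate
key <- df[c("c", "d")]

# duplicated(key) marks the second and later copies of each (c, d) pair;
# duplicated(key, fromLast = TRUE) marks every copy except the last one.
# Their union marks every row whose (c, d) pair occurs more than once.
dup_any <- duplicated(key) | duplicated(key, fromLast = TRUE)

res <- df[dup_any, ]
rownames(res) <- NULL  # renumber rows 1..n, as in the desired output
res
```

Rows 7 and 8 share `c == "E"` but differ in `d`, so this approach leaves them out, which matches the desired output above.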

Data:

df = structure(list(a = c(1, 2, 3, 4, 5, 6, 7, 8), b = c(2, 4, 6, 
8, 10, 12, 13, 14), c = structure(c(1L, 2L, 2L, 3L, 4L, 4L, 5L, 
5L), .Label = c("A", "B", "C", "D", "E"), class = "factor"), 
    d = c(1001, 1002, 1002, 1003, 1004, 1004, 1005, 1006)), .Names = c("a", 
"b", "c", "d"), row.names = c(NA, -8L), class = "data.frame")
Lin Caijin