I have a data frame like this:

Num <- c(1,1,1,2,2,3,4,5)
ID <- c("A","B","C","A","B","C","D","E")
dff <- data.frame(Num,ID)

I am trying to remove every row whose Num value appears more than once, so that only rows with a unique Num remain. I am doing it this way:

dff1 <- dff[!duplicated(dff[,1]),]

I get the output

  Num ID
1   1  A
4   2  A
6   3  C
7   4  D
8   5  E

But my desired output is

  Num ID
6   3  C
7   4  D
8   5  E

What am I missing here?

Sharath
  • You are 'missing' that duplicated returns a vector that is FALSE for the first occurrence of an element and TRUE for each occurrence after that. – Heroka Dec 16 '15 at 19:40
  • `dff[!(dff$Num %in% dff$Num[duplicated(dff$Num)]),]` – RHertel Dec 16 '15 at 19:52
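
The behaviour described in the first comment can be checked directly. The sketch below uses the question's data and assumes the goal is to drop every row whose Num is repeated; the names `first_pass`, `last_pass`, `repeated`, and `unique_rows` are illustrative and do not appear in the original post. It combines a forward and a backward call to `duplicated` (the `fromLast` argument of base R) so that the first occurrence is flagged as well.

```r
Num <- c(1, 1, 1, 2, 2, 3, 4, 5)
ID <- c("A", "B", "C", "A", "B", "C", "D", "E")
dff <- data.frame(Num, ID)

# FALSE for the first occurrence of a value, TRUE for every later one
first_pass <- duplicated(dff$Num)
# Scanning from the end instead flags every occurrence except the last
last_pass <- duplicated(dff$Num, fromLast = TRUE)

# A value that occurs more than once is flagged by at least one of the two passes
repeated <- first_pass | last_pass

# Keep only rows whose Num occurs exactly once (rows 6, 7 and 8 here)
unique_rows <- dff[!repeated, ]
print(unique_rows)
```

Using `!duplicated(...)` alone keeps one row per value, which is why the first attempt still returned the rows for Num 1 and 2.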

1 Answer

You could try:

tab <- table(dff$Num)
dff[dff$Num %in% as.numeric(names(tab)[tab == 1]), ]
  Num ID
6   3  C
7   4  D
8   5  E
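
To see why the `table` approach works, it may help to unpack it into separate steps. This is only a sketch on the question's data; `counts`, `singles`, and `result` are illustrative names that are not in the original.

```r
dff <- data.frame(Num = c(1, 1, 1, 2, 2, 3, 4, 5),
                  ID = c("A", "B", "C", "A", "B", "C", "D", "E"))

# Count how often each Num value occurs; the names of the table are the values
counts <- table(dff$Num)

# Names come back as character strings, so convert them to numbers
singles <- as.numeric(names(counts)[counts == 1])

# Keep the rows whose Num is among the values seen exactly once
result <- dff[dff$Num %in% singles, ]
print(result)
```

The conversion with `as.numeric` is optional here, since `%in%` coerces the types before matching, but it keeps the comparison explicit.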

Or using dplyr

library(dplyr)
dff %>% group_by(Num) %>% filter(n()==1)
Source: local data frame [3 x 2]
Groups: Num

  Num ID
1   3  C
2   4  D
3   5  E
DatamineR