Remove the rows with duplicate entries in R

Question

I have a data frame like this:

Num <- c(1,1,1,2,2,3,4,5)
ID <- c("A","B","C","A","B","C","D","E")
dff <- data.frame(Num,ID)

I am trying to remove every row whose Num value occurs more than once. This is what I tried:

dff1 <- dff[!duplicated(dff[,1]),]

I get this output, which still keeps the first row of each duplicated group:

  Num ID
1   1  A
4   2  A
6   3  C
7   4  D
8   5  E

But my desired output is

  Num ID
6   3  C
7   4  D
8   5  E

What am I missing here?

You are 'missing' that duplicated returns a vector that is FALSE for the first occurrence of an element and TRUE for each occurrence after that. – Heroka, Dec 16 '15 at 19:40
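
To illustrate the comment, here is a minimal base R sketch built on the dff from the question. Combining the forward pass with fromLast = TRUE is one possible way to flag every member of a repeated group; it is an assumption of this sketch, not something the posts in this thread propose, and the names first_pass, last_pass and dff_unique are made up for illustration.

```r
Num <- c(1, 1, 1, 2, 2, 3, 4, 5)
ID <- c("A", "B", "C", "A", "B", "C", "D", "E")
dff <- data.frame(Num, ID)

# duplicated() is FALSE for the first occurrence of a value and TRUE afterwards,
# so the first row of each repeated group is never flagged
first_pass <- duplicated(dff$Num)

# Scanning from the end flags the earlier occurrences instead
last_pass <- duplicated(dff$Num, fromLast = TRUE)

# A value is repeated if either pass flags it; keep only the unflagged rows
dff_unique <- dff[!(first_pass | last_pass), ]
dff_unique
```

With the sample data this should leave only the rows where Num is 3, 4 or 5.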

score 2 · Accepted Answer · answered Dec 16 '15 at 19:39

You could try:

tab <- table(dff$Num)   # how many times each Num occurs
dff[dff$Num %in% as.numeric(names(tab)[tab == 1]),]
  Num ID
6   3  C
7   4  D
8   5  E
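
A shorter base R variant, offered only as a sketch and not part of the original answer, counts the group size per row with ave() and keeps the rows whose group has exactly one member. The name dff_single is made up for illustration.

```r
Num <- c(1, 1, 1, 2, 2, 3, 4, 5)
ID <- c("A", "B", "C", "A", "B", "C", "D", "E")
dff <- data.frame(Num, ID)

# ave() applies length() within each Num group and returns the result
# aligned with the original rows, i.e. the group size for every row
group_size <- ave(dff$Num, dff$Num, FUN = length)

# Keep the rows whose Num value occurs exactly once
dff_single <- dff[group_size == 1, ]
dff_single
```

This avoids the round trip through table names and as.numeric, at the cost of being a little less explicit about the counts.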

Or using dplyr

library(dplyr)
dff %>% group_by(Num) %>% filter(n()==1)
Source: local data frame [3 x 2]
Groups: Num

  Num ID
1   3  C
2   4  D
3   5  E

answered Dec 16 '15 at 19:39

DatamineR

Nice dplyr solution. I like this. Thanks @DatamineR – Sharath Dec 16 '15 at 19:43
@Sharath you are welcome :-) – DatamineR Dec 16 '15 at 19:51
