How to get a data set with only non-repeating records from a data set which contains unique and duplicate records?

Question

I'm working on a data set and would like to extract all non-repeating records into a new data set. My current data set has both duplicate and non-duplicate records. Is there any way to get only the records that occur exactly once?

I tried the code below, but it gave me all the duplicate records instead.

unique <- Data[!duplicated(Data),]
NewData <- unique[unique$x %in% unique$x[duplicated(unique$x)],]

For example, take the data set below:

x <- c("A","B","B","D","A","C","B","A")
y <- c(1,2,3,4,5,6,7,8)
z <- c(8,7,6,5,4,3,2,1)
Data <- data.frame(x,y,z)

Dataframe:

x y z
A 1 8
B 2 7
B 3 6
D 4 5
A 5 4
C 6 3
B 7 2
A 8 1

What I want is:

x y z
D 4 5
C 6 3

Jilber Urbina · Accepted Answer · score 4 · 2019-07-30T16:09:50.660

Here's a base R alternative:

> Data[Data$x %in% as.character(unique(Data$x)) [table(Data$x)==1], ]
  x y z
4 D 4 5
6 C 6 3

Or with dplyr:

library(dplyr)

# Keep only the rows whose value of x occurs exactly once
Data %>%
  group_by(x) %>%
  filter(n() == 1)
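For comparison, here is a minimal base R sketch of the same filter that avoids counting groups altogether. It rebuilds the question's `Data` and relies on `duplicated()` being run in both directions; the helper name `repeated` is only illustrative and not from the original posts.

```r
# Rebuild the example data from the question
Data <- data.frame(
  x = c("A", "B", "B", "D", "A", "C", "B", "A"),
  y = c(1, 2, 3, 4, 5, 6, 7, 8),
  z = c(8, 7, 6, 5, 4, 3, 2, 1)
)

# duplicated() flags every occurrence after the first;
# with fromLast = TRUE it flags every occurrence before the last.
# Their union marks every row whose x appears more than once.
repeated <- duplicated(Data$x) | duplicated(Data$x, fromLast = TRUE)

# Keep only the rows whose x occurs exactly once
NewData <- Data[!repeated, ]
print(NewData)
```

The attempt in the question keeps rows whose `x` is among the repeated values; this sketch keeps the complement of that set.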

edited Jul 30 '19 at 16:09

answered Jul 30 '19 at 16:01


For the data set I shared, it works completely fine, but when I run it on a different (larger) data set, it does not give the correct result. – Shank Aug 01 '19 at 15:00
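One possible explanation for a mismatch like the one reported in this comment, offered as a guess since the larger data set is not shown: the base R line indexes `unique(Data$x)`, which keeps values in order of first appearance, with `table(Data$x) == 1`, which is in sorted order. In the example data the result happens to come out right, but in general the two orderings need not agree. The sketch below uses a hypothetical data frame `Data2` (not from the post) to show the problem and one way around it, taking the names directly from the table.

```r
# Hypothetical counterexample: the first value seen ("B") is not first
# in sorted order, so unique() and table() disagree on ordering.
Data2 <- data.frame(
  x = c("B", "A", "A", "C", "C"),
  y = 1:5,
  stringsAsFactors = FALSE
)

# Pattern from the answer: unique() gives B, A, C while table() gives
# counts for A, B, C, so the logical index picks the wrong value.
wrong <- Data2[Data2$x %in% as.character(unique(Data2$x))[table(Data2$x) == 1], ]

# Reading the names from the table itself keeps values and counts aligned.
counts <- table(Data2$x)
right <- Data2[Data2$x %in% names(counts)[counts == 1], ]

print(wrong)  # rows with x == "A", which is repeated
print(right)  # the single row with x == "B"
```

If the larger data set still gives unexpected results with the aligned version, it may be worth checking whether "duplicate" should be judged on `x` alone or on whole rows.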

score 2 · Answer 2 · answered Jul 30 '19 at 16:03

And using dplyr:

Data %>% count(x) %>% filter(n==1) %>% left_join(.,Data) %>% select(-n)

Joining, by = "x"
# A tibble: 2 x 3
  x         y     z
  <fct> <dbl> <dbl>
1 C         6     3
2 D         4     5

kstew


You can also use `add_count` so you won't have to `join` after. Like this `Data %>% add_count(x) %>% filter(n == 1) %>% select(-n)` – AntoniosK Jul 30 '19 at 16:15
