I'm working on a data set and would like to extract all non-repeating records into a new data set. My current data set has both duplicate and non-duplicate records. Is there any way to get only the records whose value occurs exactly once?

I tried the code below, but it gave me all the duplicate records instead.

unique <- Data[!duplicated(Data),]
NewData <- unique[unique$x %in% unique$x[duplicated(unique$x)],]

For example, take the data set below:

x <- c("A","B","B","D","A","C","B","A")
y <- c(1,2,3,4,5,6,7,8)
z <- c(8,7,6,5,4,3,2,1)
Data <- data.frame(x,y,z)

Dataframe:

x y z
A 1 8
B 2 7
B 3 6
D 4 5
A 5 4
C 6 3
B 7 2
A 8 1

What I want is:

x y z
D 4 5
C 6 3
Shank

2 Answers


Here's a base R alternative:

> Data[Data$x %in% names(which(table(Data$x) == 1)), ]
  x y z
4 D 4 5
6 C 6 3
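
Another base R sketch, which skips counting altogether: a value of `x` is repeated if `duplicated()` flags it when scanning from the top or from the bottom, so dropping the rows flagged in either direction should leave only the values that occur exactly once. The data frame from the question is rebuilt here so the snippet runs on its own; `keep` and `Singles` are illustrative names, not part of the original post.

```r
# Sample data from the question
Data <- data.frame(
  x = c("A", "B", "B", "D", "A", "C", "B", "A"),
  y = c(1, 2, 3, 4, 5, 6, 7, 8),
  z = c(8, 7, 6, 5, 4, 3, 2, 1)
)

# duplicated() marks every occurrence of a value except the first;
# with fromLast = TRUE it marks every occurrence except the last.
# A value that occurs only once is flagged by neither scan.
keep <- !(duplicated(Data$x) | duplicated(Data$x, fromLast = TRUE))

# Keep only the rows whose x value appears exactly once
Singles <- Data[keep, ]
Singles
```

Because this works on the values themselves rather than on positions in a count table, it does not depend on the order in which the values appear. Passing `Data` instead of `Data$x` to both `duplicated()` calls would compare whole rows rather than a single column.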

Or with dplyr:

library(dplyr)

Data %>%
  group_by(x) %>%
  filter(n() == 1) %>%
  ungroup()
Jilber Urbina
  • For the data set I shared this works perfectly, but on a different (larger) data set it does not give the expected result. – Shank Aug 01 '19 at 15:00

And using dplyr:

Data %>% count(x) %>% filter(n==1) %>% left_join(.,Data) %>% select(-n)

Joining, by = "x"
# A tibble: 2 x 3
  x         y     z
  <fct> <dbl> <dbl>
1 C         6     3
2 D         4     5
kstew
  • You can also use `add_count`, so you won't have to `join` afterwards: `Data %>% add_count(x) %>% filter(n == 1) %>% select(-n)` – AntoniosK Jul 30 '19 at 16:15
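
For readers without dplyr, the same idea of attaching a per-group count to every row and then filtering on it can be sketched in base R with `ave()`. This is only an illustration on the question's sample data; `n` and `Result` are names chosen here.

```r
# Sample data from the question
Data <- data.frame(
  x = c("A", "B", "B", "D", "A", "C", "B", "A"),
  y = c(1, 2, 3, 4, 5, 6, 7, 8),
  z = c(8, 7, 6, 5, 4, 3, 2, 1)
)

# ave() applies FUN within each group of Data$x and returns one value per row,
# so n holds, for every row, how many times that row's x value occurs.
n <- ave(seq_along(Data$x), Data$x, FUN = length)

# Keep the rows whose x value occurs exactly once
Result <- Data[n == 1, ]
Result
```

Since the count is aligned row by row with `Data`, no join or lookup by name is needed afterwards.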