R: Removing duplicate elements in a vector

Question

I have a vector like this:

x = c(1,2,3,4,5,6,4,5,6,7)

> x
 [1] 1 2 3 4 5 6 4 5 6 7

I want to get rid of duplicates and get something like this:

> [1] 1 2 3 7

My attempt

y = x[duplicated(x)]

> y
[1] 4 5 6

> x[x!=y]
[1] 1 2 3 7
Warning message:
In x != y : longer object length is not a multiple of shorter object length
>

What am I doing wrong?

Is this warning something I should worry about?

Is there another way to do this without getting a warning?

Since `y` is a vector here, you need to use `%in%` operator. `x[!(x %in% y)]` — Psidom, May 23 '16 at 01:00
Welcome to SO! Your question is fine, but contains a malapropism. In programming, a 'double' element usually refers to a number stored as a double-precision floating-point value, rather than a 'duplicate'. You'll note that your original tag `double` refers to this sense of the word double, not the one you meant. https://en.wikipedia.org/wiki/Double-precision_floating-point_format – Hugh, May 23 '16 at 01:09
@Hugh: Ah yes, I wasn't sure how to word my problem. Thank you for the useful tip - will keep that in mind! – Zyferion, May 23 '16 at 01:12
Can I ask the context of this question? Partly I suspect an X-Y problem here. — Hugh, May 23 '16 at 01:36
@Hugh: I wanted to get rid of rows in my data frame with duplicate dates — Zyferion, May 23 '16 at 01:58
I swear this is a duplicate but I can't find it right now, but `x[ave(x,x,FUN=length)==1]` — thelatemail, May 23 '16 at 03:12
@thelatemail, maybe [this one](http://stackoverflow.com/q/7854433/4408538) — Joseph Wood, May 23 '16 at 15:10
@thelatemail, I found a couple more: [post1](http://stackoverflow.com/q/13763216/4408538), [post2](http://stackoverflow.com/q/37148567/4408538). — Joseph Wood, May 23 '16 at 15:14
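
For the data-frame case mentioned in the comments (dropping rows whose date appears more than once), a minimal base R sketch might look like the following. The data frame `df` and its columns `date` and `value` are made up for illustration and are not from the original question.

```r
# Hypothetical data: `df`, `date` and `value` are illustrative names
df <- data.frame(
  date  = as.Date(c("2016-05-01", "2016-05-02", "2016-05-02", "2016-05-03")),
  value = c(10, 20, 30, 40)
)

# TRUE for every row whose date occurs more than once
# (the first copy as well as the later ones)
is_dup <- duplicated(df$date) | duplicated(df$date, fromLast = TRUE)

# Keep only the rows whose date is unique
df_unique <- df[!is_dup, ]
print(df_unique)
```

If one row per date should be kept instead, `df[!duplicated(df$date), ]` keeps the first occurrence of each date.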

score 5 · Answer 1 · answered May 23 '16 at 01:33

Beware using consecutive numbers in your tests!

x <- c(1,2,3,4,5,6,4,5,6,7)
x1 <- c(-1, -1, 2, 8, 8, 15)

keep_singles <- function(v){
  v[!(v %in% v[duplicated(v)])] 
}

keep_singles(x)

[1] 1 2 3 7

keep_singles(x1)

[1]  2 15
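
To see why the original `x[x != y]` only appeared to work: `!=` compares position by position and recycles the shorter vector, so each element of `x` is checked against a single element of `y`, not against all of them. With the consecutive values in `x`, the recycled `y` happens to line up with the duplicates. A small sketch with `x1` from above (the names `y1`, `wrong` and `right` are introduced here for illustration):

```r
x1 <- c(-1, -1, 2, 8, 8, 15)
y1 <- x1[duplicated(x1)]       # the duplicated values: -1 and 8

# `!=` recycles y1 to -1 8 -1 8 -1 8 and compares position by position,
# so duplicates that do not line up with the recycled values slip through
wrong <- x1[x1 != y1]

# `%in%` tests each element of x1 against the whole of y1
right <- x1[!(x1 %in% y1)]

print(wrong)   # still contains -1 and 8
print(right)   # 2 and 15
```

No warning is raised for `x1 != y1`, because the length of `x1` (6) is a multiple of the length of `y1` (2), so the absence of a warning does not mean the result is correct.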

Hugh

Ahhh was just going to post this answer in addition to mine, but you beat me to it! Definitely the cleaner way to do it. – Mike H. May 23 '16 at 01:36

Mike H. · Answer 2 · 2016-05-23T01:38:34.733

4

A simple way to do it using base R that doesn't give you a warning message.

Edit: More flexible answer from @Hugh's suggestion

y = as.numeric(names(which(table(x)==1)))

y
[1] 1 2 3 7
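
One difference worth noting, sketched below under the assumption that the input is numeric: `table()` sorts its categories, so this approach returns the unique values in sorted order, whereas the `duplicated`-based approaches keep the original order. The vector `v` is made up for illustration.

```r
v <- c(9, 3, 9, 1)

# table() counts each distinct value; names() returns those values as strings
via_table <- as.numeric(names(which(table(v) == 1)))

# The duplicated()-based filter keeps elements in their original positions
via_dup <- v[!(v %in% v[duplicated(v)])]

print(via_table)  # sorted: 1 3
print(via_dup)    # original order: 3 1
```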

edited May 23 '16 at 01:38

answered May 23 '16 at 01:05

Mike H.

Not quite correct. Using your method on `x <- c(-1, -1, 2, 8, 8, 15)` returns `2 4`. I think you want `x[which(table(x) == 1)]` (*i.e.* the values not the indices). – Hugh May 23 '16 at 01:15
Unfortunately trying that on the original problem I get: `1 2 3 4` – Mike H. May 23 '16 at 01:21
Sorry, my suggestion was wrong, to get the values it would be `as.numeric(names(table(x))[table(x) == 1])`. Unless OP wants the indices, I think your answer is currently incorrect. – Hugh May 23 '16 at 01:24
No worries, I should've stress tested my answer a little more - thanks for pointing that out! I think this works as well `y = as.numeric(names((which(table(x)==1))))`. – Mike H. May 23 '16 at 01:26
That's what comments are for! :-) – Hugh May 23 '16 at 01:36

score 2 · Answer 3 · answered May 23 '16 at 01:58

Here is a way using `duplicated` alone:

x[!(duplicated(x)|duplicated(x, fromLast=TRUE))]
#[1] 1 2 3 7
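
Spelling out the two passes may help: `duplicated(x)` flags the second and later copies of a value, and `duplicated(x, fromLast = TRUE)` flags all but the last copy, so their union flags every copy of a repeated value. A sketch with intermediate names (`fwd`, `bwd`, `singles`) introduced here for illustration:

```r
x <- c(1, 2, 3, 4, 5, 6, 4, 5, 6, 7)

fwd <- duplicated(x)                   # TRUE at positions 7, 8, 9
bwd <- duplicated(x, fromLast = TRUE)  # TRUE at positions 4, 5, 6

# A value is kept only if neither pass flags it
singles <- x[!(fwd | bwd)]
print(singles)
```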

akrun

Nice answer. I use a very similar function for stuff like this... `x[duplicated(x) + duplicated(x, fromLast = TRUE)==0]`. – Joseph Wood May 23 '16 at 03:00

score 0 · Answer 4 · edited May 23 '16 at 09:56

A simple way that stays close to your code: use `[!x %in% y]` rather than `[x != y]`.

x = c(1,2,3,4,5,6,4,5,6,7)

y <- x[duplicated(x)]

z <- x[!x%in%y]

print(z)
[1] 1 2 3 7

Batanichek

answered May 23 '16 at 06:07

Arun kumar mahesh
