
I have a vector like this:

x = c(1,2,3,4,5,6,4,5,6,7)

> x
 [1] 1 2 3 4 5 6 4 5 6 7

I want to remove every value that occurs more than once and get something like this:

[1] 1 2 3 7

My attempt

y = x[duplicated(x)]

> y
[1] 4 5 6

> x[x!=y]
[1] 1 2 3 7
Warning message:
In x != y : longer object length is not a multiple of shorter object length

What am I doing wrong?

Is this warning something I should worry about?

Is there another way to do this without getting a warning?

Hugh
Zyferion
  • Since `y` is a vector here, you need to use the `%in%` operator: `x[!(x %in% y)]` – Psidom May 23 '16 at 01:00
  • Welcome to SO! Your question is fine, but contains a malapropism. In programming, a 'double' element usually refers to a number stored as a double precision floating-point, rather than a 'duplicate'. You'll note that your original tag `double` refers to this sense of the word double, not the one you meant. https://en.wikipedia.org/wiki/Double-precision_floating-point_format – Hugh May 23 '16 at 01:09
  • @Psidom: Thank you, exactly what I needed! – Zyferion May 23 '16 at 01:12
  • @Hugh: Ah yes, I wasn't sure how to word my problem. Thank you for the useful tip - will keep that in mind! – Zyferion May 23 '16 at 01:12
  • Can I ask the context of this question? Partly I suspect an X-Y problem here. – Hugh May 23 '16 at 01:36
  • @Hugh: I wanted to get rid of rows in my data frame with duplicate dates – Zyferion May 23 '16 at 01:58
  • I swear this is a duplicate but I can't find it right now, but `x[ave(x,x,FUN=length)==1]` – thelatemail May 23 '16 at 03:12
  • @thelatemail, maybe [this one](http://stackoverflow.com/q/7854433/4408538) – Joseph Wood May 23 '16 at 15:10
  • @thelatemail, I found a couple more: [post1](http://stackoverflow.com/q/13763216/4408538), [post2](http://stackoverflow.com/q/37148567/4408538). – Joseph Wood May 23 '16 at 15:14
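
For the data frame case mentioned in the comments (dropping rows whose date occurs more than once), the same idea can be applied to a single column. This is only a sketch: the data frame `df` and its columns `date` and `value` are made up for illustration and are not from the question.

```r
# Hypothetical data; the column names are assumptions for this example
df <- data.frame(
  date  = as.Date(c("2016-05-01", "2016-05-02", "2016-05-02", "2016-05-03")),
  value = c(10, 20, 30, 40)
)

# TRUE for every row whose date appears more than once,
# whether it is the first occurrence or a later one
dup <- duplicated(df$date) | duplicated(df$date, fromLast = TRUE)

# Keep only the rows with a date that occurs exactly once
res <- df[!dup, ]
res
```

If the goal is instead to keep the first row for each date, `df[!duplicated(df$date), ]` would do that.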

4 Answers


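Beware of using consecutive numbers in your tests! With values like 1, 2, 3, ..., each value equals its position, so code that returns indices instead of values can appear to work.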

x <- c(1,2,3,4,5,6,4,5,6,7)
x1 <- c(-1, -1, 2, 8, 8, 15)

# Keep only the values that occur exactly once in v
keep_singles <- function(v) {
  v[!(v %in% v[duplicated(v)])]
}

keep_singles(x)

[1] 1 2 3 7

keep_singles(x1)

[1]  2 15
Hugh
  • Ahhh was just going to post this answer in addition to mine, but you beat me to it! Definitely the cleaner way to do it. – Mike H. May 23 '16 at 01:36

A simple way to do it using base R that doesn't give you a warning message.

Edit: More flexible answer from @Hugh's suggestion

y = as.numeric(names(which(table(x)==1)))

y
[1] 1 2 3 7
Mike H.
  • Not quite correct. Using your method on `x <- c(-1, -1, 2, 8, 8, 15)` returns `2 4`. I think you want `x[which(table(x) == 1)]` (*i.e.* the values not the indices). – Hugh May 23 '16 at 01:15
  • Unfortunately trying that on the original problem I get: `1 2 3 4` – Mike H. May 23 '16 at 01:21
  • Sorry, my suggestion was wrong, to get the values it would be `as.numeric(names(table(x))[table(x) == 1])`. Unless OP wants the indices, I think your answer is currently incorrect. – Hugh May 23 '16 at 01:24
  • No worries, I should've stress tested my answer a little more - thanks for pointing that out! I think this works as well `y = as.numeric(names((which(table(x)==1))))`. – Mike H. May 23 '16 at 01:26
  • That's what comments are for! :-) – Hugh May 23 '16 at 01:36
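
The values-versus-indices mix-up discussed in these comments can be seen by inspecting what `which()` returns on a table. The sketch below uses the `x1` vector from the comments; the variable names `counts`, `hits`, and `vals` are just for illustration.

```r
x1 <- c(-1, -1, 2, 8, 8, 15)

counts <- table(x1)          # counts per distinct value: -1, 2, 8, 15
hits <- which(counts == 1)   # named integer vector

unname(hits)                 # positions within the table
# [1] 2 4
names(hits)                  # the values themselves, stored as character
# [1] "2"  "15"

vals <- as.numeric(names(hits))
vals
# [1]  2 15
```

With the original `x`, the singletons 1, 2, 3 happen to sit at table positions 1, 2, 3, which is why the positions looked almost right there.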

Here is a way with `duplicated` alone:

x[!(duplicated(x)|duplicated(x, fromLast=TRUE))]
#[1] 1 2 3 7
akrun
  • Nice answer. I use a very similar function for stuff like this... `x[duplicated(x) + duplicated(x, fromLast = TRUE)==0]`. – Joseph Wood May 23 '16 at 03:00
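
To see why two calls to `duplicated()` are needed, it may help to look at each logical vector separately. This is a small sketch on the question's `x`; the names `fwd`, `bwd`, and `singles` are made up here.

```r
x <- c(1, 2, 3, 4, 5, 6, 4, 5, 6, 7)

fwd <- duplicated(x)                   # TRUE for the 2nd and later occurrences
bwd <- duplicated(x, fromLast = TRUE)  # TRUE for all but the last occurrence

fwd
# [1] FALSE FALSE FALSE FALSE FALSE FALSE  TRUE  TRUE  TRUE FALSE
bwd
# [1] FALSE FALSE FALSE  TRUE  TRUE  TRUE FALSE FALSE FALSE FALSE

# An element is a singleton only if neither scan flags it
singles <- x[!(fwd | bwd)]
singles
# [1] 1 2 3 7
```

Neither vector alone marks every copy of a repeated value, so the two are combined with `|`.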

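A simple way that stays close to your code: index with `[!x %in% y]` rather than `[x != y]`, because `%in%` tests membership while `!=` compares element by element.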

x = c(1,2,3,4,5,6,4,5,6,7)

y <- x[duplicated(x)]

z <- x[!x%in%y]

print(z)
[1] 1 2 3 7
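
As for why `x[x != y]` produced a warning yet still gave the expected result: `!=` compares position by position and recycles the shorter vector, and with this particular `x` the recycled `y` happens to line up with the repeated values. The sketch below makes the recycling visible with `rep_len()` and then shows a vector where the same approach goes wrong; `x3`, `y3`, `wrong`, and `right` are made-up names for illustration.

```r
x <- c(1, 2, 3, 4, 5, 6, 4, 5, 6, 7)
y <- x[duplicated(x)]                 # 4 5 6

# What x != y effectively compares against: y repeated to the length of x
y_recycled <- rep_len(y, length(x))
y_recycled
# [1] 4 5 6 4 5 6 4 5 6 4

# A case where the positional comparison gives the wrong answer
x3 <- c(4, 5, 5, 4, 1)
y3 <- x3[duplicated(x3)]              # 5 4

# suppressWarnings() only hides the recycling warning for this demo
wrong <- suppressWarnings(x3[x3 != y3])
wrong
# [1] 4 5 1

# Membership test: correct regardless of order or length
right <- x3[!(x3 %in% y3)]
right
# [1] 1
```

So the warning is worth taking seriously: the result in the question was right only by coincidence.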
Batanichek
Arun kumar mahesh