63

I have a single list of numeric vectors and I want to combine them into one vector, but I am unable to do that. Adjacent list elements can share one common value (the last value of one vector is the first value of the next), and the final vector should not contain that value twice. Here is an example:

> lst
$`1`
[1] 1 2

$`2`
[1] 2 4 5

$`3`
[1] 5 9 1

I want final result as this

>result
[1] 1 2 4 5 9 1

I tried the following, without worrying about the repetition:

>vec<-vector()
>sapply(lst, append,vec)

and

>vec<-vector()
>sapply(lst, c, vec)

Neither of them worked. Can someone help me with this?

Thanks.

Rachit Agrawal
  • Thanks @JoshO'Brien. But that doesn't remove the duplicate values. – Rachit Agrawal Mar 20 '13 at 04:36
  • @joran I doubt `unique` will be fine-grained enough; `unique` could quite easily remove more than the 1 common element between *adjacent* list components. Note `unique(unlist(lst))` wouldn't give what the OP wants. – Gavin Simpson Mar 20 '13 at 04:38
  • 3
    Are you just saying you don't want any repeated values right next to each other? Or are you saying you just don't want to repeat an element if the end of one vector matches the beginning of the next? Providing more examples could help... – Dason Mar 20 '13 at 04:38
  • 2
    @JoshO'Brien `unique()` would strip one of the `1`s which the OP claims should be in the output. – Gavin Simpson Mar 20 '13 at 04:39
  • 2
    This works, but I'm not sure whether it would still work if a list element had repeated values inside it: `unique(do.call(c, lst))`. According to the gospel of @MatthewLundberg, `rle(do.call(c, lst))$values`. Based on my benchmark, Matthew's solution is faster. – Roman Luštrik Mar 20 '13 at 07:20

6 Answers

59

A solution that is faster than the one proposed above:

vec<-unlist(lst)
vec[which(c(1,diff(vec)) != 0)]
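
To make the indexing above easier to follow, here is a step-by-step sketch using the example list from the question. The intermediate names `d`, `keep` and `result` are only illustrative and not part of the original answer.

```r
lst <- list(c(1, 2), c(2, 4, 5), c(5, 9, 1))

vec <- unlist(lst)      # 1 2 2 4 5 5 9 1
d <- diff(vec)          # 1 0 2 1 0 4 -8 (each value minus the one before it)
keep <- c(1, d) != 0    # the leading 1 makes sure the first element is always kept
result <- vec[keep]     # 1 2 4 5 9 1
result
```

A zero in `d` marks a value that equals its predecessor, so only consecutive repeats are dropped; the trailing `1` survives because it is not next to the first `1`. The `which()` call in the answer is optional here, since a logical vector can be used for indexing directly.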
MartijnVanAttekum
Rachit Agrawal
  • 6
    What is `vec[which(c(1,diff(vec)) != 0)]` for ? – Galaxy Aug 07 '15 at 06:41
  • 4
    but is it faster than the one proposed above? – hedgedandlevered Jan 14 '16 at 17:15
  • 2
    @Galaxy this is to remove consecutive repeats while keeping the repeated elements that are separated by other elements. `diff()` subtracts the previous value from the current one. If `diff(vec)` is equal to zero, that means the current value and the previous one were the same, and this value can be removed. For example, using `lst <- list(c(1,2),c(2,4,5),c(5,9,1))` and `vec<-unlist(lst)`, `vec[which(c(1,diff(vec)) != 0)]` will remove all consecutive repeats, but it will keep the repeated one at the end. – Paul Rougieux Sep 04 '18 at 09:48
25

Another answer using Reduce().

Create the list of vectors:

lst <- list(c(1,2),c(2,4,5),c(5,9,1))

Combine them into one vector

vec <- Reduce(c,lst)
vec
# [1] 1 2 2 4 5 5 9 1

Keep the repeated ones only once:

unique(Reduce(c,lst))
#[1] 1 2 4 5 9

If you want to keep that repeated one at the end, you might want to use `vec[which(c(1,diff(vec)) != 0)]` as in @Rachit's answer.
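
The comments on the question point out that "remove the duplicates" can be read in several ways. The sketch below contrasts three readings on a made-up list, `lst2`, that has a repeat inside one element. `merge_at_boundaries` is a hypothetical helper written for this illustration, not something from the thread or from any package; it drops the first value of a vector only when it equals the last value of the vector before it.

```r
# Hypothetical helper: only merge values shared at the boundary
# between two neighbouring list elements.
merge_at_boundaries <- function(lst) {
  Reduce(function(acc, v) {
    if (length(acc) > 0 && length(v) > 0 && acc[length(acc)] == v[1]) {
      v <- v[-1]  # drop the shared boundary value
    }
    c(acc, v)
  }, lst)
}

lst2 <- list(c(1, 2, 2), c(2, 4), c(4, 1))  # made-up input

all_unique   <- unique(unlist(lst2))        # 1 2 4       (no value appears twice)
no_adjacent  <- rle(unlist(lst2))$values    # 1 2 4 1     (no consecutive repeats)
boundary_only <- merge_at_boundaries(lst2)  # 1 2 2 4 1   (repeat inside an element kept)
```

Which of the three is right depends on what the data mean; the question's example is consistent with both the second and the third reading.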

Paul Rougieux
8

You want rle:

rle(unlist(lst))$values

> lst <- list(`1`=1:2, `2`=c(2,4,5), `3`=c(5,9,1))
> rle(unlist(lst))$values
## 11 21 22 31 32 33 
##  1  2  4  5  9  1 
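
As a small sketch of what `rle` computes here: it splits the vector into runs of equal consecutive values and returns the length and the value of each run. The `unname()` call is an addition for this illustration; it simply drops the `11`, `21`, ... labels that `unlist` builds from the list names.

```r
lst <- list(`1` = 1:2, `2` = c(2, 4, 5), `3` = c(5, 9, 1))

runs <- rle(unname(unlist(lst)))
runs$lengths   # 1 2 1 2 1 1
runs$values    # 1 2 4 5 9 1
```

A length of 2 marks a value that appeared twice in a row, which is exactly the shared value between two neighbouring list elements in this example.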
Matthew Lundberg
  • I was thinking this as well. The one problem I have is that I don't know if they would want to remove repeated values within a list element... – Dason Mar 20 '13 at 04:47
  • 2
    This achieves what I am trying to do. I could also do it with the following option: `vec<-unlist(lst); vec[which(c(1,diff(vec)) != 0)]`. Now I am wondering which is better? – Rachit Agrawal Mar 20 '13 at 04:52
  • That's probably faster as it is doing less work (and is faster on your trivial example, on my machine). Look at the code for `rle`. You might add that as another answer. – Matthew Lundberg Mar 20 '13 at 04:58
  • @MatthewLundberg How did you compute time?? – Rachit Agrawal Mar 20 '13 at 10:00
7

`stack` will do the combining nicely too, and looks more concise (note that it only concatenates the vectors; it does not drop the repeated values):

stack(lst)$values
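
A short sketch of what `stack` returns for the named list from the question, assuming the list elements are plain numeric vectors. The variable names `st` and `deduped` are illustrative only.

```r
lst <- list(`1` = c(1, 2), `2` = c(2, 4, 5), `3` = c(5, 9, 1))

st <- stack(lst)                  # data frame with columns 'values' and 'ind'
st$values                         # 1 2 2 4 5 5 9 1 (repeats still present)
deduped <- rle(st$values)$values  # 1 2 4 5 9 1
```

The `ind` column records which list element each value came from, which can be handy if that information is needed later.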
0mn1
4

Benchmarking the two answers by Rachit and Martijn

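library(purrr)  # provides reduce() and re-exports %>%

# Example input; the original post did not show how `a` was defined
a <- list(c(1, 2), c(2, 4, 5), c(5, 9, 1))

rbenchmark::benchmark(
  "unlist" = {
    vec <- unlist(a)
    # c(1, ...) keeps the first element, as in the answer being benchmarked
    vec[which(c(1, diff(vec)) != 0)]
  },
  "reduce" = {
    a %>% reduce(c) %>% unique
  }
)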

Output:

    test replications elapsed relative user.self sys.self user.child sys.child
2 reduce          100   0.036        3     0.036    0.000          0         0
1 unlist          100   0.012        1     0.000    0.004          0         0

The `unlist` approach clearly beats `reduce` here: it is about three times faster in this run.
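
For anyone wondering how such timings can be obtained without extra packages, here is a rough sketch using base R's `system.time`. The input `big` is randomly generated, made-up data, and `dedup_diff` and `dedup_rle` are hypothetical wrapper names for the `diff` and `rle` approaches from the other answers. Timings will vary by machine, so none are shown.

```r
set.seed(1)
# Made-up data: 10,000 short integer vectors with plenty of consecutive repeats
big <- replicate(1e4, sample(1:5, 3, replace = TRUE), simplify = FALSE)

dedup_diff <- function(l) { v <- unlist(l); v[c(1, diff(v)) != 0] }
dedup_rle  <- function(l) rle(unlist(l))$values

# Both approaches should drop the same consecutive repeats
same_result <- identical(dedup_diff(big), dedup_rle(big))

system.time(for (i in 1:100) dedup_diff(big))
system.time(for (i in 1:100) dedup_rle(big))
```

Checking that the candidates return the same result before timing them avoids comparing functions that do different jobs, as happens when `unique` is benchmarked against a consecutive-repeat filter.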

Prradep
3

Doing it the tidyverse way:

library(tidyverse)
lst %>% reduce(c) %>% unique

This uses the (uncapitalized) reduce version from purrr in combination with pipes. Also note that if the list contains named vectors, the final naming will be different depending on whether unlist or reduce methods are used.
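
The naming difference can be seen on a small made-up list, `named`. Base `Reduce` is used here to keep the sketch free of dependencies; purrr's `reduce` should give the same names in this case, though that is an assumption rather than something shown in the answer.

```r
named <- list(a = c(x = 1, y = 2), b = c(z = 3))  # made-up example

names_unlist <- names(unlist(named))     # "a.x" "a.y" "b.z"
names_reduce <- names(Reduce(c, named))  # "x" "y" "z"
```

`unlist` prefixes each inner name with the name of the list element, while reducing with `c` only sees the inner vectors and keeps their names as they are. Wrapping either result in `unname()` removes the names altogether.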

MartijnVanAttekum