Remove duplicated elements from list

Question

I have a list of character vectors:

my.list <- list(e1 = c("a","b","c","k"),e2 = c("b","d","e"),e3 = c("t","d","g","a","f"))

I'm looking for a function that, for any character appearing more than once across the list's vectors, keeps only its first appearance. (Within each vector, a character can appear only once.)

The result list for this example would therefore be:

res.list <- list(e1 = c("a","b","c","k"),e2 = c("d","e"),e3 = c("t","g","f"))

Note that an entire vector in the list may be eliminated, so the resulting list does not necessarily have the same number of elements as the input list.

score 14 · Accepted Answer · answered Jul 26 '17 at 05:21


We can unlist the list, get a logical vector using duplicated, relist it into the structure of 'my.list', and extract the elements of 'my.list' based on that logical index.

un <- unlist(my.list)
res <- Map(`[`, my.list, relist(!duplicated(un), skeleton = my.list))
identical(res, res.list)
#[1] TRUE
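
To make the steps easier to follow, here is a minimal sketch that spells out the intermediate objects. It reuses my.list from the question; the names keep, keep.list, my.list2, res2 and res2.nonempty are introduced here purely for illustration, and my.list2 is a made-up input. The last lines show one possible way (an assumption, not part of the answer above) to drop vectors that end up empty, a case the question mentions.

```r
my.list <- list(e1 = c("a","b","c","k"), e2 = c("b","d","e"), e3 = c("t","d","g","a","f"))

un <- unlist(my.list)
# TRUE for the first occurrence of each value, FALSE for every later repeat
keep <- !duplicated(un)
# reshape the flat logical vector back into the structure of my.list
keep.list <- relist(keep, skeleton = my.list)
# subset each vector with its matching logical index
res <- Map(`[`, my.list, keep.list)

# hypothetical input in which e2 consists only of values already seen in e1
my.list2 <- list(e1 = c("a","b"), e2 = c("b","a"), e3 = c("c","a"))
res2 <- Map(`[`, my.list2, relist(!duplicated(unlist(my.list2)), skeleton = my.list2))
# e2 is now character(0); Filter(length, ...) keeps only the non-empty vectors
res2.nonempty <- Filter(length, res2)
```

Whether empty vectors should be kept as character(0) or dropped depends on what the downstream code expects; the Filter step is optional.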


akrun


score 4 · Answer 2 · answered Jul 26 '17 at 13:48

Here is an alternative using mapply with setdiff and Reduce.

# make a copy of my.list
res.list <- my.list
# take set difference between contents of list elements and accumulated elements;
# SIMPLIFY = FALSE keeps the result a list even when the differences have equal lengths
res.list[-1] <- mapply("setdiff", res.list[-1],
                       head(Reduce(c, my.list, accumulate=TRUE), -1),
                       SIMPLIFY = FALSE)

Keeping the first element of the list as is, we compute on the subsequent elements, pairing each one with the cumulative vector of all elements that precede it. Reduce with c and the accumulate=TRUE argument produces this list of cumulative vectors. head(..., -1) drops the final list item, which contains all elements, so that the lengths align.

This returns

res.list
$e1
[1] "a" "b" "c" "k"

$e2
[1] "d" "e"

$e3
[1] "t" "g" "f"

Note that in Reduce, we could replace c with function(x, y) unique(c(x, y)) and accomplish the same ultimate output.
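
As a rough illustration of what Reduce contributes, the sketch below builds both accumulated lists and checks that they lead to the same result. It reuses my.list from the question and uses Map, which behaves like mapply with SIMPLIFY = FALSE; the names acc, acc.u, res.a and res.b are introduced only for this example.

```r
my.list <- list(e1 = c("a","b","c","k"), e2 = c("b","d","e"), e3 = c("t","d","g","a","f"))

# running concatenation: e1, then e1 + e2, then e1 + e2 + e3
acc <- Reduce(c, my.list, accumulate = TRUE)
lengths(acc)    # 4 7 12

# same idea, but each accumulated vector is de-duplicated as it grows
acc.u <- Reduce(function(x, y) unique(c(x, y)), my.list, accumulate = TRUE)
lengths(acc.u)  # 4 6 9

# each element of my.list[-1] is compared with everything that came before it
res.a <- c(my.list[1], Map(setdiff, my.list[-1], head(acc, -1)))
res.b <- c(my.list[1], Map(setdiff, my.list[-1], head(acc.u, -1)))
identical(res.a, res.b)  # TRUE
```

The unique variant keeps the accumulated vectors shorter, which may matter for long lists with many repeated values, but setdiff ignores duplicates in its second argument either way.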

score 1 · Answer 3 · answered Jul 19 '21 at 08:21

I found the solutions here too complex to follow and looked for a simpler technique. Suppose you have the following list.

my_list <- list(a = c(1,2,3,4,5,5), b = c(1,2,2,3,3,4,4),
                d = c("Mary", "Mary", "John", "John"))

The following much simpler piece of code removes the duplicates within each vector.

sapply(my_list, unique)

You will end up with the following.

$a
[1] 1 2 3 4 5

$b
[1] 1 2 3 4

$d
[1] "Mary" "John"

There is beauty in simplicity!

This is not what was asked by the OP. – KarthikS Mar 01 '22 at 20:17
