In Python, how do you remove duplicates from one or multiple lists?

Question

For example, if I had:

a = ["apples", "bananas", "cucumbers", "bananas"]

How could I remove the duplicate "bananas" so that:

a = ["apples", "bananas", "cucumbers"]

Also, if I had:

a = ["apples", "bananas", "cucumbers"]

b = ["pears", "apples", "watermelons"]

How could I remove the duplicate "apples" from both lists so that:

a = ["bananas", "cucumbers"]

b = ["pears", "watermelons"]

kindall · Answer 1 · 2015-12-14T03:10:01.237

5

The set-based solutions don't retain the order of the items. The following keeps the items in order and deletes all but the first occurrence of each, using an auxiliary set to keep track of which items have already been seen.

seen = set()
a = [seen.add(item) or item for item in a if item not in seen]
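
For comparison, here is a minimal sketch of a different technique that also keeps the first occurrence of each item in order. It assumes Python 3.7 or later, where dict preserves insertion order, and hashable items. The name deduped is only illustrative.

```python
a = ["apples", "bananas", "cucumbers", "bananas"]

# dict keys are unique and (in Python 3.7+) keep insertion order,
# so only the first occurrence of each item survives.
deduped = list(dict.fromkeys(a))
print(deduped)  # ['apples', 'bananas', 'cucumbers']
```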

If you want to reuse the same list object, you can do that this way:

seen = set()
a[:] = (seen.add(item) or item for item in a if item not in seen)
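
To illustrate why reusing the list object can matter, the sketch below binds a second name to the same list before deduplicating. The names alias, c and other are hypothetical and only serve the demonstration.

```python
a = ["apples", "bananas", "cucumbers", "bananas"]
alias = a  # a second name bound to the same list object

seen = set()
# Slice assignment replaces the contents of the existing list in place.
a[:] = (seen.add(item) or item for item in a if item not in seen)

print(alias)       # ['apples', 'bananas', 'cucumbers']
print(alias is a)  # True

# By contrast, plain assignment rebinds the name to a new list,
# so other references still see the old contents.
c = ["x", "x"]
other = c
c = list(dict.fromkeys(c))
print(other)  # ['x', 'x']
```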

edited Dec 14 '15 at 03:10

answered Dec 14 '15 at 03:00

kindall


You may also want to modify the current list rather than replace it, in case more than one variable references the original one. Try `aux = [...]; for i in range(len(a)): a.pop(); a += aux` – Dleep Dec 14 '15 at 03:06
I'd use a slice assignment for that. I'll add that as example. – kindall Dec 14 '15 at 03:09
oh yeah, good old `a[::]` – Dleep Dec 14 '15 at 03:10
the OP never mentioned retaining order. – Corey Goldberg Dec 14 '15 at 03:16
3

True, but he didn't mention it wasn't important either. If it is important, here's a solution. – kindall Dec 14 '15 at 03:32

macabeus · Answer 2 · 2015-12-14T03:00:36.783

3

Use the built-in set type

a = ["apples", "bananas", "cucumbers", "bananas"]
a = list(set(a))
print(a)

In the second case, use list comprehensions

a = ["apples", "bananas", "cucumbers"]
b = ["pears", "apples", "watermelons"]

r = [i for i in a if i not in b] + [i for i in b if i not in a]    
print(r)
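
If the goal is to end up with two separate lists, as in the question, one possible sketch is to compute the shared items once and filter each list against them. The name shared is an assumption for illustration. Building the set first also avoids rescanning the other list for every item.

```python
a = ["apples", "bananas", "cucumbers"]
b = ["pears", "apples", "watermelons"]

shared = set(a) & set(b)  # items present in both lists

# Filter each list against the shared items; order is preserved.
a = [item for item in a if item not in shared]
b = [item for item in b if item not in shared]

print(a)  # ['bananas', 'cucumbers']
print(b)  # ['pears', 'watermelons']
```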

edited Dec 14 '15 at 03:00

answered Dec 14 '15 at 02:54

macabeus


Erik Godard · Answer 3 · 2015-12-14T03:03:02.313

The key to doing this is using Python's set.

In Python, a set is a data structure in which every item is unique.
If you call set(list), with a list as a parameter, you will get a set that contains all of the elements in the list, with the duplicates removed.
You can then convert this back into a list by calling list().

So, in your first example, you can write

a = list(set(a))

There are a couple of other methods in set that are useful.

Intersection - Calling set1.intersection(set2) returns a set with all of the objects that are in both set1 and set2.
Difference - Calling set1.difference(set2) returns a set with all of the elements in set1 that are not in set2.

So, in your second example, you can write

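set1 = set(a).intersection(set(b))  # Get the elements that are in both lists
set2 = set(a).difference(set1)  # Get the elements that are in a but not in b
a = list(set2)  # Convert back to a list (order is not preserved)
b = list(set(b).difference(set1))  # Likewise for b: elements in b but not in a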
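
As a small worked example of the set methods described above, the sketch below prints each result for the lists from the question. The variable names are illustrative, and sorted() is used only to make the printed order predictable, since sets are unordered. The method forms accept any iterable, so a list can be passed directly.

```python
a = ["apples", "bananas", "cucumbers"]
b = ["pears", "apples", "watermelons"]

common = set(a).intersection(b)          # in both a and b
only_a = set(a).difference(b)            # in a but not in b
only_b = set(b).difference(a)            # in b but not in a
either = set(a).symmetric_difference(b)  # in exactly one of the two

print(sorted(common))  # ['apples']
print(sorted(only_a))  # ['bananas', 'cucumbers']
print(sorted(only_b))  # ['pears', 'watermelons']
print(sorted(either))  # ['bananas', 'cucumbers', 'pears', 'watermelons']
```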

along these same lines... you can also use a set comprehension to build a new unique list. {x for x in a} — Corey Goldberg, Dec 14 '15 at 03:33
The `set(b)` is not needed, the point of `intersection` etc.. is you can pass any iterable — Padraic Cunningham, Dec 15 '15 at 01:06

score 1 · Answer 4 · answered Dec 14 '15 at 02:55

You can just use set():

a = ["apples", "bananas", "cucumbers", "bananas"]

print(list(set(a)))

answered Dec 14 '15 at 02:55

Simon


score 0 · Answer 5 · answered Dec 14 '15 at 03:08

You can use a set object to record the elements that have already been seen. Like this:

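def handle_duplicate(*lsts):
    # Keep only the first occurrence of each element across all lists.
    s = set()  # elements seen so far, in any list
    result = []
    for lst in lsts:
        no_dup_lst = []
        for ele in lst:
            if ele in s:
                continue  # already seen earlier: skip it
            s.add(ele)
            no_dup_lst.append(ele)
        result.append(no_dup_lst)
    return result

a = ["apples", "bananas", "cucumbers"]
b = ["pears", "apples", "watermelons"]

# "apples" stays in a (first occurrence) and is dropped from b.
a, b = handle_duplicate(a, b)
print(a)
print(b)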
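
If an item shared between lists should disappear from every list, as in the question's second example, a possible variant is sketched below. It is an assumption about the intended behavior, not part of the function above: the hypothetical helper drop_shared counts how many lists each item appears in and keeps only items found in exactly one list. Repeats inside a single list are left untouched.

```python
from collections import Counter

def drop_shared(*lists):
    # Count in how many lists each item appears; set(lst) makes sure
    # a repeat inside one list is counted only once.
    counts = Counter(item for lst in lists for item in set(lst))
    # Keep items that appear in exactly one of the lists.
    return [[item for item in lst if counts[item] == 1] for lst in lists]

a = ["apples", "bananas", "cucumbers"]
b = ["pears", "apples", "watermelons"]

a, b = drop_shared(a, b)
print(a)  # ['bananas', 'cucumbers']
print(b)  # ['pears', 'watermelons']
```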
