Removing duplicates from list of lists by using list comprehension

Question

I was curious whether you could remove duplicates from a list of lists and return the unique items as a single list. I was trying this:

def do_list( lists ):
    res = [ [ one for one in temp if one not in res ] for temp in lists ]
    return res

So for example, if:

lists = [ [ "a","b","c" ],[ "d","a" ],[ "c","a","f" ] ]

the result should be:

[ "a","b","c","d","f" ]

But it gives me an error saying that I reference the variable res before assignment.

score 6 · Answer 1 · answered Mar 13 '16 at 01:40


You could do this:

import itertools
set(itertools.chain.from_iterable(lists))

set removes all duplicates; the itertools.chain.from_iterable call inside it just flattens your list of lists into a single sequence of items. Wrap the result in list() if you need a list back.
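A minimal sketch of that approach applied to the example from the question. The names flat and uniques are only illustrative, and the sorted() call is an assumption about wanting a stable order, since a set has none:

```python
import itertools

lists = [["a", "b", "c"], ["d", "a"], ["c", "a", "f"]]

# chain.from_iterable yields every item of every sublist, one after another
flat = itertools.chain.from_iterable(lists)

# set() drops duplicates but has no defined order, so sort to get a stable list
uniques = sorted(set(flat))
print(uniques)  # ['a', 'b', 'c', 'd', 'f']
```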

answered Mar 13 '16 at 01:40

Pythonista



Or `set(itertools.chain(*lists))`, a bit fewer characters. – Mikhail Gerasimov Mar 13 '16 at 01:44
Thank you! So is it impossible to do by using just plain list comprehension? – TheTask1337 Mar 13 '16 at 01:51

score 2 · Accepted Answer · edited Oct 10 '18 at 04:31

You get an error because you're referencing res inside the comprehension. This doesn't work, since res is only available after the expression is finished.
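The failure can be reproduced with the function from the question. This is only a sketch: the exact exception class and message vary between Python versions, so it catches NameError, which also covers UnboundLocalError. The failed flag is just an illustrative name:

```python
def do_list(lists):
    # res is looked up while the comprehension is still running,
    # i.e. before the assignment to res has happened
    res = [[one for one in temp if one not in res] for temp in lists]
    return res


lists = [["a", "b", "c"], ["d", "a"], ["c", "a", "f"]]

try:
    do_list(lists)
    failed = False
except NameError as exc:  # UnboundLocalError is a subclass of NameError
    failed = True
    print(type(exc).__name__, exc)
```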

As I'm a curious sort, and because the title asks "Removing duplicates from list of lists by using list comprehension", I wanted to see if you can do this using only a list comprehension, and not by cheating such as using itertools :p

And here's how:

>>> lists = [ [ "a","b","c" ],[ "d","a" ],[ "c","a","f" ] ]
>>> lists2 = sorted(sum(lists, []))
>>> [ item for i, item in enumerate(lists2) if i == 0 or lists2[i - 1] != item ]
['a', 'b', 'c', 'd', 'f']

For more insanity, you can combine them on a single line, but you'd have to repeat the sum() and sorted() calls. I couldn't bring myself to write such ugly code ;-)

sum(lists, []) flattens the list: it adds up (with the + operator) all the items in lists, starting from the empty list [].
sorted() sorts the flattened list. This is needed because each item is only compared with the one directly before it.
The if clause keeps an item only when it is the first one or differs from the previous item.
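The three steps above can be traced one at a time. This is a sketch with illustrative variable names (flat, ordered, uniques) that do not appear in the answer itself:

```python
lists = [["a", "b", "c"], ["d", "a"], ["c", "a", "f"]]

# sum() with a [] start value concatenates: [] + lists[0] + lists[1] + lists[2]
flat = sum(lists, [])
# flat == ['a', 'b', 'c', 'd', 'a', 'c', 'a', 'f']

# sorting puts equal items next to each other
ordered = sorted(flat)
# ordered == ['a', 'a', 'a', 'b', 'c', 'c', 'd', 'f']

# keep an item only if it is the first one or differs from its predecessor
uniques = [item for i, item in enumerate(ordered) if i == 0 or ordered[i - 1] != item]
print(uniques)  # ['a', 'b', 'c', 'd', 'f']
```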

But it's ugly and un-Pythonic. For the love of Guido, use Pythonista's answer (or some variation thereof)!

@TheTask1337 Not sure if "beautiful" is the right word to describe this sort of code ;-) It looks like the sort of thing a Perl programmer would do! :p Like I said, it's merely an educational exercise. Please don't actually use it ;-) — Martin Tournoij, Mar 13 '16 at 02:07

zondo · Answer 3 · 2016-03-13T02:05:40.580


res isn't created until the entire list comprehension has been evaluated. You can use a set to remove duplicates:

res = list(set(sum(lists, [])))

If you want it sorted:

res = sorted(set(sum(lists, [])))

If you want the items in the order they first appear, a list comprehension is probably not the best way. Instead, do this:

res = []
for temp in lists:
    for one in temp:
        # keep only the first occurrence of each item
        if one not in res:
            res.append(one)
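If a flat result in first-seen order is the goal, two other options can be sketched. Neither comes from the answers in this thread. The first relies on dict keys keeping insertion order, which is guaranteed from Python 3.7. The second is a comprehension that depends on a side effect, so it is arguably no prettier than a loop. The names ordered_uniques, seen and via_comprehension are illustrative:

```python
from itertools import chain

lists = [["a", "b", "c"], ["d", "a"], ["c", "a", "f"]]

# dict keys are unique and keep insertion order (guaranteed from Python 3.7)
ordered_uniques = list(dict.fromkeys(chain.from_iterable(lists)))
print(ordered_uniques)  # ['a', 'b', 'c', 'd', 'f']

# Comprehension variant: set.add() returns None, so for an unseen item the
# `or` clause records it as a side effect and the condition evaluates to True
seen = set()
via_comprehension = [x for sub in lists for x in sub if not (x in seen or seen.add(x))]
print(via_comprehension)  # ['a', 'b', 'c', 'd', 'f']
```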

edited Mar 13 '16 at 02:05

answered Mar 13 '16 at 01:43

zondo


IMHO using `sum(lists, [])` is a much more obvious way to create a flattened list... – Martin Tournoij Mar 13 '16 at 02:02
@Carpetsmoker: It may be obvious, but maybe I don't have a substantial amount of intelligence. Thanks for mentioning it. – zondo Mar 13 '16 at 02:07
