I was reading "Combining two lists and removing duplicates, without removing duplicates in original list", but my need goes beyond that. I have at least 30 lists and I need the union of all of them, without duplicates. My first attempt was to use + to concatenate all the members into one big list and then use set to remove the duplicates, but I'm not sure this is the best solution:

Edit - Adding samples:

list_a = ['abc','bcd','dcb']
list_b = ['abc','xyz','ASD']
list_c = ['AZD','bxd','qwe']
big_list = list_a + list_b + list_c
print(list(set(big_list)))  # Prints e.g. ['abc', 'qwe', 'bcd', 'xyz', 'dcb', 'ASD', 'AZD', 'bxd'] (order is arbitrary)

My real question is: is this the best way to combine them?

X3MBoy

4 Answers


If I understand correctly what you are trying to do, you can use the set.update method with an arbitrary number of iterable arguments.

>>> lists = [[1,2,3], [3,4,5], [5,6,7]]
>>> result = set()
>>> result.update(*lists)
>>> 
>>> result
{1, 2, 3, 4, 5, 6, 7}

edit: with your sample data:

>>> list_a = ['abc','bcd','dcb']
>>> list_b = ['abc','xyz','ASD']
>>> list_c = ['AZD','bxd','qwe']
>>> 
>>> result = set()
>>> result.update(list_a, list_b, list_c)
>>> result
{'ASD', 'xyz', 'qwe', 'bxd', 'AZD', 'bcd', 'dcb', 'abc'}
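
With 30 or so lists it is usually easier to keep them in one container and unpack it. The following is a minimal sketch, assuming the lists are collected in a hypothetical `all_lists` variable. Note that `set.update` modifies the set in place and returns `None`, so its return value should not be assigned as the result:

```python
# all_lists is a hypothetical container standing in for the 30+ real lists.
all_lists = [
    ['abc', 'bcd', 'dcb'],
    ['abc', 'xyz', 'ASD'],
    ['AZD', 'bxd', 'qwe'],
]

result = set()
returned = result.update(*all_lists)  # in-place; update() itself returns None

# sorted() gives a deterministic list, since set iteration order is arbitrary.
unique_sorted = sorted(result)
print(returned)
print(unique_sorted)
```

Sorting is optional; it is only used here so that the printed list does not depend on set ordering.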
timgeb
    This works perfectly. The only thing I added was `print(list(result))`, because plain `print(result)` gives me `set(['abc', 'qwe', 'bcd', 'xyz', 'dcb', 'ASD', 'bxd'])` on screen. – X3MBoy Jan 02 '18 at 15:22

Use set.union(set1, set2, set3, ..).

>>> l1 = [1,2,3]
>>> l2 = [2,3,4]
>>> l3 = [3,4,5]
>>> set.union(*[set(x) for x in (l1, l2, l3)])
{1, 2, 3, 4, 5}

More compact (works for both Py2 and Py3, Thanks @Lynn!):

>>> set.union(*map(set, (l1, l2, l3)))
set([1, 2, 3, 4, 5])
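
One detail worth noting, offered as a sketch rather than a rule: when called through the class, `set.union` needs its first argument to be an actual set, while the remaining arguments can be any iterables. So converting only the first list should be enough:

```python
l1 = [1, 2, 3]
l2 = [2, 3, 4]
l3 = [3, 4, 5]

# Only the first argument has to be a set; the rest may be plain lists.
combined = set.union(set(l1), l2, l3)
print(combined == {1, 2, 3, 4, 5})

# Passing a list first fails, because union is a method of set objects.
try:
    set.union(l1, l2, l3)
    first_arg_error = None
except TypeError as exc:
    first_arg_error = exc
print(type(first_arg_error).__name__)
```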
UltraInstinct
    Both of your snippets actually work for both Python 2 and 3. Your second snippet is missing a `)`. – Lynn Jan 02 '18 at 18:59

One approach using set.union has already been mentioned, although it first maps each list to a set instance.

As an alternative, the explicit set mapping can be omitted: set.union, much like set.update (the approach covered in the accepted answer), also takes an arbitrary number of iterable arguments, so it can be invoked directly on an empty set with the provided lists.

>>> list_a = ['abc','bcd','dcb']
>>> list_b = ['abc','xyz','ASD']
>>> list_c = ['AZD','bxd','qwe']

>>> result = set().union(list_a, list_b, list_c)
>>> result
{'ASD', 'xyz', 'qwe', 'bxd', 'AZD', 'bcd', 'dcb', 'abc'}
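
Calling `union` on an empty set also copes with a variable number of lists, including none at all. A minimal sketch, assuming the lists live in a hypothetical `all_lists` container:

```python
# all_lists is a hypothetical container standing in for the 30+ real lists.
all_lists = [
    ['abc', 'bcd', 'dcb'],
    ['abc', 'xyz', 'ASD'],
    ['AZD', 'bxd', 'qwe'],
]

# Unpack the container; union() returns a new set and leaves the inputs alone.
result = set().union(*all_lists)

# With no lists at all, the call still works and yields an empty set.
empty_result = set().union(*[])

print(sorted(result))
print(empty_result == set())
```

The class-level form `set.union(*map(set, lists))` would raise a `TypeError` for an empty container, because there is no first set to call the method on.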
dfrib

What you could do is create a function that accepts any number of lists, flattens them and returns the union:

from itertools import chain

def union_lists(*iterables):
    union = []
    lookup = set()

    flattened = chain.from_iterable(iterables)

    for item in flattened:
        if item not in lookup:
            lookup.add(item)
            union.append(item)

    return union

The benefit of the above function is that it preserves the order in which items are first seen, unlike a set(), which is unordered. It still uses a set() to check whether an item has already been added, which is O(1) on average, but stores the results in a list, since lists are ordered.

It also flattens the lists lazily with itertools.chain.from_iterable, so the whole pass is O(n) in the total number of items.

Then you can simply run this function on as many lists as you want:

>>> list_a = ['abc','bcd','dcb']
>>> list_b = ['abc','xyz','ASD']
>>> list_c = ['AZD','bxd','qwe']
>>> print(union_lists(list_a, list_b, list_c))
['abc', 'bcd', 'dcb', 'xyz', 'ASD', 'AZD', 'bxd', 'qwe']
>>> list_d = ['bcd', 'AGF', 'def']
>>> print(union_lists(list_a, list_b, list_c, list_d))
['abc', 'bcd', 'dcb', 'xyz', 'ASD', 'AZD', 'bxd', 'qwe', 'AGF', 'def']
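
On Python 3.7 and later, where dicts are guaranteed to keep insertion order, the same order-preserving union can probably be written more compactly with `dict.fromkeys`. This is an alternative sketch, not the function above, and `ordered_union` is a hypothetical name:

```python
from itertools import chain


def ordered_union(*iterables):
    """Return unique items in first-seen order (assumes Python 3.7+ dicts)."""
    # dict keys are unique and keep insertion order, so duplicates are dropped
    # while the first occurrence of each item fixes its position.
    return list(dict.fromkeys(chain.from_iterable(iterables)))


list_a = ['abc', 'bcd', 'dcb']
list_b = ['abc', 'xyz', 'ASD']
list_c = ['AZD', 'bxd', 'qwe']
print(ordered_union(list_a, list_b, list_c))
```

As with the set-based version, the items need to be hashable for this to work.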
RoadRunner