Comparing Multiple Lists Python

Question

I'm trying to compare multiple lists. However, the lists aren't labeled... normally. I'm using a while loop to make a new list each time and label them accordingly. So, for example, if the while loop runs 3 times, it will make list1, list2 and list3. Here is the snippet of the code that fills each list:

for link in links:
    print('*', link.text)
    locals()['list{}'.format(str(i))].append(link.text)

I want to compare the strings in all of the lists at once, then print out the strings they have in common.

I feel like I'll be using something like this, but I'm not 100% sure.

lists = [list1, list2, list3, list4, list5, list6, list7, list8, list9, list10]
common = list(set().union(*lists).intersection(Keyword))

This whole anonymous `locals()` business is yucky. There is no need for it. If you really need labels associated with something, use a `dict`, but this just looks like you are creating a list of lists – jdi, Apr 05 '13 at 03:32
I also don't understand the use for a label at all. Why do you need an arbitrary "listN" label? – jdi, Apr 05 '13 at 03:43

score 3 · Answer 1 · edited May 23 '17 at 12:31

Rather than directly modifying locals() (generally not a good idea), use a defaultdict as a container. This data structure allows you to create new key-value pairs on the fly rather than relying on a method which is sure to lead to a NameError at some point.

from collections import defaultdict

i = ...

link_lists = defaultdict(list)
for link in links:
    print('*', link.text)
    link_lists[i].append(link.text)
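
To make the role of `i` concrete, here is a minimal sketch of how an outer loop might fill the container. The `pages` data and the shape of the while loop are assumptions for illustration (the question does not show the outer loop), and plain strings stand in for `link.text`.

```python
from collections import defaultdict

# Hypothetical stand-in data: one list of link texts per pass of the outer loop.
pages = [
    ["Home", "About", "Contact"],
    ["Home", "Blog", "Contact"],
    ["Home", "Contact", "Careers"],
]

link_lists = defaultdict(list)

i = 0
while i < len(pages):               # assumed shape of the asker's while loop
    for text in pages[i]:           # plays the role of `link.text`
        link_lists[i].append(text)  # key i is created on first access
    i += 1

print(dict(link_lists))
```

No `list1`, `list2`, ... names are needed: the loop counter itself is the label.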

To find the intersection of all of the lists:

all_lists = list(link_lists.values())
common_links = set(all_lists[0]).intersection(*all_lists[1:])

In Python 2.6+, you can pass multiple iterables to set.intersection. This is what the star-args do here.

Here's an example of how the intersection will work:

>>> from collections import defaultdict
>>> c = defaultdict(list)
>>> c[9].append("a")
>>> c[0].append("b")
>>> all_lists = list(c.values())
>>> set(all_lists[0]).intersection(*all_lists[1:])
set()
>>> c[0].append("a")
>>> all_lists = list(c.values())
>>> set(all_lists[0]).intersection(*all_lists[1:])
{'a'}
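
The same idea can be wrapped in a small helper. This is only a sketch, not part of the original answer: `common_strings` is a hypothetical name, and returning an empty set when there are no lists at all is an assumption about the desired behavior.

```python
def common_strings(lists):
    """Return the set of items present in every list in `lists`."""
    lists = list(lists)
    if not lists:
        return set()  # assumption: no lists means nothing in common
    first, *rest = lists
    # set.intersection accepts any number of iterables; *rest unpacks them.
    return set(first).intersection(*rest)


a = ["x", "y", "z"]
b = ["y", "z"]
c = ["z", "y", "w"]

# The star-args form and the explicit form give the same result.
assert common_strings([a, b, c]) == set(a).intersection(b, c)
print(common_strings([a, b, c]))
```

With a `defaultdict` like the one above, the call would be `common_strings(link_lists.values())`.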

edited May 23 '17 at 12:31 by Community · answered Apr 05 '13 at 02:52 by Jon Gauthier

I'm using 3.3 and I keep getting the error TypeError: 'dict_values' object does not support indexing. It's coming from the common_links part. – user1985351 Apr 06 '13 at 16:32
@user1985351: Just updated the code. Python 3's `.values()` returns a view of the dictionary values rather than a snapshot of its values at the time of the `.values()` call. We can convert this view object to a list using `list`. – Jon Gauthier Apr 06 '13 at 16:43
Ok the error is gone. Now running the program, when I call `print(common_links)` it returns only `set()` – user1985351 Apr 06 '13 at 16:59
Are there any links whose text actually exists in all categories? I've tested this code locally with some dummy data and it works fine. – Jon Gauthier Apr 06 '13 at 17:21
I just added an example. Is this what you want the code to be doing? – Jon Gauthier Apr 06 '13 at 17:47
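
When the intersection comes back empty, as in the comments above, it can help to see how many lists each string actually appears in. The following is a rough diagnostic sketch; the sample data and the helper name `coverage` are made up for illustration.

```python
from collections import Counter


def coverage(list_of_lists):
    """Count, for each string, how many of the lists contain it."""
    counts = Counter()
    for lst in list_of_lists:
        counts.update(set(lst))  # set() so duplicates within one list count once
    return counts


# Hypothetical data: no string appears in all three lists.
link_lists = {
    0: ["Home", "About"],
    1: ["Home", "Blog"],
    2: ["Blog", "About"],
}

counts = coverage(link_lists.values())
total = len(link_lists)
for text, n in counts.most_common():
    print(f"{text!r} appears in {n} of {total} lists")

# Strings present in every list are exactly the intersection.
common = {text for text, n in counts.items() if n == total}
print(common)
```

If no string reaches the full count, an empty `set()` is the correct answer rather than a bug.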

Juan Carlos Moreno · Answer 2 · 2013-04-05T03:39:21.763

You have several options:

option a)

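Use itertools to get a Cartesian product. This is quite nice because it's an iterator, so the combinations are generated lazily:

import itertools

a = ["A", "B", "C"]
b = ["A", "C"]
c = ["C", "D", "E"]

# Print every value that is the same in all three positions of a combination.
for aval, bval, cval in itertools.product(a, b, c):
    if aval == bval == cval:
        print(aval)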

option b)

Use sets (recommended):

 all_lists = []
 # Stand-in for your while loop: build one list per pass.
 for lst in lists:         # This is my guess at how your loop runs.
     current_list = [link.text for link in links]  # link.text, as in the question
     all_lists.append(current_list)  # amortized O(1) operation

 # intersection() with no arguments returns a copy, so a single list
 # needs no special case; only an empty all_lists does.
 if all_lists:
      result_set = set(all_lists[0]).intersection(*all_lists[1:])
 else:
      result_set = set()

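Using sets will be faster, however: the product approach examines every combination of elements, while the set intersection takes time roughly proportional to the total length of the lists.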
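
To check that both options agree, and to see where the extra work in option a comes from, here is a small sketch using the lists from option a. The counter `tuples_examined` is added purely for illustration.

```python
import itertools

a = ["A", "B", "C"]
b = ["A", "C"]
c = ["C", "D", "E"]

# Option a: walk the full Cartesian product and keep values equal in all three.
tuples_examined = 0
product_result = set()
for aval, bval, cval in itertools.product(a, b, c):
    tuples_examined += 1
    if aval == bval == cval:
        product_result.add(aval)

# Option b: a single set intersection.
set_result = set(a).intersection(b, c)

print(tuples_examined)               # 3 * 2 * 3 = 18 combinations
print(product_result == set_result)  # both approaches find the same strings
```

The number of combinations is the product of the list lengths, so it grows quickly as more lists are added.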
