Python: Compare elements in a list to each other

Question

I'm currently looking for a way to compare elements of a list to one another going from left to right.

Here is my list:

mylist = [[15], [14, 15], [19, 20], [13], [3], [65, 19], [19, 20, 31]]

I need to compare the first element to all others and check if any of the values match, same thing for the second and third element and so on until it reaches the end of the list. If there is a match, I need to print it out. I need to print out the value that has match and the index that it matched at.

For example.

index 0 matched with index 1 for value 15.
index 2 matched with index 5 for value 19
index 2 matched with index 6 for value 19
index 2 matched with index 6 for value 20
index 5 matched with index 6 for value 19.

How can I go about doing this?

@kroolik, thanks for the quick reply. I was thinking maybe I should convert each element in the list to a set. And then compare each set to each other using intersect. If there is a matching value, then I'd print the match value(s) and the index at which it was matched at. What do you think of that approach? Am I far off? or close? Thanks for your help. — deadlock, Feb 22 '14 at 18:11
@kroolik, perhaps this might help me? http://stackoverflow.com/questions/16603282/how-to-compare-each-item-in-a-list-with-the-rest-only-once — deadlock, Feb 22 '14 at 18:13
You can try the @arshajii 's solution - very efficient, easy to implement, and pretty straightforward to understand. — Maciej Gol, Feb 22 '14 at 18:21
@kroolik I'm going to try that out, also take a look at zhangxaochen 's solution. Looks very clean! — deadlock, Feb 22 '14 at 18:27

arshajii · Answer 1 · 2014-02-22T18:36:33.950

The naive solution is to loop over every pair, which is slow. However, you can do something along the lines of this:

Create a dict that will map ints (elements in your nested list) to lists containing the indices of the lists in your master.
Loop over the master list, and for each sublist, add its index to the spot corresponding to each of its elements in the dict.
Now, the dict has values consisting of lists in which each two elements are a "pair".

This is what I mean:

>>> mylist = [[15], [14, 15], [19, 20], [13], [3], [65, 19], [19, 20, 31]]
>>> 
>>> pairs = dict()
>>> 
>>> 
>>> from collections import defaultdict
>>> 
>>> pairs = defaultdict(list)
>>> 
>>> for idx, sub in enumerate(mylist):
...     for a in sub:
...         pairs[a].append(idx)
... 
>>> pairs
defaultdict(<type 'list'>, {65: [5], 3: [4], 13: [3], 14: [1], 15: [0, 1], 19: [2, 5, 6], 20: [2, 6], 31: [6]})

You can disregard the value lists with only 1 element (since that means we don't have a pair). Those with 2+ elements form a variety of pairs. For instance with 19 we have [2, 5, 6] meaning:

2 and 5 are a pair
2 and 6 are a pair
5 and 6 are a pair

as you have. This approach is linear in the total number of items within your nested lists. If you have a reasonable upper bound on the nested list lengths, then this is O(n) in the size of your master list.

Now for the output:

>>> from itertools import combinations
>>> 
>>> for k,v in pairs.iteritems():
...     if len(v) > 1:
...         for a,b in combinations(v, 2):
...             print "Index {} matched with index {} for value {}".format(a, b, k)
... 
Index 0 matched with index 1 for value 15
Index 2 matched with index 5 for value 19
Index 2 matched with index 6 for value 19
Index 5 matched with index 6 for value 19
Index 2 matched with index 6 for value 20

@Chrinna See the update, which includes how to produce the desired output. Does order matter here? — arshajii, Feb 22 '14 at 18:36

score 3 · Answer 2 · answered Feb 22 '14 at 18:23

You can use itertools.combinations, it saves you from the nested-loop:

In [770]: l2=[(i,set(mylist[i])) for i in range(len(mylist))]
     ...: for (i, a), (j, b) in combinations(l2,2):
     ...:     for v in a.intersection(b):
     ...:         print "Index {} matched with index {} for value {}".format(i,j,v)
Index 0 matched with index 1 for value 15
Index 2 matched with index 5 for value 19
Index 2 matched with index 6 for value 19
Index 2 matched with index 6 for value 20
Index 5 matched with index 6 for value 19

score 2 · Accepted Answer · answered Feb 22 '14 at 18:14

2

mylist = [[15], [14, 15], [19, 20], [13], [3], [65, 19], [19, 20, 31]]
my_set_list = [set(item) for item in mylist]
for i1, item in enumerate(my_set_list):
    for i2 in xrange(i1 + 1, len(my_set_list)):
        for val in (item & my_set_list[i2]):
            print "Index {} matched with index {} for value {}".format(i1,i2,val)

Output

Index 0 matched with index 1 for value 15.
Index 2 matched with index 5 for value 19.
Index 2 matched with index 6 for value 19.
Index 2 matched with index 6 for value 20.
Index 5 matched with index 6 for value 19.

answered Feb 22 '14 at 18:14

thefourtheye

233,700
52
457
497

This will perform badly for large lists. – arshajii Feb 22 '14 at 18:23
@arshajii I think so too, but my list will only be at most 175 elements long. So I think this approach might be ok. – deadlock Feb 22 '14 at 18:25

Python: Compare elements in a list to each other

3 Answers3

Linked