4

I'm currently looking for a way to compare elements of a list to one another going from left to right.

Here is my list:

mylist = [[15], [14, 15], [19, 20], [13], [3], [65, 19], [19, 20, 31]]

I need to compare the first element to all others and check if any of the values match, same thing for the second and third element and so on until it reaches the end of the list. If there is a match, I need to print it out. I need to print out the value that has match and the index that it matched at.

For example.

index 0 matched with index 1 for value 15.
index 2 matched with index 5 for value 19
index 2 matched with index 6 for value 19
index 2 matched with index 6 for value 20
index 5 matched with index 6 for value 19.

How can I go about doing this?

Martijn Pieters
  • 1,048,767
  • 296
  • 4,058
  • 3,343
deadlock
  • 7,048
  • 14
  • 67
  • 115
  • @kroolik, thanks for the quick reply. I was thinking maybe I should convert each element in the list to a set. And then compare each set to each other using intersect. If there is a matching value, then I'd print the match value(s) and the index at which it was matched at. What do you think of that approach? Am I far off? or close? Thanks for your help. – deadlock Feb 22 '14 at 18:11
  • @kroolik, perhaps this might help me? http://stackoverflow.com/questions/16603282/how-to-compare-each-item-in-a-list-with-the-rest-only-once – deadlock Feb 22 '14 at 18:13
  • You can try the @arshajii 's solution - very efficient, easy to implement, and pretty straightforward to understand. – Maciej Gol Feb 22 '14 at 18:21
  • @kroolik I'm going to try that out, also take a look at zhangxaochen 's solution. Looks very clean! – deadlock Feb 22 '14 at 18:27

3 Answers3

4

The naive solution is to loop over every pair, which is slow. However, you can do something along the lines of this:

  • Create a dict that will map ints (elements in your nested list) to lists containing the indices of the lists in your master.
  • Loop over the master list, and for each sublist, add its index to the spot corresponding to each of its elements in the dict.
  • Now, the dict has values consisting of lists in which each two elements are a "pair".

This is what I mean:

>>> mylist = [[15], [14, 15], [19, 20], [13], [3], [65, 19], [19, 20, 31]]
>>> 
>>> pairs = dict()
>>> 
>>> 
>>> from collections import defaultdict
>>> 
>>> pairs = defaultdict(list)
>>> 
>>> for idx, sub in enumerate(mylist):
...     for a in sub:
...         pairs[a].append(idx)
... 
>>> pairs
defaultdict(<type 'list'>, {65: [5], 3: [4], 13: [3], 14: [1], 15: [0, 1], 19: [2, 5, 6], 20: [2, 6], 31: [6]})

You can disregard the value lists with only 1 element (since that means we don't have a pair). Those with 2+ elements form a variety of pairs. For instance with 19 we have [2, 5, 6] meaning:

  • 2 and 5 are a pair
  • 2 and 6 are a pair
  • 5 and 6 are a pair

as you have. This approach is linear in the total number of items within your nested lists. If you have a reasonable upper bound on the nested list lengths, then this is O(n) in the size of your master list.

Now for the output:

>>> from itertools import combinations
>>> 
>>> for k,v in pairs.iteritems():
...     if len(v) > 1:
...         for a,b in combinations(v, 2):
...             print "Index {} matched with index {} for value {}".format(a, b, k)
... 
Index 0 matched with index 1 for value 15
Index 2 matched with index 5 for value 19
Index 2 matched with index 6 for value 19
Index 5 matched with index 6 for value 19
Index 2 matched with index 6 for value 20
arshajii
  • 127,459
  • 24
  • 238
  • 287
3

You can use itertools.combinations, it saves you from the nested-loop:

In [770]: l2=[(i,set(mylist[i])) for i in range(len(mylist))]
     ...: for (i, a), (j, b) in combinations(l2,2):
     ...:     for v in a.intersection(b):
     ...:         print "Index {} matched with index {} for value {}".format(i,j,v)
Index 0 matched with index 1 for value 15
Index 2 matched with index 5 for value 19
Index 2 matched with index 6 for value 19
Index 2 matched with index 6 for value 20
Index 5 matched with index 6 for value 19
zhangxaochen
  • 32,744
  • 15
  • 77
  • 108
2
mylist = [[15], [14, 15], [19, 20], [13], [3], [65, 19], [19, 20, 31]]
my_set_list = [set(item) for item in mylist]
for i1, item in enumerate(my_set_list):
    for i2 in xrange(i1 + 1, len(my_set_list)):
        for val in (item & my_set_list[i2]):
            print "Index {} matched with index {} for value {}".format(i1,i2,val)

Output

Index 0 matched with index 1 for value 15.
Index 2 matched with index 5 for value 19.
Index 2 matched with index 6 for value 19.
Index 2 matched with index 6 for value 20.
Index 5 matched with index 6 for value 19.
thefourtheye
  • 233,700
  • 52
  • 457
  • 497