
Disclaimer: I'm teaching myself Python, so there might be some trivial solution to each of my questions. Patience is appreciated!

I understand that the title is a little unclear, so I'll try to clarify with an example.

Suppose we have an array of transactions:

txArray=[[u'1'],[u'2'],[u'2', u'3']]

The goal is to write a function myIntersection(arrayOfLists) that first calculates the intersection of each possible pair of lists in txArray, and then takes the union.

So myIntersection(txArray) should return [u'2'], because:

int1=intersection([u'1'],[u'2'])=[]
int2=intersection([u'1'],[u'2', u'3'])=[]
int3=intersection([u'2'],[u'2', u'3'])=[u'2']

union=(int1 U int2 U int3)=[u'2']

What I've tried so far is below:

from itertools import combinations

'''
Pseudocode:
1) Generate all possible 2-combinations of the lists in txArray
2) Flatten the lists
3) If a value appears more than once in a 2-combination, add it to the list of intersections
4) Keep unique elements in list of intersections

'''

def myIntersection(arrayOfLists):
    flat_list=[]
    intersections=[]
    combs=list(combinations(txArray,2))
    for i in range(0, len(combs)):
        flat_list.append([item for sublist in combs[i] for item in sublist])
    for list in flat_list:
        for element in list:
            if list.count(element)>1:
                if element not in intersections:
                    intersections.append(element)
    return intersections

While it works in an interactive Python session, I keep getting errors with this approach when I save it as a Python file and run it.

My questions are:

1) Why doesn't it work when I run it as a Python file?

2) Is there a cleaner, more 'pythonic' way to do this (possibly with list comprehensions)?

3) I did think about using sets instead, but I couldn't figure out how to iteratively convert the lists in arrayOfLists (in the general case) to sets. Is there a simple syntax for doing that?

Thank you so much!

Emm Gee

2 Answers

You can use itertools.combinations to generate all possible combinations of length 2:

In [232]: from itertools import combinations

In [233]: list(combinations(txArray, 2))
Out[233]: [(['1'], ['2']), (['1'], ['2', '3']), (['2'], ['2', '3'])]

You can then convert both lists in each pair to sets and intersect them, which gives you a list of sets:

In [234]: intersections = [set(a).intersection(set(b)) for a, b in combinations(txArray, 2)]

In [235]: intersections
Out[235]: [set(), set(), {'2'}]

Finally, you can unpack the list of sets into set.union to merge them all into a single set:

In [236]: set.union(*intersections)
Out[236]: {'2'}

Also, note that unpacking each pair in the loop ([set(a).intersection(set(b)) for a, b in combinations(txArray, 2)]) is more readable than accessing the pair by index ([set(c[0]).intersection(set(c[1])) for c in combinations(txArray, 2)]), and it is usually slightly faster as well.
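As a rough sketch, the steps above could be wrapped in a function like the one below. The name pairwise_intersection_union and its parameter are made up for illustration. Two details are worth noting. The function only uses its own parameter, whereas the code in the question reads the global txArray inside the function; that raises a NameError if the file never defines txArray, which may be what goes wrong when it is run as a script. It also calls set().union(...) on an empty set rather than set.union(...), because the latter raises a TypeError when there are no pairs to unpack (fewer than two lists).

```python
from itertools import combinations


def pairwise_intersection_union(array_of_lists):
    """Return the union of the intersections of every pair of lists."""
    # Intersect each pair of lists; the result is a list of sets.
    intersections = [set(a) & set(b) for a, b in combinations(array_of_lists, 2)]
    # Starting from an empty set keeps this working when there are no pairs:
    # set().union() returns set(), while set.union() with no arguments raises TypeError.
    return set().union(*intersections)


tx_array = [[u'1'], [u'2'], [u'2', u'3']]
print(pairwise_intersection_union(tx_array))  # {'2'}
```

If a list is needed rather than a set, the result can be passed through list() or sorted().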

aydow
  • This was extremely helpful, thank you! While this worked, I chose the answer @tif provided, because it was a bit shorter and cleaner and more what I was looking for by 'pythonic'. Thank you so much! – Emm Gee Aug 06 '18 at 00:38
  • The answers are exactly the same. Mine just has an explanation – aydow Aug 06 '18 at 00:47

A "more pythonic" solution:

import itertools
txArray=[[u'1'],[u'2'],[u'2', u'3']]
# generate all possible pairs from txArray, and intersect them 
ix=[set(p[0]).intersection(p[1]) for p in itertools.combinations(txArray,2)]
# calculate the union of the list of sets
set.union(*ix)
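
On the third question (converting the lists to sets in the general case), one possible variant is to convert every list once up front with a list comprehension and then combine the resulting sets, so that no list is converted again for each pair it takes part in. This is only a sketch; the function name union_of_pairwise_intersections is hypothetical, and it returns a sorted list to match the [u'2'] form shown in the question.

```python
from itertools import combinations


def union_of_pairwise_intersections(array_of_lists):
    """Convert each list to a set once, then union all pairwise intersections."""
    # One conversion per list, however many pairs it appears in.
    sets = [set(lst) for lst in array_of_lists]
    common = set()
    for a, b in combinations(sets, 2):
        common |= a & b  # add the elements shared by this pair
    # Return a sorted list so the output order is deterministic.
    return sorted(common)


txArray = [[u'1'], [u'2'], [u'2', u'3']]
print(union_of_pairwise_intersections(txArray))  # ['2']
```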
tif