7

Suppose that in Python I have 3 lists: a, b, c of variable lengths. For example :

a=[1,2,3]
b=[4,5,6]
c=[7,8]

I would like to get every unique combination of TWO elements of the 3 lists above, i. e.

[1,4],[1,5],[1,6],[1,7],[1,8],[2,4],[2,5]... and NOT unique combinations of the 3 lists (such as [1,4,7],[1,4,8],...).

I have looked at the solution here using itertools that is perfectly fine for 2 lists ; however, this solution does not work anymore when including a nth list because the unique combinations are of length n.

Here is what I have tried:

import itertools

a=[1,2,3]
b=[4,5,6]
c=[7,8]

d=list(itertools.product(a,b,c))

[(1, 4, 7), (1, 4, 8), (1, 5, 7), (1, 5, 8), (1, 6, 7), (1, 6, 8), (2, 4, 7), (2, 4, 8), (2, 5, 7), (2, 5, 8), (2, 6, 7), (2, 6, 8), (3, 4, 7), (3, 4, 8), (3, 5, 7), (3, 5, 8), (3, 6, 7), (3, 6, 8)]

Note: Above is just an example and the solution should work for n lists of variable length and with possibly the same value being in different lists... Any idea of how I could do would be greatly appreciated! :)


EDIT: as asked by @SirParselot, the elements have to come from different lists

Community
  • 1
  • 1
tlorin
  • 1,100
  • 6
  • 17
  • 30
  • 2
    And what have you tried? – arodriguezdonaire Feb 08 '16 at 13:27
  • 2
    do the combinations have to be between different lists? – SirParselot Feb 08 '16 at 13:41
  • FWIW, my answer works for the original question, with any number of input lists. But what exactly do you mean by "redundant elements"? – PM 2Ring Feb 08 '16 at 13:58
  • @PM2Ring edit done! It seems indeed that your solution is working fine as long as I don't have the same value appearing twice (or more) in different lists :) – tlorin Feb 08 '16 at 14:03
  • How do you want to handle that situation? It's easy enough to kill an output tuple if both of its elements happen to be identical. Preventing duplicates from appearing in the output is a little harder, since it requires keeping track of what's been produced. – PM 2Ring Feb 08 '16 at 14:05
  • @PM2Ring in such case would `g=(p for t in it.combinations(nl,2) for p in it.product(*t)); output=set(g)` be enough to remove excess pairs? (I have reduced your `pairs` function to a generator) – Pynchia Feb 08 '16 at 14:10
  • @Pynchia I am not sure to get what you are saying (I'm sorry, I'm kind of new to Python!); however, this gives `NameError: name 'it' is not defined` :) – tlorin Feb 08 '16 at 14:18
  • @Pynchia: My `pairs` function is already a generator, thanks to its use of `yield` :) You've made it into a generator expression. But yes, using a set on the output is the cleanest way to do to get rid of duplicates. That kind of undermines the benefit of using a generator, but I guess it's ok if you're not producing a huge number of output tuples.. – PM 2Ring Feb 08 '16 at 14:19
  • 1
    @tlorin: Pynchia is doing `import itertools as it` at the top of the script. I'll add some code to my answer. – PM 2Ring Feb 08 '16 at 14:21
  • @PM2Ring yes, sorry, the word `expression` got stuck in my mind and did not make it through the keyboard. I agree, the set would prevent any benefit of using a generator. I love your solution BTW. – Pynchia Feb 08 '16 at 14:26
  • Perfect! Thanks a lot to all of you :D – tlorin Feb 08 '16 at 14:27
  • Thanks, @Pynchia! It doesn't totally prevent any benefit of using a generator. It's still better than creating a set from a list: it saves the RAM that would be used (temporarily) for the list, and (as I mention in my edit) any duplicates found get rejected as soon as they're generated. – PM 2Ring Feb 08 '16 at 14:33

2 Answers2

10

You want the Cartesian product of each pair of lists in (a, b, c), so first you need itertools.combinations to generate the pairs of lists, and then itertools.product to create the desired output tuples.

from itertools import combinations, product

def pairs(*lists):
    for t in combinations(lists, 2):
        for pair in product(*t):
            yield pair

a = [1, 2, 3]
b = [4, 5, 6]
c = [7, 8]

for pair in pairs(a, b, c):
    print(pair)

output

(1, 4)
(1, 5)
(1, 6)
(2, 4)
(2, 5)
(2, 6)
(3, 4)
(3, 5)
(3, 6)
(1, 7)
(1, 8)
(2, 7)
(2, 8)
(3, 7)
(3, 8)
(4, 7)
(4, 8)
(5, 7)
(5, 8)
(6, 7)
(6, 8)

Here's a new version that handles repeated elements. It does not return a tuple if the two items in the tuple equal each other, and it also eliminates duplicated tuples in the output by feeding the output from pairs into a set. This is reasonably efficient since pairs is a generator, so duplicates are eliminated as they are found.

from itertools import combinations, product

def pairs(*lists):
    for t in combinations(lists, 2):
        for pair in product(*t):
            #Don't output pairs containing duplicated elements 
            if pair[0] != pair[1]:
                yield pair

a = [1, 2, 3]
b = [3, 4, 5]
c = [5, 6]

#Feed the output of `pairs` into a set to eliminate duplicate tuples
output = set(pairs(a, b, c))
for pair in sorted(output):
    print(pair)

output

(1, 3)
(1, 4)
(1, 5)
(1, 6)
(2, 3)
(2, 4)
(2, 5)
(2, 6)
(3, 4)
(3, 5)
(3, 6)
(4, 5)
(4, 6)
(5, 6)
PM 2Ring
  • 54,345
  • 6
  • 82
  • 182
  • sorry to bother you again: what if I have a list of lists of an UNKNOWN size `n` (in your solution, size is 3 and we know it is 3 from the beginning)? If I have for instance a list of lists (with `n` lists) as a dictionary value, I would like to do something like `output = set(pairs(dict[value]))` but this seems not to work... For instance: `dict = {'value': [[2,1],[3,3]]};output = set(pairs(dict['value']));output` gives: `set([])`. I thought that this is related to your solution so I ask it here as a comment but I can make a new post if you prefer :) – tlorin Feb 09 '16 at 10:25
  • 1
    My `pairs` function will cope with _any_ number of input lists (or tuples), eg you can do `output = set(pairs(a,b,c,d,e))`, where `a`, `b`, etc, are simple lists or tuples. If you want to pass it a list of lists (or a tuple of tuples, etc) you need to use the `*` "splat" operator, which converts a list (or tuple) into separate arguments. Eg: `mylists=[[2,1],[3,3]];output = set(pairs(*mylists))`. Or using your example, `output = set(pairs(*dct['value']))`. (I changed the name of your dict to `dct`, because it's bad practice to use `dict`, `list`, `str`, etc as variable names). – PM 2Ring Feb 09 '16 at 10:55
  • I was missing the `*`! Very handy, I did not know this one! Well, thanks a lot again :) – tlorin Feb 09 '16 at 11:08
  • 1
    Not a problem! Splat is very handy. There's also double splat `**` that does a similar thing for dictionaries. See http://stackoverflow.com/q/36901/4014959 – PM 2Ring Feb 09 '16 at 11:20
0

I guess what you should do is use your working solution for two lists to do it with n lists. Basically, you could transform your input into a list of lists, and do:

for index, left_list in enumerate(list_of_lists):
        other_lists = (list_of_lists[0:index] + list_of_lists[index+1:])
        right_list = [i for sublist in other_lists for i in sublist]
        print(list(itertools.product(left_list, right_list)))
DainDwarf
  • 1,651
  • 10
  • 19