generating conditional product of lists in python (combinatorics)

Question

I want to be able to generate a conditional product. So similar to this answer: All combinations of a list of lists

I wanted to use itertools.product(*listOfLists). However, my problem is that the inclusion of one element from one list means other lists must be consulted for the product.

Example:

colors = ['red', 'blue', 'green']
fruits = ['apple', 'orange', 'banana']
locations = ['indoors', 'outdoors']

indoor_choices = ['bathroom', 'bedroom', 'kitchen']
green_choices = ['forest', 'light', 'dark']

Here, we want to consider always every possible choice of color, fuit, and location. However, in the case of 'indoor', we also want to consider indoor_choices, and in the case of 'green' being in a possible choice, we also want to choose a more specific color of green. It's kind of a tree of possibilities where some branches keep branching and others do not.

So in this silly example above you could do a for loop like so:

for c in colors:
    for f in fruits:
        for l in locations:
            # etc

but then we encounter the problem of what happens when two different categories have possible branching based on this choice.

A simple (hacky) solution would be just to manually code conditions and put for loops inside of them:

for c in colors:
    for f in fruits:
        for l in locations:

            if c == 'green' and l == 'indoor':
                for gc in green_choices:
                     for ic in indoor_choices:
                         # output

            elif c == 'green':
                for gc in green_choices:
                    # output

            elif l == 'indoor':
                for gc in green_choices:
                    # output

            else:
                # output

but imagine the horror when there are N lists where M of them have additional branching. Or even worse, there is nested additional branching ... basically this hack doesn't scale.

Any ideas? This problem has proved itself to be deceptively hard!

First, you should make your code reproducible (`colors = ["red", "blue", "green"]` rather than `colors = [red, blue, green]`). Also, it sounds like you are using the wrong data structure for this problem. How is `indoors` linked to `indoor_choices`? Shouldn't it instead be a nested list? — David Robinson, Mar 01 '13 at 03:17
done. and no, not a nested list. it's still important to know that in a particular selection of that we chose 'green' and 'indoors', which then forced us to make the additional combinations of the choices for specific types of indoor and specific types of green. do you understand the problem as I have stated it? perhaps I wasn't the clearest... — lollercoaster, Mar 01 '13 at 03:24
If not a nested list, than it least `indoor_choices` and `green_choices` should be in a dictionary, indexed by the strings `indoor` and `green`. That would prevent you from having to have special cases that use different variables. — David Robinson, Mar 01 '13 at 03:26
not sure what you are suggesting? I want to reiterate this is a silly example, and that the concept is that I have a N lists. Of these N, a subset of them are always chosen from A = (colors, fruits, locations). Another disjoint subset are conditional, and those choices are only considered when particular elements of A are included. I realize that a dictionary is perhaps a nicer way to visualize the code, but my only goal is enumeration of these condition combinations. or do you have a suggestion for how dicts would enable such enumeration? — lollercoaster, Mar 01 '13 at 03:30

score 6 · Accepted Answer · edited Mar 01 '13 at 04:34

6

Here's how I'd do it, with a recursive generator.

def prod(terms, expansions):
    if not terms: # base case
        yield ()
        return

    t = terms[0] # take the first term

    for v in expansions[t]: # expand the term, to get values
        if v not in expansions: # can the value can be expanded?
            gen = prod(terms[1:], expansions) # if not, we do a basic recursion
        else:
            gen = prod(terms[1:] + [v], expansions) # if so, we add it to terms

        for p in gen: # now we get iterate over the results of the recursive call
            yield (v,) + p # and add our value to the start

Here's how you call it to generate the product you wanted in your example:

expansions = {
        'colors':['red', 'blue', 'green'],
        'fruits':['apple', 'orange', 'banana'],
        'locations':['indoors', 'outdoors'],
        'indoors':['bathroom', 'bedroom', 'kitchen'],
        'green':['forest', 'light', 'dark']
    }

terms = ["colors", "locations"] # fruits omitted, to reduce the number of lines

for p in prod(terms, expansions):
    print(p)

Output:

('red', 'indoors', 'bathroom')
('red', 'indoors', 'bedroom')
('red', 'indoors', 'kitchen')
('red', 'outdoors')
('blue', 'indoors', 'bathroom')
('blue', 'indoors', 'bedroom')
('blue', 'indoors', 'kitchen')
('blue', 'outdoors')
('green', 'indoors', 'forest', 'bathroom')
('green', 'indoors', 'forest', 'bedroom')
('green', 'indoors', 'forest', 'kitchen')
('green', 'indoors', 'light', 'bathroom')
('green', 'indoors', 'light', 'bedroom')
('green', 'indoors', 'light', 'kitchen')
('green', 'indoors', 'dark', 'bathroom')
('green', 'indoors', 'dark', 'bedroom')
('green', 'indoors', 'dark', 'kitchen')
('green', 'outdoors', 'forest')
('green', 'outdoors', 'light')
('green', 'outdoors', 'dark')

edited Mar 01 '13 at 04:34

minopret

4,726
21
34

answered Mar 01 '13 at 04:12

Blckknght

100,903
11
120
169

1

Nice. Make sure you don't give it any mutually expanding strings though :-) – minopret Mar 01 '13 at 04:33
1

Since it's a generator, an infinitely recursive set of expansions might not be problematic. You'd just need to use `itertools.islice` or something, to limit how much of it you consume. – Blckknght Mar 01 '13 at 04:35
@Blckknght: If you have an infinitely recursive set, it could break because the `prod`s will call each other before yielding a single value. – nneonneo Mar 01 '13 at 04:49
Anyway, it would be easy to check automatically, for example, that the dict of lists will list each key no more than once. – minopret Mar 01 '13 at 04:56
would this work for nested conditional choices? as in if we had a dictionary inside of another? – lollercoaster Mar 01 '13 at 05:05
@nneonneo: You're right, if the first value from the generator was infinitely long, you'd never get a result (or rather, you'd get an exception when the recursion limit was hit). However, if the cycle is caused by a value at the end of the list, you would get an infinite number of finite-length values yielded first. For instance, if your expansion dict was `{"spam":["eggs", "sausage", "spam"]}` and you requested just `"spam"` in the list of terms, you'd get `("eggs",), ("sausage",), ("spam", "eggs"), ("spam", "sausage"), ("spam", "spam", "eggs")` and then the vikings would start singing. – Blckknght Mar 01 '13 at 05:09
Yep (hence the "could"). In some sense, that's a feature, as you point out :) – nneonneo Mar 01 '13 at 05:10
@lollercoaster: No, you shouldn't nest dictionaries within dictionaries. It expects the `expansions` parameter to be a dict mapping between strings (aka "terms") and lists of strings ("values", which may also be terms themselves, if they appear as keys elsewhere). – Blckknght Mar 01 '13 at 05:11

minopret · Answer 2 · 2013-03-01T04:11:11.397

1

If your real problem is really just like your example, then you can analyze the combinations into just four products:

is_green = ['green']
not_green = ['red', 'blue']
is_indoors = ['indoors']
not_indoors = ['outdoors']

p1 = itertools.product([not_green, fruits, not_indoors])
...
p2 = itertools.product([is_green, fruits, not_indoors, green_choices])
...
p3 = itertools.product([not_green, fruits, is_indoors, indoor_choices])
...
p4 = itertools.product([is_green, fruits, is_indoors, green_choices, indoor_choices])

That's all!

Now if we want to generalize so we don't have to make four "special" cases, we can capture the relation between certain values and the additional choices that they open up, as @DavidRobinson suggested.

import itertools

colors = ['red', 'blue', 'green']
fruits = ['apple', 'orange', 'banana']
locations = ['indoors', 'outdoors']

indoor_choices = ('bathroom', 'bedroom', 'kitchen')
green_choices = ('forest', 'light', 'dark')

choices = [colors, fruits, locations]
more_choices = { 'indoors': indoor_choices, 'green': green_choices }
for p in itertools.product(*choices):
    m = [more_choices[k] for k in p if k in more_choices]
    for r in itertools.product([p],*m):
        print list(r[0]) + list(r[1:])

Please note that there will inevitably be difficulties when choices and more_choices are large.

edited Mar 01 '13 at 04:11

answered Mar 01 '13 at 03:23

minopret

4,726
21
34

sure. but that is the same as the code I have written - 4 cases within the for loops. what happens when I have N lists, with M more lists that are conditional on the inclusion of certain elements from lists in the N? the number of cases blows up. forgive me if I have misunderstood something very clever you have done. – lollercoaster Mar 01 '13 at 03:26
1

You were right about my first example, so I added a generalization according to @DavidRobinson's suggestion. (I'm sorry, I don't know whether I've linked his ID correctly.) – minopret Mar 01 '13 at 03:51
awesome! totally works. good thinking. also, quick question: I changed the indoor_choices and green_choices to lists instead of tuples, the performance went way down (much slower). why would that be? – lollercoaster Mar 01 '13 at 04:46
I'm not certain of the exact performance hit in this specific case. In general, unlike lists, tuples permit Python and its libraries to optimize on the assumption that the contents and size never change. – minopret Mar 01 '13 at 04:49
also - would this work for nested conditional choices? as in if we had a dictionary inside of another? – lollercoaster Mar 01 '13 at 05:05
No, but for that refinement you already have other answers, so I believe I've reached a point of diminishing returns :-) – minopret Mar 01 '13 at 05:12

score 1 · Answer 3 · answered Mar 01 '13 at 04:27

Here's a recursive implementation using yield. I think its not quite as neat as @Blckknght's solution, but it may be helpful.

colors = ["red","blue","green"]
fruits = ["apple","orange", "banana"]
locations = ["indoors","outdoors"]

green_subtypes = ["forest", "light", "dark"]
indoor_locations = ["bathroom","bedroom","kitchen"]

def gen(state):
  if len(state)==0:
    for c in colors:
       s = [c]
       for y in gen(s):
         yield y
  elif len(state)==1:
    for x in fruits:
      s = state + [x]
      for y in gen(s):
        yield y
  elif len(state)==2:
    for x in locations:
      s = state + [x]
      for y in gen(s):
        yield y
  else:
    # If we're green and we haven't looped through the green options already 
    # (the check is a bit dodgy and could do with being moved into a flag inside state)
    if state[0]=='green' and len(set(state).intersection(set(green_subtypes)))==0:
      for x in green_subtypes:
        s = state + [x]
        for y in gen(s):
          yield y
    # If we're indoors and we haven't looped through the indoor options already 
    # (the check is a bit dodgy and could do with being moved into a flag inside state)
    elif state[2]=='indoors' and len(set(state).intersection(set(indoor_locations)))==0:
      for x in indoor_locations:
        s = state + [x]
        for y in gen(s):
          yield y
    else:
      yield state

for x in gen([]):
  print(x)

score 1 · Answer 4 · answered Mar 01 '13 at 04:40

We can add the "extra" choices post-hoc (Python 3 syntax):

def choice_product(choices, *iterables):
    for v in itertools.product(*iterables):
        ks = set(v) & choices.keys()
        if ks:
            choice_iters = [choices[k] for k in ks]
            for p in choice_product(choices, *choice_iters):
                yield v + p
        else:
            yield v

This uses itertools.product for efficiency.

Define choices as

choices = {'indoors' : ['bathroom', 'bedroom', 'kitchen'],
           'green': ['forest', 'light', 'dark']}

This recurses:

>>> for i in choice_product({'c': 'de', 'e': 'fg'}, 'ab', 'cd'):
...     print(i)
... 
('a', 'c', 'd')
('a', 'c', 'e', 'f')
('a', 'c', 'e', 'g')
('a', 'd')
('b', 'c', 'd')
('b', 'c', 'e', 'f')
('b', 'c', 'e', 'g')
('b', 'd')

generating conditional product of lists in python (combinatorics)

4 Answers4