11

I'm creating a list with itertools from a list of ranges, so far I have this:

start_list = [xrange(0,201,1),xrange(0,201,2),xrange(0,201,5),xrange(0,201,10),xrange(0,201,20),xrange(0,201,50),xrange(0,201,100),xrange(0,201,200)]

Now, I know that if I were to try to run this next line it will kill my python interpreter:

next_list = list(itertools.product(*start_list))

What I'm wondering is would it be possible to put in an argument that checks each tuple, for a sum of its items and only puts them in next_list if equal to a certain amount?

Maybe something like:

next_list = list(itertools.product(*start_list,sum(tuples)=200))

I know this isn't right and I might need to start to re-think the way I'm going about this. Will start_list's ranges in the generator be too many to get through to build another list?

tijko
  • 7,599
  • 11
  • 44
  • 64
  • 4
    If you're attempting to figure out how to partition the integer 200 into 8 terms drawn from different sets, there are easier ways to compute next_list. If I'm counting right your Cartesian product has 5768123130 distinct items to be iterated over, which will take quite a while. – DSM Jun 12 '12 at 01:47
  • Hi DSM, thank you for responding. I'll be looking into creating a more efficient method. – tijko Jun 12 '12 at 01:50
  • related: http://stackoverflow.com/questions/1106929/find-all-combinations-of-coins-when-given-some-dollar-value – jfs Jun 12 '12 at 02:27
  • J.F. Sebastian, thanks I had seen links for discussions of this and other similar problems(i had read the wiki on knapsack). I didn't want to view others solutions before seeing if I could get an efficient one myself. Now, the code I ran was nowhere close to the minute rule and was ever so brute'ish, I got the correct result, I'm still hesitant to look before I find an optimal way to go about this. – tijko Jun 12 '12 at 02:55
  • @tijko: sorry I'm late to the party; please consider my answer! – Hugh Bothwell Jun 12 '12 at 23:41

3 Answers3

20

Better to just use a list comprehension

new_list = [item for item in itertools.product(*start_list) if sum(item) == 200]
Delimitry
  • 2,987
  • 4
  • 30
  • 39
John La Rooy
  • 295,403
  • 53
  • 369
  • 502
  • Hey gnibbler, thanks for responding. I had tried this `for i in list(itertools.product(*start_list)): if sum(i) == 200: new_list.append(i)` Which also crashed the interpreter. Now even though I need to find a different way to work out this problem, I learned from your comment thanks! – tijko Jun 12 '12 at 02:38
2
Solution      Runtime           Fn calls           Lines of Code
--------   ----------   ------------------------   -------------
gnibbler   2942.627 s   1473155845 (1.5 billion)          1
me_A         16.639 s     28770812 ( 29 million)          5
me_B          0.452 s       774005 ( .8 million)         43

Solution me_A:

import itertools

def good_combos(basis, addto):
    good_sums = set(addto-b for b in basis[0])
    return ([addto-sum(items)]+list(items) for items in itertools.product(*basis[1:]) if sum(items) in good_sums)

next_list = list(good_combos(start_list, 200))

Do note that this can be much faster if you pass it the longest list first.

This solution replaces one level of iteration with a set lookup; with your longest list containing 200 items, it should not be surprising that this is almost 200 times faster.


Solution me_B:

import itertools
from bisect import bisect_left, bisect_right

def good_combos(addto=0, *args):
    """
    Generate all combinations of items from a list of lists,
    taking one item from each list, such that sum(items) == addto.

    Items will only be used if they are in 0..addto

    For speed, try to arrange the lists in ascending order by length.
    """
    if len(args) == 0:                          # no lists passed?
        return []

    args = [sorted(set(arg)) for arg in args]   # remove duplicate items and sort lists in ascending order
    args = do_min_max(args, addto)              # use minmax checking to further cull lists

    if any(len(arg)==0 for arg in args):        # at least one list no longer has any valid items?
        return []

    lastarg = set(args[-1])
    return gen_good_combos(args, lastarg, addto)

def do_min_max(args, addto, no_negatives=True):
    """
    Given
      args          a list of sorted lists of integers
      addto         target value to be found as the sum of one item from each list
      no_negatives  if True, restrict values to 0..addto

    Successively apply min/max analysis to prune the possible values in each list

    Returns the reduced lists
    """
    minsum = sum(arg[0] for arg in args)
    maxsum = sum(arg[-1] for arg in args)

    dirty = True
    while dirty:
        dirty = False
        for i,arg in enumerate(args):
            # find lowest allowable value for this arg
            minval = addto - maxsum + arg[-1]
            if no_negatives and minval < 0: minval = 0
            oldmin = arg[0]
            # find highest allowable value for this arg
            maxval = addto - minsum + arg[0]
            if no_negatives and maxval > addto: maxval = addto
            oldmax = arg[-1]

            if minval > oldmin or maxval < oldmax:
                # prune the arg
                args[i] = arg = arg[bisect_left(arg,minval):bisect_right(arg,maxval)]
                minsum += arg[0] - oldmin
                maxsum += arg[-1] - oldmax
                dirty = True
    return args

def gen_good_combos(args, lastarg, addto):
    if len(args) > 1:
        vals,args = args[0],args[1:]
        minval = addto - sum(arg[-1] for arg in args)
        maxval = addto - sum(arg[0] for arg in args)
        for v in vals[bisect_left(vals,minval):bisect_right(vals,maxval)]:
            for post in gen_good_combos(args, lastarg, addto-v):
                yield [v]+post
    else:
        if addto in lastarg:
            yield [addto]

basis = reversed(start_list)  # more efficient to put longer params at end
new_list = list(good_combos(200, *basis))

do_min_max() really can't accomplish anything on your data set - each list contains both 0 and addto, depriving it of any leverage - however on more general basis data it can greatly reduce the problem size.

The savings here are found in successively reducing the number of items considered at each level of iteration (tree pruning).

Hugh Bothwell
  • 55,315
  • 8
  • 84
  • 99
  • If your code took 50 mins to write, I'd still win :). Seriously my answer was just address the original problem, not the 1 minute rule – John La Rooy Jun 13 '12 at 00:47
  • 1
    @gnibbler: more like 3 hours of playing around with the long version before the short version occurred to me. – Hugh Bothwell Jun 13 '12 at 01:01
1

Use this:

next_list = list(item for item in itertools.product(*start_list) if sum(item) == 200)

Somnath Muluk
  • 55,015
  • 38
  • 216
  • 226
Joel Cornett
  • 24,192
  • 9
  • 66
  • 88