6

I have the following code snippet which i needed to (massively) speed up. As is, it's hugely inefficient.

possible_combos.append([comb for comb in
    itertools.combinations_with_replacement(input_list, target_number_of_packages)
        if np.sum([j[1] for j in comb])==target_number_of_packages])

Disassembled:

possible_combos 

is the output

input_list

is a list of tuples in the form ([...],number_of_packages)

target_number_of_packages

is the number of packages I need to reach. I can combine as many elements of the list "input_list" as I want (repetitions are allowed), but need to reach exactly target_number_of_packages when adding their "number_of_packages", else the combination is not valid.

I was thinking about something like

possible_combos.append([comb for comb in
    itertools.combinations_with_replacement(input_list, lambda x:x=...)))

but wasn't able to fill the blank.

My question is, is it at all possible to use lambda this way? I don't need an answer for how to deal with this specific usecase because I've solved it differently (with itertools product an a recursive generator function), but I'd still like to know, would there have been a solution?

In short: Is it possible to use lambda to change a value inside another function on the fly?

Minimal example for the problem:

input_list=[1,2,3,4,5,6] #in minmal form

target_number_of_packages=4

possible_combos should be [[1,1,1,1],[2,1,1],[2,2],[3,1],[4]]

And I'm looking for something roughly equivalent to, but faster than,

possible_combos=[comb for comb in
    itertools.combinations_with_replacement(input_list) if np.sum(comb)==target_number_of_packages]

Just with the np.sum(comb)==target put in the itertools.combinations_with_replacement - if possible at all.

I've changed the question because I solved the underlying problem differently, but part of it is still something I'd like to know about. As there where no answers, I think an edit is apropriate.

Community
  • 1
  • 1
DonQuiKong
  • 413
  • 4
  • 15
  • 2
    The question isn't clear to me, can you provide a minimal, reproducible example with sample inputs and desired outputs? – Chris_Rands Oct 05 '18 at 13:21
  • 2
    This might be an X-Y problem but I can't tell yet – Chris_Rands Oct 05 '18 at 13:22
  • A lambda, like any function, can cause a value to be *mutated* when it is called. But no function in Python has the power to *rebind* a name in another function's scope. – Daniel Pryden Oct 05 '18 at 13:26
  • Not really. Lambda function is just an anonymous function. See: https://stackoverflow.com/questions/16501/what-is-a-lambda-function – teroi Oct 05 '18 at 13:26
  • @Chris_Rands I've tried. – DonQuiKong Oct 05 '18 at 13:30
  • 1
    If I read the code correctly, it looks like you're trying to solve the [Knapsack problem](https://en.wikipedia.org/wiki/Knapsack_problem). Is that correct? If so, it is known to be an NP-hard problem and an approximate solution is the best you can get in polynomial time. – Daniel Pryden Oct 05 '18 at 13:34
  • It's not quite clear what you're trying to do, but yes you can use a lambda to introduce state by using a closure. – Tasos Papastylianou Oct 05 '18 at 13:34
  • @DanielPryden I'm okay with brute-forcing it, but I'd like to do that in a smart way, e.g. no going down every branch of the possibilities tree if I already know that branch is over the limit. I wrote a recursive function for my underlying problem and it's pretty fast now, but I was wondering if a lambda would have helped for that specific line of code when I was at it. – DonQuiKong Oct 05 '18 at 13:37
  • 2
    @DonQuiKong: A lambda function, or any other function object, might have helped in building a state-holding mechanism and/or a lazy evaluation mechanism. But the underlying approach you're describing is called [Dynamic Programming (DP)](https://enwikipedia.org/wiki/Dynamic_programming), and to the best of my knowledge none of the combinatorical functions in the `itertools` module has built-in support for DP. – Daniel Pryden Oct 05 '18 at 13:43
  • `lambda` is just **syntax**, allowing you to create a function object in an expression (`def functionname(..): ...` is a *statement*, and you can't use statements inside an expression). So a lambda just creates a function object, and is nothing special. You can't use functions to alter the local namespace of another function at runtime, no. You are asking the wrong question here, in my opinion. – Martijn Pieters Oct 07 '18 at 10:57
  • @MartijnPieters but isn't that what I'm doing for example when using list.sort(key=lambda x:x[0]) ? Changing the sort parameter to depend on the input? I'm still having trouble grasping why that's different. – DonQuiKong Oct 07 '18 at 11:00
  • 1
    @DonQuiKong: that doesn't change values in another function, on the fly or otherwise. The `list.sort()` function explicitly supports calling a function you pass in, it is documented a such. Python's sort implementation uses the function to [affect a Swartzian transform](https://en.wikipedia.org/wiki/Schwartzian_transform) using that function. – Martijn Pieters Oct 07 '18 at 11:32
  • @DonQuiKong: and then the answer is: no, a callback function doesn't help improve this specific situation. – Martijn Pieters Oct 07 '18 at 11:34
  • @MartijnPieters oh I see, so it only works with sort because the sort implementation calls the function if given one. Thanks, that solved my question. Do you want to make it into an answer? – DonQuiKong Oct 07 '18 at 11:40

3 Answers3

4

lambda is just syntax, allowing you to create a function object in an expression (def functionname(..): ... is a statement, and you can't use statements inside an expression). So a lambda just creates a function object, and is nothing special. You can't use functions to alter the local namespace of another function at runtime, no.

From comments it is appears you are asking about how you could use a callback to perhaps solve your problem, but I don't think you fully understand your problem space yet nor how things like list.sort(key=...) work. Python's sort implementation explicitly calls the key callback, by choice, the function called is passed information and the sort function alters behaviour based on what is returned; the key function doesn't get to choose what happens with the returned value.

You are asking the wrong question here, in my opinion.

The problem you are trying to solve a subset of the Knapsack Problem; you have the unbounded variant, as I can combine as many elements of the list "input_list" as I want (repetitions are allowed).

Trying to use itertools to solve it is the wrong approach; itertools functions will generate to many incorrect combinations. You are essentially generating all combinations with repetition (multisets) for output sizes 1 through target size, so you can calculate the number of such multisets for each given size, and summing them:

def multiset_size(n, k):
    return int(factorial(k + n - 1) / (factorial(k) * factorial(n - 1)))

generated = sum(multiset_size(len(input_list), i + 1) for i in range(target_number_of_packages))

For your toy example, with 6 inputs and a target size of 4, you are generating 209 different combinations, but there are only 5 viable combinations. That's a whopping 40.8 combinations wasted per viable output! With larger input sizes this ratio is only going to get (much) worse.

The full knapsack problem is commonly solved using a dynamic programming approach. The programming chrestomathy site Rossettacode.org has a great example in Python on how to use DP for an unbounded knapsack solution. Basically, you keep a table of 'sacks' for every level of capacity (from zero to max) and keep that table updated as you see if adding the current item to sacks that have space for the item produces a better (more valuable) sack than the best combination for that sack, so far.

But when producing all possible combinations, you are better of with your own iterative or recursive approach. There is no ready-made itertools function you can use here, nor would using a callback function make this any easier.

Taking a leaf out of the DP solution, an iterative solution would use a series piles of sacks for each possible capacity total, and copy these up to the next pile as you add more such items to each sack that has space for them:

from itertools import repeat

def unbounded_knapsack_combos(input_list, target):
    # collect piles of filled sacks, here each pile contains
    # all sacks of the same capacity.
    # A sacks consist of counts for each item in input_list, so
    # sack[0] counts how many input_list[0] items were used.
    # piles start empty, except for pile 0 (the empty sack pile)
    piles = [[] for _ in range(0, target)]
    piles[0] = [[0] * len(input_list)]

    # process from smallest to biggest so we can avoid some work, like
    # adding an item of size target - 1 on a pile that will never combine
    # with larger items
    size_idx = [(s, i) for i, (_, s) in enumerate(input_list) if s <= target]
    for size, i in sorted(size_idx):
        for s in range(size, target + 1 - size):
            for sack in piles[s - size]:
                new_sack = sack[:]
                new_sack[i] += 1
                piles[s].append(new_sack)

        # Yield knapsacks that can be completed
        for sack in piles[target - size]:
            new_sack = sack[:]
            new_sack[i] += 1
            # turn the item counts in each sack in the target pile back into
            # items from the input list
            yield [item for i, c in enumerate(new_sack) if c
                   for item in repeat(input_list[i], c)]

        # remove piles that we no longer need; any pile that won't
        # fit larger items (multiple items can have the same size)
        del piles[target - size + 1:]

This solution happens to use itertools.repeat(), but only because it is convenient to produce the final output from the counters in the sacks.

For your toy example, using the same (value, size) format typical of the knapsack problem, this produces:

>>> input_list = [(0, 1), (0, 2), (0, 3), (0, 4), (0, 5), (0, 6)]
>>> target_size = 4
>>> solution = unbounded_knapsack_combos(input_list, target_size)
>>> [[s for _, s in sack] for sack in solution]
[[1, 1, 1, 1], [1, 1, 2], [2, 2], [1, 3], [4]]

This only ever generates the actual viable combinations, and nothing more.

The function differs from the full solution in that here all possible combinations are kept, rather than only the one combination with the best value for the items combined. And because we are generating all combinations, there is no need to sort the items by their value ratio, as the linked DP approach does (the sorting helps avoid copying too many less optimal solutions in the loop).

The recursive version schwobaseggl produced essentially does the same thing; create sacks for a given capacity, and the recursive calls add more items for the next larger capacity. Mine just happens to be iterative so you won't run into Python's recursion limits, and it yields results as it finds them (like itertools would). The recursive version also is forced to repeatedly concatenate lists (a O(N^2) operation for N == recursion depth), so the iterative approach is a lot faster.

Martijn Pieters
  • 1,048,767
  • 296
  • 4,058
  • 3,343
1

Generating all possible combinations and filtering out the ones with the matching sum does way too much work. But you can write your own function that generates exactly and only the ones you need:

def combos(lst, total, max=None):
    if total < 0:
        return
    elif total == 0:
        yield []
    for x in lst:
        if max is None or x <= max:
            for com in combos(lst, total - x, x):
                yield [x] + com

>>> list(combos([1, 2, 3, 4, 5, 6], 4))
[[1, 1, 1, 1], [2, 1, 1], [2, 2], [3, 1], [4]]
user2390182
  • 72,016
  • 6
  • 67
  • 89
  • That's pretty much what I did in the end. But I am still wondering if I could have solved it with itertools and a lambda or sth. similar. – DonQuiKong Oct 05 '18 at 13:54
  • This actually produces too many combinations, when sizes are not unique. The real input can easily have sizes repeated (as items are `(value, size)` tuples, items with a different value can still have the same size). For example, for `combos([1, 1, 2], 2)` you produce *4* `[1, 1]` pairs, where you have `[first 1, first 1]`, `[first 1, second 1]`, `[second 1, first 1]` and `[second 1, second 1]`. The combinations `[first 1, second 1]` and `[second 1, first 1]` are the exact same solution in a knapsack solution (it doesn't matter what order you pick items). – Martijn Pieters Oct 07 '18 at 21:37
  • I was wondering why the recursive approach was much, *much* slower than I expected it to be. – Martijn Pieters Oct 07 '18 at 21:39
  • @MartijnPieters good point. I'm packing boxes though so in my case order does matter. So this answer has it's merits for many real cases. – DonQuiKong Oct 08 '18 at 06:29
  • @MartijnPieters Having duplicate elements in the input seems inconsistent with allowing repeated usage of the same element. It can also easily be remedied by removing dupes first. – user2390182 Oct 08 '18 at 06:40
  • @schwobaseggl they are not duplicates. In knapsack problems items have two properties: value and a size or weight. For the unbounded variant the *combination* must be unique, but both value and size are individually do not need to be. Here we looking at just the sizes, not optimising for most value. – Martijn Pieters Oct 08 '18 at 06:44
  • @schwobaseggl and you can’t just remove duplicates, either. Generating all possible ways to fill the knapsack includes using combinations of the distinct items that happen to have the same size. – Martijn Pieters Oct 08 '18 at 06:46
  • @DonQuiKong why does order matter when packing boxes? You have a single size dimension here, packing the item of size 3 before the item of size 1 is no different from packing them in the reversed order. – Martijn Pieters Oct 08 '18 at 06:47
  • @MartijnPieters I left the other parts out of the example because they don't matter, but actually what I'm doing here is adding every possible way of packing e.g. 7 boxes of size (x,y,z) in any order ((z,y,x),(y,z,x),...) and any combination (6 stacked in x direction, one in y or all 7 in y or 3in z and another 3 in z but next to them in y + 1 in any other direction, etc.). So the combinations of 1 and 1 are actually not the same ones. It would be for example 1 oriented in xy and 1 in xz. (So what you called size isn't size but number of boxes in my example and those boxes have an orientation) – DonQuiKong Oct 08 '18 at 06:58
  • @DonQuiKong sure, but what you have posted *here*, with your description as it stands, we have a unbounded knapsack problem. :-) – Martijn Pieters Oct 08 '18 at 07:43
  • @MartijnPieters yes yes, but the answer is still helpful and might be for others coming here. – DonQuiKong Oct 08 '18 at 08:20
  • @DonQuiKong: I never said otherwise, and you may note that this answer has a positive score (and zero downvotes). But for anyone coming here looking for similar output, they need to understand that this answer and mine are going to produce different results the moment they use knapsack input with repeated sizes. This answer will produce more results than mine; for `[1, 1, 2]` this produces one more result, for larger input sets the differences are going to compound. – Martijn Pieters Oct 08 '18 at 08:45
  • @DonQuiKong: moreover, your inefficient approach using `combinations_with_replacement` (same as what vash_the_stampede used), will, after filtering, produce the same number of results as my approach does, because for `combinations_with_replacement` order also doesn't matter. So in that respect schwobaseggl's answer diverges quite sharply from what you stated is a correct but inefficient approach. – Martijn Pieters Oct 08 '18 at 08:52
0

Using itertools.combinations_with_replacement produces a similar list

from itertools import combinations_with_replacement

input_list = [1,2,3,4,5,6] 

l = [list(combination_with_replacement(input_list, i)) for i in range(5)]
res = [list(filter(lambda x: sum(x) == 4, i)) for i in l]
# [[], [(4,)], [(1, 3), (2, 2)], [(1, 1, 2)], [(1, 1, 1, 1)]]
vash_the_stampede
  • 4,590
  • 1
  • 8
  • 20
  • That's almost what the first line of code in the question does, I was hoping to get around creating the whole list by giving combinations_... a lambda. – DonQuiKong Oct 07 '18 at 07:34
  • Oops sorry then will remove @donquikong – vash_the_stampede Oct 07 '18 at 07:42
  • your solution is better than what I had (I had missed the for i in range(5) when asking) so it might still help someone coming here. It's just not what I'm looking for. – DonQuiKong Oct 07 '18 at 10:34
  • This still produces a large number of combinations that are then again discarded. There is *no need* to generate that many combinations. You generated ***210*** combinations to filter down to 5 viable solutions. – Martijn Pieters Oct 07 '18 at 12:40
  • @MartijnPieters correct, should post be taken down? – vash_the_stampede Oct 07 '18 at 14:38
  • That’s up to you; you posted an answer to the question, that it is an inefficient answer doesn’t make it subject to deletion. – Martijn Pieters Oct 07 '18 at 17:18