1

Experienced developer learning Python.

I'm iterating through the combinations k at a time from a list of size n. I've been using

from itertools import chain, combinations
for subset in (combinations(range(n), k)) : 
    doSomethingWith(subset)

Now the problem is that most of the time my doSomethingWith()'s are not productive and I'd like to skip as many of them as I can. And if doSomthingWith() fails for a given subset, I can skip every subset whose rightmost coordinate is larger. In other words if it fails for (1,3,5,8) then the next subset I want to look at is (1,3,6,0), skipping (1,3,5,9), (1,3,5,10), etc.

I realized I need to get explicit control of the loop indices. I need a variable number of nested for loops, using recursion or iteration. Before coding that up I Googled around and found this thread which looks promising.

So now I have:

from itertools import product, repeat
set  = range(n)
sets = repeat(set, k)

for subset in list(product(*sets)) :
    doSomethingWith(subset)

Beautifully Pythonic but I still have the same problem. I have no way to tell product() when to break out of the innermost loop. What I really want is to be able to pass a callback function into product() so it can execute and optionally break out of the innermost loop.

Any Pythonic suggestions? I'd hate to have to explicitly code variable nested loops or iterate through the return from product() and examine the subsets by hand. That seems so old school :-)

Community
  • 1
  • 1
user4894
  • 111
  • 2
  • 2
    Don't do `for subset in list(product(*sets)):`... leave out `list`, that forces you to materialize the whole `product` into memory and avoids the memory efficiency of lazily iterating over `product`. Since you are not keeping it around, it's just pointlessly inefficient. – juanpa.arrivillaga Feb 13 '17 at 06:19
  • If you use itertools things like `product` or `combinations`, they're always going to generate all the combinations. You can't make it do an internal `break`. *You* can skip combinations by storing the "failed" cases, checking the last element, and using `continue` to move to the next one, but you'll still be getting every element. – BrenBarn Feb 13 '17 at 06:21
  • I still don't understand how using `product` helps at all. – juanpa.arrivillaga Feb 13 '17 at 06:30
  • @juanpa.arrivillaga Yes thanks that makes sense (about not using list()). Also I see I can't do what I want. Every solution involves generating all the subsets and keeping track of them myself. In which case explicitly coding nested loops at least saves all those iterations. – user4894 Feb 13 '17 at 06:30
  • 1
    It might be possible to write a custom generator that could be signalled by `send`-ing in a value, but it could be kind of awkward to use. – BrenBarn Feb 13 '17 at 06:31

2 Answers2

1

This is an intriguing problem. I concocted a generator that can be used to achieve something like what you want. This is more of a prototype than a full solution. It may need to be tweaked to be really useful, and there may be edge cases I haven't considered. (For one thing, right now it won't work properly on arbitrary iterables that might be exhaused, only on re-iterables like lists.)

class SkipUp(Exception):
    def __init__(self, numSkips):
        self.numSkips = numSkips
        super(SkipUp, self).__init__(numSkips)

def breakableProduct(*sets):
    if not sets:
        yield []
        return
    first, rest = sets[0], sets[1:]
    for item in first:
        subProd = breakableProduct(*rest)
        for items in subProd:
            try:
                yield [item] + items
            except SkipUp as e:
                if e.numSkips == 0:
                    yield None
                    break
                else:
                    e.numSkips -= 1
                    yield subProd.throw(e)

You can use breakableProduct more or less like the normal itertools.product:

>>> prod = breakableProduct([1, 2, 3], [11, 22, 33], [111, 222, 333])
... for x, y, z in prod:
...     print(x, y, z)
1 11 111
1 11 222
1 11 333
1 22 111
1 22 222
# ...etc...
3 33 222
3 33 333

However, after getting a value from it, you can if you wish use .throw to pass in a SkipUp exception whose argument is the index of the set whose next element you want to skip to. That is, if you want to skip over all elements of the third set and jump to the next element of the second set, use SkipUp(1) (1 is the second set because indexing is 0-based):

>>> prod = breakableProduct([1, 2, 3], [11, 22, 33], [111, 222, 333])
... for x, y, z in prod:
...     print(x, y, z)
...     if z == 222:
...         prod.throw(SkipUp(1))
1 11 111
1 11 222
1 22 111
1 22 222
1 33 111
1 33 222
2 11 111
2 11 222
2 22 111
2 22 222
2 33 111
2 33 222
3 11 111
3 11 222
3 22 111
3 22 222
3 33 111
3 33 222

See how this aborts the innermost iteration after 222, instead jumping forward to the next iteration of the middle generator. If you want to jump out all the way to the outermost iteration:

>>> prod = breakableProduct([1, 2, 3], [11, 22, 33], [111, 222, 333])
... for x, y, z in prod:
...     print(x, y, z)
...     if z == 222:
...         prod.throw(SkipUp(0))
1 11 111
1 11 222
2 11 111
2 11 222
3 11 111
3 11 222

So SkipUp(0) jumps out to the next "tick" of the first iterator (i.e., the list [1, 2, 3]). Throwing in SkipUp(2) would have no effect, since it would just mean "skip to the next iteration of the innermost iterator", which is what the regular iteration would do anyway.

The advantage of this over a solution using something like product or combinations from itertools is that you can't stop those generators from generating every combination. So even if you want to skip over some elements, itertools will still do all the looping to generate all the ones you don't want, and you'll only be discarding them after they're already generated. This breakableProduct, on the other hand, actually exits the inner loops early if you tell it to, so it will completely skip unnecessary iterations.

BrenBarn
  • 242,874
  • 37
  • 412
  • 384
1

Here's a fairly basic iterative ordered combination maker that takes a callback function. It's not as versatile as BrenBarn's solution, but it does skip generating products as specified in the question: if the callback fails when passed arg seq, returning False (or something false-ish), then combo will skip generating those other subsequences that commence with seq[:-1].

def combo(n, k, callback):
    a = list(range(k))
    ok = callback(a)

    k -= 1
    while True:
        i = k
        if not ok: i -= 1
        while True:
            a[i] += 1
            if a[i] == (n + i - k):
                i -= 1
                if i < 0: 
                    return
            else:
                break 
        v = a[i] + 1  
        a[i+1:] = range(v, v + k - i) 
        ok = callback(a)

# test

n = 8
k = 4

def do_something(seq):
    s = sum(seq)
    ok = s <= seq[-2] * 3
    print(seq, s, ok)
    return ok

combo(n, k, do_something)

output

[0, 1, 2, 3] 6 True
[0, 1, 2, 4] 7 False
[0, 1, 3, 4] 8 True
[0, 1, 3, 5] 9 True
[0, 1, 3, 6] 10 False
[0, 1, 4, 5] 10 True
[0, 1, 4, 6] 11 True
[0, 1, 4, 7] 12 True
[0, 1, 5, 6] 12 True
[0, 1, 5, 7] 13 True
[0, 1, 6, 7] 14 True
[0, 2, 3, 4] 9 True
[0, 2, 3, 5] 10 False
[0, 2, 4, 5] 11 True
[0, 2, 4, 6] 12 True
[0, 2, 4, 7] 13 False
[0, 2, 5, 6] 13 True
[0, 2, 5, 7] 14 True
[0, 2, 6, 7] 15 True
[0, 3, 4, 5] 12 True
[0, 3, 4, 6] 13 False
[0, 3, 5, 6] 14 True
[0, 3, 5, 7] 15 True
[0, 3, 6, 7] 16 True
[0, 4, 5, 6] 15 True
[0, 4, 5, 7] 16 False
[0, 4, 6, 7] 17 True
[0, 5, 6, 7] 18 True
[1, 2, 3, 4] 10 False
[1, 2, 4, 5] 12 True
[1, 2, 4, 6] 13 False
[1, 2, 5, 6] 14 True
[1, 2, 5, 7] 15 True
[1, 2, 6, 7] 16 True
[1, 3, 4, 5] 13 False
[1, 3, 5, 6] 15 True
[1, 3, 5, 7] 16 False
[1, 3, 6, 7] 17 True
[1, 4, 5, 6] 16 False
[1, 4, 6, 7] 18 True
[1, 5, 6, 7] 19 False
[2, 3, 4, 5] 14 False
[2, 3, 5, 6] 16 False
[2, 3, 6, 7] 18 True
[2, 4, 5, 6] 17 False
[2, 4, 6, 7] 19 False
[2, 5, 6, 7] 20 False
[3, 4, 5, 6] 18 False
[3, 4, 6, 7] 20 False
[3, 5, 6, 7] 21 False
[4, 5, 6, 7] 22 False

If you comment out the if not ok: i -= 1 line it will produce all combinations.


This code can be easily modified to do larger skips. Instead of using a boolean return from the callback we get it to return the skip level wanted. If it returns zero then we don't do any skipping. If it returns 1, then we skip the subsequences that commence with seq[:-1] as in the version above. If the callback returns 2 then we skip the subsequences that commence with seq[:-2], etc.

def combo(n, k, callback):
    a = list(range(k))
    skips = callback(a)

    k -= 1
    while True:
        i = k - skips
        while True:
            a[i] += 1
            if a[i] == (n + i - k):
                i -= 1
                if i < 0: 
                    return
            else:
                break 
        v = a[i] + 1  
        a[i+1:] = range(v, v + k - i) 
        skips = callback(a)
PM 2Ring
  • 54,345
  • 6
  • 82
  • 182