0

I have two nested loops, the outer one using an infinite generator, and the inner a finite iterator (a list of numbers):

for j in itertools.count():
   for q in Q:
       ...

Under a given condition, I need to break out of both loops, which makes the code inelegant (I have to set a flag to trigger the second break). To avoid that, I thought I'd combine the two loops into one using

for j, q in itertools.product(itertools.count(), Q):
    ...

But when I did that, my computer got slower and slower, apparently swapping memory, until I had to kill the Python process.

The documentation of product says

the actual implementation does not build up intermediate results in memory

so I assumed the result of product to be a generator, which creates elements of the product on demand. But the slowing down and swapping speaks against that.

What did I wrong? How can I achieve what I wanted, a single loop combining an infinite generator with a list?

A. Donda
  • 8,381
  • 2
  • 20
  • 49
  • Can you put the code in a function so that instead of having to set a flag to break out of two levels of loop, you simply return from the function? – alani Jul 18 '20 at 05:30
  • @alaniwi I could, but it would be really unnatural. These two loops are part of a larger algorithm that doesn't decompose into separate parts; lots of shared state. – A. Donda Jul 18 '20 at 05:32
  • 1
    Fair enough. Pity that python has no equivalent of perl's named loops - https://perldoc.perl.org/functions/last.html – alani Jul 18 '20 at 05:37
  • There is a very interesting solution at https://stackoverflow.com/questions/8419796/naming-loops-in-python involving raising `StopIteration` and catching it at the level that you want. (I guess that any exception will do, but in terms of readability, it is the obvious one to use.) – alani Jul 18 '20 at 05:42
  • @alaniwi, it is interesting, thanks! But Iains solution works well for now. – A. Donda Jul 18 '20 at 05:49

3 Answers3

1

You can create a generator using a comprehension. Comprehensions support iterating over multiple iterables at once using multiple fors

for j, q in ((j, q) for j in itertools.count() for q in Q):
    ...
Iain Shelvington
  • 31,030
  • 3
  • 31
  • 50
  • @user2357112supportsMonica what performance penalty would that be? – Iain Shelvington Jul 18 '20 at 05:46
  • Works great, thanks! I still wonder why `product` doesn't do the equivalent though. – A. Donda Jul 18 '20 at 05:46
  • @IainShelvington: Mostly the expense of suspending and resuming the generator over and over. Packing and unpacking tuples also has a cost, as does fiddling with `j` on iterations where you don't need to. – user2357112 Jul 18 '20 at 05:49
1

Your documentation quote comes from the following context:

This function is roughly equivalent to the following code, except that the actual implementation does not build up intermediate results in memory:

def product(*args, repeat=1):
    # product('ABCD', 'xy') --> Ax Ay Bx By Cx Cy Dx Dy
    # product(range(2), repeat=3) --> 000 001 010 011 100 101 110 111
    pools = [tuple(pool) for pool in args] * repeat
    result = [[]]
    for pool in pools:
        result = [x+[y] for x in result for y in pool]
    for prod in result:
        yield tuple(prod)

The intermediate results are all the result lists. The actual itertools.product implementation doesn't need to build those. It does, however, build something analogous to pools.

itertools.product needs to be able to iterate over its inputs repeatedly, so it materializes all inputs into tuples. This requirement doesn't apply to the first input when repeat is 1, but the implementation materializes all inputs anyway, perhaps for consistency or ease of implementation. itertools.count() cannot be fully materialized, hence the results you saw.

One option you have is a generator expression. The following loop:

for j, q in ((j, q) for j in itertools.count() for q in Q):

will behave like you wanted itertools.product to behave, though it will be slower than plain nested loops (because of the overhead of the generator). The performance penalty is just a constant factor, and it doesn't slow down the loop body, so it's rarely a problem.

If you can factor out the nested loops into a function, you can also use return instead of break:

def helper_function(stuff):
    for j in itertools.count():
        for q in Q:
            ...
            if whatever:
                return
            ...
user2357112
  • 260,549
  • 28
  • 431
  • 505
1

Adding in case it is useful to anyone, and based on https://stackoverflow.com/a/8420092/13596037, exceptions can be used to break or continue an outer loop, for example:

class ContinueOuter(Exception): pass
class BreakOuter(Exception): pass

try:
    for i in range(10):  # loop we want to break or continue from
        try:
            for j in range(3):
                print(i, j)
                if (i, j) == (3, 1):
                    raise ContinueOuter
                elif (i, j) == (4, 0):
                    raise BreakOuter
        except ContinueOuter:
            pass
except BreakOuter:
    pass
alani
  • 12,573
  • 2
  • 13
  • 23