4

I'm trying to get the product of 2 infinite generators but the product function in itertools doesn't allow this sort of behaviour.

Example behaviour:

from itertools import *
i = count(1)
j = count(1)
x = product(i, j)

[Killed]

What I want:

x = product(i, j)

((0,0), (0,1), (1,0), (1,1) ...)

It doesn't matter in which order the combinations get returned as long as given infinite time, all combinations will be eventually generated. This means that given a combination of elements, there must be a finite index in the returned generator with that combination.

muddyfish
  • 3,530
  • 30
  • 37
  • You probably can be intersted in `coconut-lang`. Look [here](http://coconut.readthedocs.io/en/master/HELP.html#case-study-4-vector-field) for an example that is similar to what you want. – Ilya V. Schurov Dec 12 '16 at 11:28

4 Answers4

9

tl;dr

The code presented below is now included in the package infinite on PyPI. So now you can actually just pip install infinite before running this:

from itertools import count
from infinite import product

for x, y in product(count(0), count(0)):
    print(x, y)
    if (x, y) == (3, 3):
        break

The lazy solution

If you don't care about order, since the generators are infinite, a valid output would be:

(a0, b1), (a0, b2), (a0, b3), ... (a0, bn), ...

So you can just take the first element from the first generator and then loop over the second.

If you really want to do it, you need a nested loop, and you need to duplicate the nested generator with tee, otherwise you will not be able to loop over it a second time (even if it doesn't matter because you will never exhaust the generator, so you will never need to loop over).

But if you really really really want to do it, here you have it:

from itertools import tee

def product(gen1, gen2):
    for elem1 in gen1:
        gen2, gen2_copy = tee(gen2)
        for elem2 in gen2_copy:
            yield (elem1, elem2)

The idea is to always make a single copy of gen2. Try it with finite generators first.

print(list(product(range(3), range(3,5))))
[(0, 3), (0, 4), (1, 3), (1, 4), (2, 3), (2, 4)]

Then with infinite ones:

print(next(product(count(1), count(1))))
(1, 1)

The zig-zag algorithm

As noted by others in comments (and as stated in the previous solution), this will not produce all the combinations. Nevertheless the mapping between natural numbers and pairs of numbers exists, so it must be possible to iterate the pairs in a different way, so that looking for a specific pair (without infinite numbers) can be done in finite time, you need the zig-zag scanning algorithm.

zig-zag scanning algorithm

In order to do it you need to cache previous values, so I created a class GenCacher to cache previously extracted values:

class GenCacher:
    def __init__(self, generator):
        self._g = generator
        self._cache = []

    def __getitem__(self, idx):
        while len(self._cache) <= idx:
            self._cache.append(next(self._g))
        return self._cache[idx]

After that you just need to implement the zig-zag algorithm:

def product(gen1, gen2):
    gc1 = GenCacher(gen1)
    gc2 = GenCacher(gen2)
    idx1 = idx2 = 0
    moving_up = True

    while True:
        yield (gc1[idx1], gc2[idx2])

        if moving_up and idx1 == 0:
            idx2 += 1
            moving_up = False
        elif not moving_up and idx2 == 0:
            idx1 += 1
            moving_up = True
        elif moving_up:
            idx1, idx2 = idx1 - 1, idx2 + 1
        else:
            idx1, idx2 = idx1 + 1, idx2 - 1

Example

from itertools import count

for x, y in product(count(0), count(0)):
    print(x, y)
    if x == 2 and y == 2:
        break

This produces the following output:

0 0
0 1
1 0
2 0
1 1
0 2
0 3
1 2
2 1
3 0
4 0
3 1
2 2

Extend the solution to more than 2 generators

We can edit the solution a bit to make it work even for multiple generators. The basic idea is:

  1. iterate over the distance from (0,0) (the sum of the indexes). (0,0) is the only one with distance 0, (1,0) and (0,1) are at distance 1, etc.

  2. generate all the tuples of indexes for that distance

  3. extract the corresponding element

We still need the GenCacher class, but the code becomes:

def summations(sumTo, n=2):
    if n == 1:
        yield (sumTo,)
    else:
        for head in xrange(sumTo + 1):
            for tail in summations(sumTo - head, n - 1):
                yield (head,) + tail

def product(*gens):
    gens = map(GenCacher, gens)

    for dist in count(0):
        for idxs in summations(dist, len(gens)):
            yield tuple(gen[idx] for gen, idx in zip(gens, idxs))
Community
  • 1
  • 1
enrico.bacis
  • 30,497
  • 10
  • 86
  • 115
  • This doesn't work as not every combination will eventually be generated – muddyfish Dec 12 '16 at 11:00
  • 3
    In no case they will eventually be generated. You are dealing with infinite. You should specify the order, otherwise any solution is acceptable. I suggest you a zig-zag order – enrico.bacis Dec 12 '16 at 11:00
  • I tried that but that requires duplicating generators an infinite amount of times which `itertools.tee` doesn't appear to be able to do – muddyfish Dec 12 '16 at 11:02
  • @muddyfish I added the code that *eventually* lists them all. – enrico.bacis Dec 12 '16 at 11:14
  • Are you sure that will list them all? It looks like the second loop will never finish – muddyfish Dec 12 '16 at 11:23
  • @muddyfish of course it doesn't. That's what I tried to explain to you. They are infinite generators. You will not list them all. Ever. So if you don't care about the order, your question does not make sense. – enrico.bacis Dec 12 '16 at 11:24
  • @enrico.bacis, that's not correct. For zig-zag enumeration, for eny element in the product set there exists some natural number N such that this element is N'th in the sequence. In your example, there's no such N for most of the elements. – Ilya V. Schurov Dec 12 '16 at 11:26
  • @enrico.bacis: there is a way to do it that makes sure every given (elem1, elem2) has a finite time before it is generated. In your solution, that's not the case. – RemcoGerlich Dec 12 '16 at 11:26
  • I totally agree with both of you. But the OP just asked for: "*as long as given infinite time, all combinations will be eventually generated*". So my answer wanted to underline how poor the request was without specifying the order. – enrico.bacis Dec 12 '16 at 11:32
  • Well there are combinations (any that use an element of gen1 other than the first) that are never generated, not even after infinite time. – RemcoGerlich Dec 12 '16 at 11:36
  • @RemcoGerlish, It smells like [transfinite numbers](https://en.wikipedia.org/wiki/Transfinite_number) :) – Ilya V. Schurov Dec 12 '16 at 11:47
  • 1
    @all check the edit, I implemented the zig-zag algorithm and now it works as expected. – enrico.bacis Dec 12 '16 at 11:56
  • @muddyfish Now it should perform as you expected – enrico.bacis Dec 12 '16 at 12:04
  • @muddyfish if the solution satisfies your requirements consider marking it as accepted. Otherwise, please express your doubts, I find the problem quite interesting. – enrico.bacis Dec 12 '16 at 12:38
  • I normally wait at least a week before accepting an answer but this solution seems fine to me – muddyfish Dec 12 '16 at 13:03
  • @muddyfish fine, btw I also extended the solution to work with more than 2 generators. – enrico.bacis Dec 12 '16 at 13:19
  • @muddyfish Check the tldr ;) – enrico.bacis Dec 12 '16 at 15:37
  • Huh. Nice to know it exists – muddyfish Dec 12 '16 at 16:49
  • @muddyfish well, it didn't exist before your question, I just created it. – enrico.bacis Dec 12 '16 at 17:05
1
 def product(a, b):
     a, a_copy = itertools.tee(a, 2)
     b, b_copy = itertools.tee(b, 2)
     yield (next(a_copy), next(b_copy))
     size = 1
     while 1:
         next_a = next(a_copy)
         next_b = next(b_copy)
         a, new_a = itertools.tee(a, 2)
         b, new_b = itertools.tee(b, 2)
         yield from ((next(new_a), next_b) for i in range(size))
         yield from ((next_a, next(new_b)) for i in range(size))
         yield (next_a, next_b)
         size += 1

A homebrew solution with itertools.tee. This uses lots of memory as intermediate states are stored in tee

This effectively returns the sides of an ever expanding square:

0 1 4 9 
2 3 5 a
6 7 8 b
c d e f

Given infinite time and infinite memory, this implementation should return all possible products.

muddyfish
  • 3,530
  • 30
  • 37
0

No matter how you do it, memory will grow a lot, as every value from each iterator will occur an infinite number of times after the first time, so it has to be kept around in some growing variable.

So something like this may work:

def product(i, j):
    """Generate Cartesian product i x j; potentially uses a lot of memory."""
    earlier_values_i = []
    earlier_values_j = []

    # If either of these fails, that sequence is empty, and so is the
    # expected result. So it is correct that StopIteration is raised,
    # no need to do anything.
    next_i = next(i)
    next_j = next(j)
    found_i = found_j = True

    while True:
        if found_i and found_j:
            yield (next_i, next_j)
        elif not found_i and not found_j:
            break  # Both sequences empty

        if found_i: 
            for jj in earlier_values_j:
                yield (next_i, jj)
        if found_j:
            for ii in earlier_values_i:
                yield (ii, next_j)

        if found_i:
            earlier_values_i.append(next_i)
        if found_j:
            earlier_values_j.append(next_j)

        try:
            next_i = next(i)
            found_i = True
        except StopIteration:
            found_i = False

        try:
            next_j = next(j)
            found_j = True
        except StopIteration:
            found_j = False

This was so simple in my head but it looks horribly complicated after typing it out, there must be some simpler way. But I think it will work.

RemcoGerlich
  • 30,470
  • 6
  • 61
  • 79
-1

You could siply use a generator expresion:

from itertools import *
i = count(1)
j = count(1)

for e in ((x, y) for x in i for y in j):
    yield r

or in python3:

yield from ((x, y) for x in i for y in j)
Netwave
  • 40,134
  • 6
  • 50
  • 93