How to interleave two iterators where x% of the samples come from one iterator, and (100-x)% come from the other

Question

Say that there are two iterators:

def genA():
    while True:
        yield 1

def genB():
    while True:
        yield 2

gA = genA()
gB = genB()

According to this SO answer, they can be interleaved evenly using the itertools recipes:

from itertools import islice  # needed by roundrobin below

def cycle(iterable):
    # cycle('ABCD') --> A B C D A B C D A B C D ...
    saved = []
    for element in iterable:
        yield element
        saved.append(element)
    while saved:
        for element in saved:
            yield element

def roundrobin(*iterables):
    "roundrobin('ABC', 'D', 'EF') --> A D E B F C"
    # Recipe credited to George Sakkis
    num_active = len(iterables)
    nexts = cycle(iter(it).__next__ for it in iterables)
    while num_active:
        try:
            for next in nexts:
                yield next()
        except StopIteration:
            # Remove the iterator we just exhausted from the cycle.
            num_active -= 1
            nexts = cycle(islice(nexts, num_active))

aa = roundrobin(gA, gB)
next(aa)

Each call to next(aa) alternates between the two iterators, so repeated calls produce 1, 2, 1, 2, 1, 2, 1: 50% of the values come from one iterator and 50% from the other.

I am wondering how to write this so that x% of the values come from one iterator and (100-x)% from the other. For example, 75% from the first iterator and 25% from the second.

So several calls to next(combinedIterator) will result in something like this:

1 1 1 2 1 1 1 2 1 1 1 2

For my purpose, it doesn't matter if the output is strictly ordered like above, or if it is random, with the output determined by probability.

Tomerikoo · Accepted Answer · 2020-11-10T10:08:32.497

If you're okay with a deterministic approach (as I understand from your self-answer), you can add an argument giving the fraction of items to take from the first iterator and then compute each iterator's "part". For example, if you want .75 from the first iterator, this translates to: for every three elements from iterator1, yield one element from iterator2.

def interleave(itt1, itt2, itt1_per):
    itt1_frac, total = itt1_per.as_integer_ratio()
    itt2_frac = total - itt1_frac
    while True:
        for _ in range(itt1_frac):
            yield next(itt1)
        for _ in range(itt2_frac):
            yield next(itt2)

newGen = interleave(gA, gB, .75)

for _ in range(12):
    print(next(newGen), end=' ')

This will print:

1 1 1 2 1 1 1 2 1 1 1 2

Watch out! This will only work well for "nice" fractions. For example: using this function with .6 means that for every 5,404,319,552,844,595 elements from iterator1, it will yield 3,602,879,701,896,397 elements from iterator2.
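
The huge numbers come straight from float.as_integer_ratio(), which returns the exact binary fraction stored for the float. A quick check, as a small sketch (the variable names nice and awkward are only illustrative):

```python
# .75 is exactly representable in binary, so its ratio is small.
nice = (0.75).as_integer_ratio()

# .6 is not; the stored value is the nearest fraction with a
# power-of-two denominator (2**53 in this case).
awkward = (0.6).as_integer_ratio()

print(nice)     # (3, 4)
print(awkward)  # (5404319552844595, 9007199254740992)
```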

One way to overcome this is to use decimal.Decimal with string arguments:

from decimal import Decimal

def interleave(itt1, itt2, itt1_per):
    itt1_frac, total = Decimal(str(itt1_per)).as_integer_ratio()
    ...

Using Decimal now means that passing .6 translates to the more sensible: for every three elements from iterator1, yield two elements from iterator2.

Using this revised code with .6 as the argument prints:

1 1 1 2 2 1 1 1 2 2 1 1
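
For reference, here is one way the two snippets above might be combined into a single runnable function. This is a sketch, not the answer's exact code: the name interleave_exact is made up for illustration, and the try/except is an added assumption so that finite iterators end the output cleanly (since Python 3.7, a StopIteration escaping a generator body becomes a RuntimeError).

```python
from decimal import Decimal
from itertools import islice


def interleave_exact(itt1, itt2, itt1_per):
    """Yield from two iterators so that a fraction itt1_per comes from itt1."""
    # str() gives the short decimal form of the float (e.g. "0.6"),
    # so the ratio becomes 3/5 instead of a huge binary fraction.
    itt1_frac, total = Decimal(str(itt1_per)).as_integer_ratio()
    itt2_frac = total - itt1_frac
    while True:
        try:
            for _ in range(itt1_frac):
                yield next(itt1)
            for _ in range(itt2_frac):
                yield next(itt2)
        except StopIteration:
            # Assumption: stop as soon as either input is exhausted.
            return


ones = iter([1] * 100)
twos = iter([2] * 100)
first_ten = list(islice(interleave_exact(ones, twos, .6), 10))
print(first_ten)  # [1, 1, 1, 2, 2, 1, 1, 1, 2, 2]
```

With the infinite generators from the question the except branch is never reached, so the behaviour matches the output shown above.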

score 0 · Answer 2 · answered Nov 10 '20 at 09:25

def genA():
    while True:
        yield 1

def genB():
    while True:
        yield 2

gA = genA()
gB = genB()

import random

def xyz(itt1, itt2):
    while True:
        if random.random() < .25:
            yield next(itt1)
        else:
            yield next(itt2)

newGen = xyz(gA, gB)

next(newGen)

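This works because random.random() is uniformly distributed on [0, 1), so each value comes from itt1 with probability .25 and from itt2 otherwise. I won't accept this as the answer yet, so that someone has the chance to post a non-probabilistic solution.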
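
The hard-coded .25 could also be turned into a parameter. The following is a sketch under a few assumptions: the names interleave_random, p_first and rng are invented here, the optional rng argument only exists to make runs reproducible, and the StopIteration handling is added so finite iterators end the output instead of raising a RuntimeError.

```python
import random
from itertools import islice, repeat


def interleave_random(itt1, itt2, p_first, rng=None):
    """Yield from itt1 with probability p_first, otherwise from itt2."""
    if rng is None:
        rng = random.Random()
    while True:
        # rng.random() is uniform on [0, 1), so this branch is taken
        # with probability p_first.
        source = itt1 if rng.random() < p_first else itt2
        try:
            yield next(source)
        except StopIteration:
            # Assumption: stop once the chosen iterator is exhausted.
            return


mixed = interleave_random(repeat(1), repeat(2), 0.75, random.Random(0))
sample = list(islice(mixed, 10_000))
share_of_ones = sample.count(1) / len(sample)
print(round(share_of_ones, 2))  # should be close to 0.75
```

Unlike the deterministic version, the proportion here only holds on average, so short runs can deviate noticeably from the target.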
