
I have multiple generators, each of which needs some time for initialization before it can yield its first result. The code below is a minimal example: each generator needs 5 seconds for initialization, so the total time is 10 s. Is there a way to initialize g1 and g2 in parallel, so that the total initialization time is only 5 s?

from random import random
from time import sleep


def my_generator():
    sleep(5)
    for i in range(5):
        yield random()

# this is what I want to do in parallel
g1 = my_generator()
g2 = my_generator()

x = [(r1, r2) for r1, r2 in zip(g1, g2)]
    What's really happening during that five seconds? Are you waiting for an external event or a URL? Is your code dealing with a complex calculation? Are the generators completely stand-alone, or do the values they generate depend on other ongoing state? All this can affect the answer. – Frank Yellin Oct 23 '20 at 20:34
  • @FrankYellin The generators are independent. Most of the initialization time is spent on database requests over large amounts of data; the calculations themselves are only mildly complex. – McDizzy Oct 23 '20 at 20:56
  • Sounds like [this](https://stackoverflow.com/a/52376841/9059420) might work for you then. – Darkonaut Oct 23 '20 at 21:20
  • @Darkonaut Thank you very much, but that's not exactly what I need. The problem is that I have to preserve the generators' yield order: I have to process the responses of `g1` and `g2` at the same time. I think if the example above can be executed in parallel, my problem is solved. – McDizzy Oct 23 '20 at 21:48
  • I'm not sure if I really understand what you mean, but you can try [this](https://stackoverflow.com/a/64296722/9059420) for truly parallel execution. Just comment out `results.sort()`; calling it with your example would be `list(parallel_gen(gen_func=my_generator, gen_args_tuples=[()] * 2))`. But you'll pay a price in overhead for IPC during the yielding. – Darkonaut Oct 23 '20 at 22:06

1 Answer


I found a solution with async, using the third-party `aiostream` library (`pip install aiostream`):

import asyncio
from random import random
from aiostream import stream

async def my_generator():
    # Simulated initialization; awaiting here yields control to the
    # event loop, so the other generator's setup can run in the meantime.
    await asyncio.sleep(5)
    for i in range(5):
        yield random()

async def main():
    combine = stream.zip(my_generator(), my_generator())
    x = []
    async with combine.stream() as streamer:
        async for item in streamer:
            x.append(item)
    print(x)

asyncio.run(main())
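If converting the generators to async isn't an option, the initialization can also be overlapped with threads, keeping the original synchronous generators. This is a sketch, not from the original post: the idea is to advance each generator to its first yield in a thread pool, then re-attach the first value with `itertools.chain`. It assumes the expensive setup releases the GIL (true for `sleep` and for typical database I/O). The `prime` helper is a name I introduce here for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import chain
from random import random
from time import sleep


def my_generator():
    sleep(5)  # expensive initialization (e.g. database requests)
    for _ in range(5):
        yield random()


def prime(gen):
    # next() forces the generator body to run up to its first yield,
    # which is where the 5-second setup happens. chain() puts the
    # consumed first value back in front of the remaining items,
    # preserving the yield order.
    first = next(gen)
    return chain([first], gen)


# Prime both generators concurrently: the two 5-second setups overlap,
# so this takes ~5 s instead of ~10 s.
with ThreadPoolExecutor() as pool:
    g1, g2 = pool.map(prime, [my_generator(), my_generator()])

x = list(zip(g1, g2))
```

After priming, `g1` and `g2` behave like ordinary iterators, so the original `zip` consumption loop works unchanged.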