I generate data with a generator (the real data is memory-intensive, although that is not the case in this dummy example) and then have to run some calculations on that data. Since the calculations take much longer than generating the data, I want to run them in parallel. Here is the code I wrote (with dummy functions for simplicity):

from math import sqrt
from multiprocessing import Pool


def fibonacci(number_iter):
    i = 0
    j = 1
    for round in range(number_iter):
        yield i
        k = i + j
        i, j = j, k


def factors(n):
    f = set()
    for i in range(1, n+1, 1):
        if n % i == 0:
            f.add(i)
    return f


if __name__ == "__main__":
    pool = Pool()
    results = pool.map(factors, fibonacci(45))

I learnt from other questions (see here and here) that pool.map consumes the iterator fully. I want to avoid that because it uses a prohibitive amount of memory (that is why I am using a generator in the first place!).
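
To make the problem concrete, here is a minimal sketch of the eager consumption described above. It rests on some assumptions: it uses `ThreadPool` from `multiprocessing.pool` (a subclass of `Pool` that inherits the same `map`) instead of `Pool`, only so that the demo can share plain Python lists with the workers. The names `counting_source` and `items_produced_before_first_task` are hypothetical helpers made up for this illustration.

```python
from multiprocessing.pool import ThreadPool


def counting_source(n, produced):
    """Yield 0..n-1, recording each item in `produced` as it is generated."""
    for i in range(n):
        produced.append(i)
        yield i


def items_produced_before_first_task(n=100):
    """Return (fewest items already generated when any task ran, results)."""
    produced = []   # filled in by the generator
    observed = []   # filled in by the workers

    def work(x):
        # Record how much of the generator had been consumed at call time.
        observed.append(len(produced))
        return x * x

    with ThreadPool(2) as pool:
        results = pool.map(work, counting_source(n, produced))
    return min(observed), results


if __name__ == "__main__":
    earliest, _ = items_produced_before_first_task(100)
    print(earliest)
```

If `map` were lazy, the earliest task would see only a handful of generated items. Here every task sees the full input already generated, which is the memory problem I am trying to avoid.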
How can I do this while iterating lazily over my generator? The answers to the questions mentioned above have not helped.