
I am wondering if anyone has experience parallelizing a for loop in which each iteration is itself a parallelized function using Python's multiprocessing library. I tried to parallelize the loop with multiprocessing as well, but the library does not seem to allow a child worker to create its own multiprocessing jobs. Should I consider other parallel libraries in Python? Below is the code for a single iteration (a minimal reproduction of the nested failure follows it); the train dataframe changes from iteration to iteration. Thanks.

import time
from itertools import repeat
from multiprocessing import Pool

params = {}
pool = Pool(64)
print('---start params calculation---')
train_parallel = train[parallel_cols].copy()
start = time.time()
# one cal_params call per element of daily_limits; the other three
# arguments are repeated for every call
params = pool.starmap(proc.cal_params,
                      zip(repeat(train_parallel), repeat(policy_segs),
                          repeat(day_to_cum), daily_limits))
print(time.time() - start)
# merge the list of per-call dicts into a single dict
params = {key: value for element in params for key, value in element.items()}
print('---end params calculation---')
pool.close()
pool.join()
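
For reference, the failure is easy to reproduce with a minimal nested-pool sketch (square, run_iteration, and chunks below are illustrative stand-ins for cal_params and the per-iteration data, not names from the code above). The outer pool's workers are daemonic, so the inner Pool raises AssertionError: daemonic processes are not allowed to have children.

from multiprocessing import Pool

def square(x):
    return x * x

def run_iteration(chunk):
    # each outer task tries to open its own pool, mirroring the snippet above
    with Pool(2) as inner:
        return inner.map(square, chunk)

if __name__ == '__main__':
    chunks = [[1, 2, 3], [4, 5, 6]]
    with Pool(2) as outer:
        # fails: pool workers are daemon processes and may not have children
        print(outer.map(run_iteration, chunks))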
  • Each process in a multiprocessing pool is a *daemon* process, and daemon processes cannot create new processes. This is a duplicate of another post I have seen somewhere. I think there is a solution that subclasses `multiprocessing.Pool` to override the default creation of processes with the daemon flag set (sketched after these comments). See [Python Process Pool non-daemonic?](https://stackoverflow.com/questions/6974695/python-process-pool-non-daemonic). – Booboo Nov 08 '21 at 17:10
  • Read all the comments to the posted answers. – Booboo Nov 08 '21 at 17:17
  • 1
    Does this answer your question? [Python Process Pool non-daemonic?](https://stackoverflow.com/questions/6974695/python-process-pool-non-daemonic) – Booboo Nov 08 '21 at 17:17
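
The workaround referenced in the comments above is a pool whose workers are not flagged as daemons, so each worker may create its own inner pool. A minimal sketch, adapted from the answer to the linked question; the names NoDaemonProcess, NoDaemonContext, and NestablePool are illustrative, not part of the standard library:

import multiprocessing
import multiprocessing.pool

class NoDaemonProcess(multiprocessing.Process):
    # always report daemon=False so the pool's workers may spawn children
    @property
    def daemon(self):
        return False

    @daemon.setter
    def daemon(self, value):
        pass  # ignore the daemon flag the pool tries to set

class NoDaemonContext(type(multiprocessing.get_context())):
    Process = NoDaemonProcess

class NestablePool(multiprocessing.pool.Pool):
    # a Pool built on the non-daemonic context above
    def __init__(self, *args, **kwargs):
        kwargs['context'] = NoDaemonContext()
        super().__init__(*args, **kwargs)

NestablePool can then replace Pool for the outer loop. Since worker counts multiply (each outer worker would start its own 64-process inner pool), the outer and inner pool sizes should be chosen together to avoid oversubscribing the CPU.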

0 Answers