I am trying to parallelize a function inside a pool using the multiprocessing module, but I run into the error:

daemonic processes are not allowed to have children

More specifically, I am using the emcee module, which relies on the multiprocessing module for parallelization, and I would like to parallelize my posterior function as well to speed up the calculations. Is there a way to parallelize a function inside the main Pool in this case?

Edit (code added):

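from multiprocessing import Pool

import emcee
import numpy as np
from scipy.integrate import quad

# ngal, Integrand, nwalkers and npars are defined elsewhere in the script

# defines the log-posterior probability distribution
def logp(p):
    mu_imf, mu_h0, sigma_h, b_h = p

    logpost = np.zeros(ngal, dtype='longdouble')

    # one numerical integral over the halo mass per galaxy
    for i in range(ngal):
        logpost[i] = np.log(quad(lambda m_halo: Integrand(m_halo, i, mu_imf, mu_h0, sigma_h, b_h), 11.9, 15.0107)[0])

    return np.sum(logpost)

with Pool() as pool:
    sampler = emcee.EnsembleSampler(nwalkers, npars, logp, pool=pool)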

1 Answer

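In general you don't want to do that. Parallelising processes that are already parallel can explode very quickly: if every worker in a pool starts a pool of its own, the total number of processes is the product of the two pool sizes. It is also unlikely to be much quicker, as there is an overhead in spinning up each new process.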
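
As for the error itself: the workers of a `multiprocessing.Pool` are started as daemonic processes, and a daemonic process is not allowed to start children, so a second `Pool` created inside a worker fails. A minimal sketch that just prints the flag (the name `report_daemon_flag` is made up for illustration):

```python
import multiprocessing as mp


def report_daemon_flag(_):
    """Return whether the process executing this call is daemonic."""
    return mp.current_process().daemon


if __name__ == "__main__":
    # Called directly from the parent process.
    print("parent:", report_daemon_flag(None))

    # Called inside Pool workers, which multiprocessing marks as daemonic.
    # This flag is what blocks a nested Pool from starting.
    with mp.Pool(2) as pool:
        print("workers:", pool.map(report_daemon_flag, range(2)))
```

Run as a script, the parent reports `False` and each worker reports `True`.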

Rather than trying to parallelise everything at once, parallelise in two steps. First generate all the emcee tasks you are interested in, and parallelise your processing, as in the docs. Then aggregate your results to a queue, and map your posterior function over that.

I.e., rather than parallelising something like your current setup:

def do_work(**params):
    some_results = do_emcee_call_in_parallel(**params)
    do_stuff_with_some_results(some_results)

split your problem into two chunks and do this:

from multiprocessing import Pool

problems = get_all_problems()
results = parallel_solve_emcee_problems(problems)
with Pool() as pool:
    posteriors = pool.map(posterior, results)  # etc
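
To make the pattern concrete, here is a rough, runnable sketch of the same idea. `solve_problem`, `posterior` and `run_two_stages` are hypothetical stand-ins with toy arithmetic, not anything from emcee or from your code. The point is only that the two stages run one after the other with the same map function, so no pool is ever created inside a worker:

```python
from multiprocessing import Pool


def solve_problem(problem):
    """Hypothetical stand-in for one expensive first-stage task."""
    return problem ** 2


def posterior(result):
    """Hypothetical stand-in for the second-stage function."""
    return result + 1.0


def run_two_stages(problems, map_fn=map):
    """Run both stages in sequence using the same map function.

    map_fn may be the built-in map or the map method of a pool, so each
    stage can be parallel without the stages being nested.
    """
    results = list(map_fn(solve_problem, problems))
    return list(map_fn(posterior, results))


if __name__ == "__main__":
    problems = [1.0, 2.0, 3.0]
    # One pool, reused for both stages; workers never start workers.
    with Pool(2) as pool:
        print(run_two_stages(problems, pool.map))
```

Accepting a `map_fn` argument also lets you check the logic serially with the built-in `map` before moving to processes.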

Alternatively, if I've misunderstood what you're trying to do, feel free to show what your desired inputs and outputs are.

2e0byo
  • @2e0byo Thank you very much for your help. I have been trying to work out how to implement what you describe, but I am still not sure how to do it. The problem is that I have to pass the function **logp** to **emcee.EnsembleSampler**. However, **logp** is computationally expensive and contains an unavoidable for loop, so I would like to parallelize it. I have now added the **logp** function, and how `emcee` uses it, to my question. – havij farangi Sep 21 '21 at 19:59