The task is to parallelize information retrieval for clusters. get_one_mc_retrievals retrieves information for a given mention candidate (mc); under the hood it invokes pairwise_distances from sklearn, which supports multiple CPUs. Even so, the resources were still heavily under-used, so I implemented the snippet of code below.
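For reference, here is a minimal sketch of the multi-CPU path I am relying on; X and Y are placeholder arrays, not the actual data that get_one_mc_retrievals works with:

import numpy as np
from sklearn.metrics import pairwise_distances

X = np.random.rand(100, 64)   # placeholder query vectors
Y = np.random.rand(5000, 64)  # placeholder corpus vectors

# n_jobs=-1 spreads the distance computation across all CPUs
D = pairwise_distances(X, Y, metric="cosine", n_jobs=-1)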
My implementation:
import multiprocessing as mp
from functools import partial

def dev_get_sentence_retrieval(sent_mcs, num_retrieval):
    # bind the retrieval count so the pool only has to map over the mcs
    func = partial(MentionToRetrieval.get_one_mc_retrievals, num_retrieval=num_retrieval)
    pool = mp.Pool(mp.cpu_count())
    pool.daemon = True
    # dispatch one async map per word position in the sentence
    jobs = [pool.map_async(func, sent_mcs[i]) for i in range(len(sent_mcs))]
    res = [j.get() for j in jobs]
    pool.close()
    pool.join()
    return res
Problem
# test 1
for i in range(5):
    res = dev_get_sentence_retrieval(conll_mcs[i], 20)
    print(res)

# test 2
for i in range(5, 10):
    res = dev_get_sentence_retrieval(conll_mcs[i], 20)
    print(res)
Test 1 runs smoothly, but test 2 hangs forever. A couple of SO posts point to pool.join() as the solution, since it reaps zombie processes, but that does not seem to work in my case. Did I miss anything in the code? Please also feel free to suggest better ways to process conll_mcs in parallel.
Note:
# structure of `conll_mcs`
conll_mcs
|--- sentence_mcs
|--- word_position_mcs
|--- mc
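For concreteness, a hypothetical instance of that nesting (the mc objects are shown as strings here):

conll_mcs = [                  # one entry per sentence
    [                          # sentence_mcs: one entry per word position
        ["mc_0", "mc_1"],      # word_position_mcs: list of mention candidates
        ["mc_2"],
    ],
    # ... more sentences
]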