I've got some code that uses a Pool from Python's multiprocessing module. Performance isn't what I expect, so I wanted to profile the code to figure out what's happening. The problem I'm having is that each job overwrites the profiling output, so I can't accumulate a meaningful set of stats.
For example, with:
import multiprocessing as mp
import cProfile
import time
import random

def work(i):
    x = random.random()
    time.sleep(x)
    return (i, x)

def work_(args):
    # Profile a single call to work() and dump the stats to a file
    # named after this worker process.
    out = [None]
    cProfile.runctx('out[0] = work(args)', globals(), locals(),
                    'profile-%s.out' % mp.current_process().name)
    return out[0]
if __name__ == '__main__':
    pool = mp.Pool(10)
    for i in pool.imap_unordered(work_, range(100)):
        print(i)
Each call to runctx overwrites that worker's file, so I only get stats on the "last" job each worker ran, which may not be the most computationally demanding one. I presume I need to accumulate the stats somewhere in each worker and only write them out when the pool is being cleaned up.
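Something along these lines is what I have in mind. This is an untested sketch: giving each worker its own Profile object via a Pool initializer, and re-dumping the accumulated stats after every job so the final write covers everything, are my own guesses rather than anything I've confirmed works well.

import multiprocessing as mp
import cProfile
import time
import random

profiler = None  # one cProfile.Profile per worker, set by init_worker()

def init_worker():
    # Create a profiler that lives for the whole worker process,
    # not just a single job.
    global profiler
    profiler = cProfile.Profile()

def work(i):
    x = random.random()
    time.sleep(x)
    return (i, x)

def work_(i):
    profiler.enable()
    try:
        return work(i)
    finally:
        profiler.disable()
        # Rewrite this worker's stats file after every job. It is still
        # overwritten each time, but always with the accumulated stats,
        # so the final version covers every job the worker ran.
        profiler.dump_stats('profile-%s.out' % mp.current_process().name)

if __name__ == '__main__':
    pool = mp.Pool(10, initializer=init_worker)
    for i in pool.imap_unordered(work_, range(100)):
        print(i)

Re-dumping after every job feels wasteful, though. Is there a cleaner way to hook into worker shutdown so the stats only get written once per worker?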