I have been trying to use Python's multiprocessing package to speed up some physics simulations by taking advantage of my computer's multiple cores.
I noticed that at most 3 of the 12 cores are used when I run my simulation. When I start the simulation it initially uses 3 cores, and after a while it drops to 1. Sometimes only one or two cores are used from the start. I have not been able to figure out why: I change essentially nothing between runs, apart from closing a few terminal windows that have no active processes. The OS is Red Hat Enterprise Linux 6.0 and the Python version is 2.6.5.
I experimented by varying the number of chunks (between 2 and 120) into which the work is split (i.e. the number of processes that are created), but this seems to have no effect.
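To make the chunk count easy to vary, the split can be pulled into a small helper. The following is only a sketch: `split_range` and `serial_sum` are hypothetical names that do not appear in my code, and `serial_sum` merely stands in for the real `mc_sim`. The helper clamps the last range, so the pieces still cover the work exactly when the total is not divisible by the chunk count.

```python
def split_range(total_steps, n_chunks):
    """Return (start, end) pairs covering range(total_steps) in at most n_chunks pieces."""
    chunk = -(-total_steps // n_chunks)  # ceiling division using integers only
    return [(start, min(start + chunk, total_steps))
            for start in range(0, total_steps, chunk)]


def serial_sum(start, end):
    """Hypothetical stand-in for mc_sim: sum the integers in [start, end)."""
    return sum(range(start, end))


if __name__ == '__main__':
    total_steps = 10 ** 6  # assumption: smaller than 10**8 to keep the check fast
    for n_chunks in (2, 12, 120):
        bounds = split_range(total_steps, n_chunks)
        partial = sum(serial_sum(s, e) for s, e in bounds)
        print('%d chunks -> %d ranges, total %d' % (n_chunks, len(bounds), partial))
```

Each `(start, end)` pair could then be passed as the `args` of one `Process`, so changing the chunk count means changing a single number.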
I looked for info about this problem online and read through most of the related questions on this site (e.g. one, two) but could not find a solution.
(Edit: I just tried running the code under Windows 7, and it uses all available cores without any problem. I would still like to fix this on RHEL, though.)
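Since the same code behaves differently on the two systems, one thing that might be worth ruling out on the Linux side is a restricted CPU affinity mask for the Python process. This is only a guess at a possible cause, not a confirmed diagnosis. The sketch below assumes a Linux `/proc` filesystem whose `/proc/self/status` contains a `Cpus_allowed_list` line; the helper names `parse_cpu_list` and `allowed_cpus` are hypothetical.

```python
import multiprocessing


def parse_cpu_list(text):
    """Expand a kernel CPU list such as '0-2,5' into [0, 1, 2, 5]."""
    cpus = []
    for part in text.strip().split(','):
        if not part:
            continue
        if '-' in part:
            lo, hi = part.split('-')
            cpus.extend(range(int(lo), int(hi) + 1))
        else:
            cpus.append(int(part))
    return cpus


def allowed_cpus(status_path='/proc/self/status'):
    """Return the CPUs this process may run on, or None if that cannot be read."""
    try:
        with open(status_path) as fh:
            for line in fh:
                if line.startswith('Cpus_allowed_list:'):
                    return parse_cpu_list(line.split(':', 1)[1])
    except IOError:
        pass  # no /proc here (e.g. not Linux), so the answer is unknown
    return None


if __name__ == '__main__':
    print('cores reported: %d' % multiprocessing.cpu_count())
    print('cores allowed:  %s' % allowed_cpus())
```

If the allowed list is shorter than the reported core count, the subprocesses would inherit that restriction, which would be consistent with seeing only a few busy cores.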
Here's my code (with the physics left out):
from multiprocessing import Queue, Process, current_process


def f(q, start, end):  # a dummy function to be passed as target to Process
    q.put(mc_sim(start, end))


def mc_sim(start, end):  # this is where the 'physics' is
    p = current_process()
    print "starting", p.name, p.pid
    sum_ = 0
    for i in xrange(start, end):
        sum_ += i
    print "exiting", p.name, p.pid
    return sum_


def main():
    NP = 0  # number of processes
    total_steps = 10**8
    chunk = total_steps/10
    start = 0
    queue = Queue()
    subprocesses = []
    while start < total_steps:
        p = Process(target=f, args=(queue, start, start+chunk))
        NP += 1
        print 'delegated %s:%s to subprocess %s' % (start, start+chunk, NP)
        p.start()
        start += chunk
        subprocesses.append(p)
    total = 0
    for i in xrange(NP):
        total += queue.get()
    print "total is", total
    # two lines for consistency check:
    # alt_total = mc_sim(0, total_steps)
    # print "alternative total is", alt_total
    while subprocesses:
        subprocesses.pop().join()


if __name__ == '__main__':
    main()
(In fact the code is based on Alex Martelli's answer here.)
Edit 2: Eventually the problem resolved itself without my understanding how. I did not change the code, nor am I aware of having changed anything related to the OS. Despite that, all cores are now used when I run the code. Perhaps the problem will reappear later, but for now I have chosen not to investigate further, since it works. Thanks to everyone for the help.