I have a question about my understanding of Python's multiprocessing library:
Why do processes that are started (almost) simultaneously appear to execute serially instead of in parallel?
The task is to manage a universe of a large number of particles (a particle being a set of x/y/z coordinates and a mass) and perform various analyses on them while taking advantage of a multi-processor environment. Specifically, for the example shown below, I want to calculate the center of mass of all particles.
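(For reference: the center of mass is the mass-weighted average of the positions, computed per coordinate, i.e. center_x = sum(m_i * x_i) / sum(m_i), and likewise for y and z.)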
Because the task specifically says to use multiple processors, I didn't use the threading library, since the GIL (global interpreter lock) constrains the execution of Python bytecode to one processor at a time.
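As a quick sanity check of that assumption (my own side experiment, not part of the exercise), a CPU-bound loop speeds up with two Processes but not with two Threads on a dual-core machine:

from threading import Thread
from multiprocessing import Process
from time import time

def burn():
    #purely CPU-bound work, no I/O, so the GIL dominates for threads
    total = 0
    for i in range(10**7):
        total += i

if __name__ == '__main__':
    for worker_type in (Thread, Process):
        start = time()
        workers = [worker_type(target=burn) for _ in range(2)]
        for w in workers:
            w.start()
        for w in workers:
            w.join()
        print(worker_type.__name__, round(time() - start, 2), 'seconds')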
Here's my code:
from multiprocessing import Process, Lock, Array, Value
from random import random
import math
from time import time
def exercise2(noOfParticles, noOfProcs):
    startingTime = time()
    particles = []
    processes = []
    centerCoords = Array('d', [0, 0, 0])
    totalMass = Value('d', 0)
    lock = Lock()

    #create all particles
    for i in range(noOfParticles):
        p = Particle()
        particles.append(p)

    for i in range(noOfProcs):
        #determine the number of particles every process needs to analyse
        particlesPerProcess = math.ceil(noOfParticles / noOfProcs)
        #create noOfProcs Processes, each with a different set of particles
        p = Process(target=processBatch, args=(
                particles[i*particlesPerProcess:(i+1)*particlesPerProcess],
                centerCoords,       #handle to shared memory
                totalMass,          #handle to shared memory
                lock,               #handle to lock
                'batch' + str(i)),  #also pass name of process for easier logging
            name='batch' + str(i))
        processes.append(p)
        print('created proc:', i)

    #start all processes
    for p in processes:
        p.start()  #here, the program waits for the started process to terminate. why?

    #wait for all processes to finish
    for p in processes:
        p.join()

    #normalize the coordinates
    centerCoords[0] /= totalMass.value
    centerCoords[1] /= totalMass.value
    centerCoords[2] /= totalMass.value
    print(centerCoords[:])
    print('total time used', time() - startingTime, ' seconds')

class Particle():
    """a particle is a very simple physical object, having a set of x/y/z coordinates and a mass.
    All values are randomly set at initialization of the object"""

    def __init__(self):
        self.x = random() * 1000
        self.y = random() * 1000
        self.z = random() * 1000
        self.m = random() * 10

    def printProperties(self):
        attrs = vars(self)
        print('\n'.join("%s: %s" % item for item in attrs.items()))

def processBatch(particles, centerCoords, totalMass, lock, name):
    """calculates the mass-weighted sum of all coordinates of all particles as well as the sum of all masses.
    Writes the results into the shared memory centerCoords and totalMass, using lock"""
    print(name, ' started')
    mass = 0
    centerX = 0
    centerY = 0
    centerZ = 0
    for p in particles:
        centerX += p.m * p.x
        centerY += p.m * p.y
        centerZ += p.m * p.z
        mass += p.m
    with lock:
        centerCoords[0] += centerX
        centerCoords[1] += centerY
        centerCoords[2] += centerZ
        totalMass.value += mass
    print(name, ' ended')

if __name__ == '__main__':
    exercise2(2**16, 6)
Now I'd expect all processes to start at about the same time and execute in parallel. But the program's output looks as if the processes were executing serially:
created proc: 0
created proc: 1
created proc: 2
created proc: 3
created proc: 4
created proc: 5
batch0 started
batch0 ended
batch1 started
batch1 ended
batch2 started
batch2 ended
batch3 started
batch3 ended
batch4 started
batch4 ended
batch5 started
batch5 ended
[499.72234074100135, 497.26586187539453, 498.9208784328791]
total time used 4.7220001220703125 seconds
Also, when stepping through the program with the Eclipse debugger, I can see that it always waits for one process to terminate before starting the next one, at the line marked with the comment ending in 'why?'. Of course, this might just be an artifact of the debugger, but the output produced in a normal run shows exactly the same picture.
- Are those processes executing in parallel and I just can't see it because of how they share stdout?
- If the processes are executing serially: why? And how can I make them run in parallel?
Any help on understanding this is greatly appreciated.
I executed the above code from PyDev and from the command line, using Python 3.2.3 on a Windows 7 machine with a dual-core Intel processor.
Edit:
The program's output misled me about the problem: the processes actually are running in parallel, but the overhead of pickling the large particle list and sending it to each subprocess takes so long that it completely distorts the picture. (On Windows there is no fork, so Process.start() pickles the arguments and sends them to the child process before it can begin, which is why start() appeared to block.)
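A quick way to make that overhead visible (my own check, added afterwards; exact sizes and timings will vary by machine) is to time the pickling of the particle list directly:

import pickle
from random import random
from time import time

class Particle:
    def __init__(self):
        self.x = random() * 1000
        self.y = random() * 1000
        self.z = random() * 1000
        self.m = random() * 10

particles = [Particle() for _ in range(2**16)]
start = time()
data = pickle.dumps(particles)  #this is roughly what each Process.start() pays for its slice
print('pickled', len(data), 'bytes in', round(time() - start, 3), 'seconds')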
Moving the creation of the particles (i.e. the data) into the subprocesses, so that the data never has to be pickled in the first place, removed the problem and resulted in useful, parallel execution of the program.
To solve the task, I will therefore keep the particles in shared memory so they don't have to be passed to the subprocesses at all; a sketch follows below.
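Here is a minimal sketch of the shared-memory variant I have in mind; the name fillAndSum and the flat double-array layout are my own choices, not necessarily the final solution:

from multiprocessing import Process, Array, Value, Lock
from random import random
import math

def fillAndSum(coords, masses, start, end, centerCoords, totalMass, lock):
    #each worker creates its own slice of particles directly in shared memory,
    #so no particle data ever has to be pickled
    cx = cy = cz = m = 0
    for i in range(start, end):
        coords[3*i]     = random() * 1000
        coords[3*i + 1] = random() * 1000
        coords[3*i + 2] = random() * 1000
        masses[i]       = random() * 10
        cx += masses[i] * coords[3*i]
        cy += masses[i] * coords[3*i + 1]
        cz += masses[i] * coords[3*i + 2]
        m  += masses[i]
    with lock:
        centerCoords[0] += cx
        centerCoords[1] += cy
        centerCoords[2] += cz
        totalMass.value += m

if __name__ == '__main__':
    n, procs = 2**16, 6
    coords = Array('d', 3 * n, lock=False)  #shared x/y/z triplets; workers write disjoint slices
    masses = Array('d', n, lock=False)      #shared masses
    centerCoords = Array('d', [0, 0, 0])
    totalMass = Value('d', 0)
    lock = Lock()
    step = math.ceil(n / procs)
    ps = [Process(target=fillAndSum,
                  args=(coords, masses, i * step, min((i + 1) * step, n),
                        centerCoords, totalMass, lock))
          for i in range(procs)]
    for p in ps:
        p.start()  #only the small shared-memory handles are pickled now
    for p in ps:
        p.join()
    print([c / totalMass.value for c in centerCoords])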