I'm new to Python multiprocessing and I'm trying to implement some parallel calculations. I've read that this:
# M is an integer: the number of processes I'd like to launch.
results = []
for i in range(M):
    p = Process(target=processchild, args=(data[i], q))
    p.start()
    results.append(q.get())
    p.join()
is still sequential, because .join() makes the loop wait until p has finished before starting the next process. I've read in an answer here that
You'll either want to join your processes individually outside of your for loop (e.g., by storing them in a list and then iterating over it)...
So if I modified my code to
results = []
processes = []
for i in range(M):
    processes.append(Process(target=processchild, args=(data[i], q)))
    processes[i].start()
    results.append(q.get())
for i in range(M):
    processes[i].join()
Would it actually run in parallel now? If not, how can I modify my code so that it does? I've read the solution using multiprocessing.Pool and apply_async posted as an answer to the question I linked above, so I'm mostly interested in a solution that doesn't use those.