Python multiprocessing slower than single thread

Question

I have been playing around with multiprocessing problem and notice my algorithm is slower when I parallelizes it than when it is single thread.

In my code I don't share memory. And I'm pretty sure my algorithm (see code), which is just nested loops is CPU bound.

However, no matter what I do. The parallel code runs 10-20% slower on all my computers.

I also ran this on a 20 CPUs virtual machine and single thread beats multithread every times (even slower up there than my computer, actually).

from multiprocessing.dummy import Pool as ThreadPool
from multi import chunks
from random import random
import logging
import time
from multi import chunks

## Product two set of stuff we can iterate over
S = []
for x in range(100000):
  S.append({'value': x*random()})
H =[]
for x in range(255):
  H.append({'value': x*random()})

# the function for each thread
# just nested iteration
def doStuff(HH):
  R =[]
  for k in HH['S']:
    for h in HH['H']:
      R.append(k['value'] * h['value'])
  return R

# we will split the work
# between the worker thread and give it
# 5 item each to iterate over the big list
HChunks = chunks(H, 5)
XChunks = []

# turn them into dictionary, so i can pass in both
# S and H list
# Note: I do this because I'm not sure if I use the global
# S, will it spend too much time on cache synchronizatio or not
# the idea is that I dont want each thread to share anything.
for x in HChunks:
  XChunks.append({'H': x, 'S': S})

print("Process")
t0 = time.time()
pool = ThreadPool(4)
R = pool.map(doStuff, XChunks)
pool.close()
pool.join()

t1 = time.time()

# measured time for 4 threads is slower 
# than when i have this code just do 
# doStuff(..) in non-parallel way
# Why!?

total = t1-t0
print("Took", total, "secs")

There are many related question opened, but many are geared toward code being structured incorrectly - each worker being IO bound and such.

Possible duplicate of [multiprocessing.dummy in Python](http://stackoverflow.com/questions/26432411/multiprocessing-dummy-in-python) — MisterMiyagi, Jul 06 '16 at 06:46

MisterMiyagi · Accepted Answer · 2016-07-06T06:49:36.317

You are using multithreading, not multiprocessing. While many languages allow threads to run in parallel, python does not. A thread is just a separate state of control, i.e. it holds it own stack, current function, etc. The python interpreter just switches between executing each stack every now and then.

Basically, all threads are running on a single core. They will only speed up your program when you are not CPU bound.

multiprocessing.dummy replicates the API of multiprocessing but is no more than a wrapper around the threading module.

Multithreading is usually slower than single threading if you are CPU bound. This is because the work and processing resources stay the same, but you add overhead for managing the threads, e.g. switching between them.

How to fix this: instead of using from multiprocessing.dummy import Pool as ThreadPool do multiprocessing.Pool as ThreadPool.

You might want to read up on the GIL, the Global Interpreter Lock. It's what prevents threads from running in parallel (that and implications on single threaded performance). Python interpreters other than CPython may not have the GIL and be able to run multithreaded on several cores.

@40Plot No, Python is just, well, strange and antique in its approach — , Jul 06 '16 at 06:51

Python multiprocessing slower than single thread

1 Answers1