
I've found that the threading module takes more time than multiprocessing for the same task.

import time
import threading
import multiprocessing

def func():
    # CPU-bound work: sum the integers below 10**8
    result = 0
    for i in range(10**8):
        result += i

num_jobs = 10


# 1. measure how long 10 sequential func() executions take
t = time.time()
for _ in range(num_jobs):
    func()
print("10 sequential func executions take:\n{:.2f} seconds\n".format(time.time() - t))

# 2. threading module, 10 jobs
jobs = []
for _ in range(num_jobs):
    jobs.append(threading.Thread(target=func))
t = time.time()
for job in jobs:
    job.start()
for job in jobs:
    job.join()
print("10 func executions in parallel (threading module) take:\n{:.2f} seconds\n".format(time.time() - t))

# 3. multiprocessing module, 10 jobs
jobs = []
for _ in range(num_jobs):
    jobs.append(multiprocessing.Process(target=func))
t = time.time()
for job in jobs:
    job.start()
for job in jobs:
    job.join()
print("10 func executions in parallel (multiprocessing module) take:\n{:.2f} seconds\n".format(time.time() - t))

Results:

10 sequential func executions take: 25.66 seconds

10 func executions in parallel (threading module) take: 46.00 seconds

10 func executions in parallel (multiprocessing module) take: 7.92 seconds

1) Why does the implementation with the multiprocessing module work better than the one with the threading module?

2) Why do sequential func executions take less time than using the threading module?

cs95
Denis Kuzin
  • @AnthonySottile, thanks, but my question is a bit broader than the question in your link – Denis Kuzin Jul 02 '17 at 14:36
  • It really isn't -- that question (and its several answers) goes into detail: threading is generally going to be slower for CPU-bound operations because of the global interpreter lock, and both multiprocessing and threading have some startup cost. The reason threading is so much slower in this case is the GIL and context switching. – anthony sottile Jul 02 '17 at 14:37

1 Answer


Both of your questions can be answered by this excerpt from the docs:

The mechanism used by the CPython interpreter to assure that only one thread executes Python bytecode at a time. This simplifies the CPython implementation by making the object model (including critical built-in types such as dict) implicitly safe against concurrent access. Locking the entire interpreter makes it easier for the interpreter to be multi-threaded, **at the expense of much of the parallelism afforded by multi-processor machines**.

Bold emphasis mine. You end up spending a lot of time switching between threads to make sure they all run to completion. It's like cutting a rope into small pieces and then tying the pieces back together, one piece at a time.
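
A quick way to see this (a sketch of my own, not from the original post): threads give no speedup for CPU-bound Python code because of the GIL, but they do overlap blocking I/O such as time.sleep, which releases the GIL while it waits.

```python
import threading
import time

def cpu_bound():
    # pure-Python arithmetic: the thread holds the GIL while it runs
    total = 0
    for i in range(10**6):
        total += i

def io_bound():
    # time.sleep releases the GIL, so the four sleeps can overlap
    time.sleep(0.2)

def run_threads(target, n=4):
    """Start n threads running target and return the elapsed wall time."""
    threads = [threading.Thread(target=target) for _ in range(n)]
    start = time.time()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.time() - start

# cpu_bound threads take roughly as long as running the loops one
# after another; io_bound threads finish in roughly one sleep (~0.2 s)
print("4 cpu_bound threads: {:.2f}s".format(run_threads(cpu_bound)))
print("4 io_bound threads:  {:.2f}s".format(run_threads(io_bound)))
```

So whether threading helps depends entirely on whether your workload holds or releases the GIL.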

Your results with the multiprocessing module are as expected, since each process executes in its own address space, with its own interpreter and its own GIL. These processes are independent of one another, beyond being in the same process group and sharing the same parent process.

cs95