Here is a test comparing multiprocessing.Pool, multiprocessing.pool.ThreadPool, and a sequential version. Why is the multiprocessing.pool.ThreadPool version slower than the sequential version?
Is it true that multiprocessing.Pool is faster because it uses processes (i.e. it is not limited by the GIL), while multiprocessing.pool.ThreadPool uses threads (i.e. it is subject to the GIL), despite belonging to a package named multiprocessing?
import time


def test_1(job_list):
    from multiprocessing import Pool
    print('-' * 60)
    print("Pool map")
    start = time.time()
    p = Pool(8)
    s = sum(p.map(sum, job_list))
    print('time:', time.time() - start)


def test_2(job_list):
    print('-' * 60)
    print("Sequential map")
    start = time.time()
    s = sum(map(sum, job_list))
    print('time:', time.time() - start)


def test_3(job_list):
    from multiprocessing.pool import ThreadPool
    print('-' * 60)
    print("ThreadPool map")
    start = time.time()
    p = ThreadPool(8)
    s = sum(p.map(sum, job_list))
    print('time:', time.time() - start)


if __name__ == '__main__':
    job_list = [range(10000000)] * 128
    test_1(job_list)
    test_2(job_list)
    test_3(job_list)
Output:
------------------------------------------------------------
Pool map
time: 3.4112906455993652
------------------------------------------------------------
Sequential map
time: 23.626681804656982
------------------------------------------------------------
ThreadPool map
time: 76.83279991149902
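
For anyone who wants to reproduce this quickly, here is a scaled-down sketch of the same comparison. The helper name `timed_total`, the pool size of 4, and the smaller workload are my own assumptions for illustration and are not part of the test above; it also uses `time.perf_counter()` and closes the pools with `with`, which the original test does not do. Timings will vary by machine, so no expected numbers are shown.

```python
import time
from multiprocessing import Pool
from multiprocessing.pool import ThreadPool


def timed_total(map_fn, job_list):
    """Return (total, elapsed_seconds) for sum(map_fn(sum, job_list)).

    map_fn is any map-like callable: the builtin map, or the .map
    method of a Pool / ThreadPool instance (hypothetical helper).
    """
    start = time.perf_counter()
    total = sum(map_fn(sum, job_list))
    return total, time.perf_counter() - start


if __name__ == '__main__':
    # Assumed smaller workload than the original so it finishes quickly.
    job_list = [range(1_000_000)] * 16

    results = {'Sequential map': timed_total(map, job_list)}

    # Worker threads live in this one process.
    with ThreadPool(4) as thread_pool:
        results['ThreadPool map'] = timed_total(thread_pool.map, job_list)

    # Worker processes each run their own interpreter.
    with Pool(4) as process_pool:
        results['Pool map'] = timed_total(process_pool.map, job_list)

    for label, (total, elapsed) in results.items():
        print(f'{label}: total={total} time={elapsed:.3f}s')
```

All three strategies should report the same total; only the elapsed time differs, which makes it easy to check that the comparison is measuring the same work.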