I am new to the multiprocessing module in Python. The test code below measures how long a function takes to run both sequentially and in a pool of threads:
from multiprocessing.dummy import Pool as ThreadPool
import timeit
def foo(n):
    for i in range(700000):
        n += 1
    return n
inputs = [1, 2, 3, 4]
start = timeit.default_timer()
pool = ThreadPool(4)
results = pool.map(foo, inputs)
pool.close()
pool.join()
print(timeit.default_timer() - start)
print(results)
start = timeit.default_timer()
for i in range(4):
    foo(i+1)
print(timeit.default_timer() - start)
This is the result I keep getting:
0.3945475000073202
[700001, 700002, 700003, 700004]
0.2912912300089374
Process finished with exit code 0
How is it that the sequential algorithm is faster than the threaded one? I know some overhead is expected when creating threads, but I wrote foo so that it takes a good amount of time to compute. Shouldn't multithreading with 4 threads take (roughly) a quarter of the time the sequential algorithm takes?
To me this is really strange, because when I test the code below, which is basically the same except for the foo function, I get a much better speedup:
import time

def foo(n):
    time.sleep(n)
inputs = [1, 1, 1, 1]
start = timeit.default_timer()
pool = ThreadPool(4)
results = pool.map(foo, inputs)
pool.close()
pool.join()
print("Threaded - " + str(timeit.default_timer() - start))
start = timeit.default_timer()
for i in range(4):
    foo(1)
print("sequential - " + str(timeit.default_timer() - start))
With this code, multithreading seems to work fine:
Threaded - 1.2794823400327004
sequential - 4.011253927950747
Process finished with exit code 0
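One way to narrow this down is to run the same CPU-bound foo through both a thread pool and a process pool. The sketch below assumes the slowdown comes from CPython's global interpreter lock (GIL), which lets only one thread execute Python bytecode at a time; that is an assumption to check, not something the timings above establish. The helper timed_map and its make_pool parameter are hypothetical names introduced for illustration only.

```python
import timeit
from multiprocessing import Pool as ProcessPool
from multiprocessing.dummy import Pool as ThreadPool


def foo(n):
    # Same CPU-bound loop as in the first test.
    for _ in range(700000):
        n += 1
    return n


def timed_map(make_pool, func, inputs, workers=4):
    """Hypothetical helper: map func over inputs in a pool built by
    make_pool and return (results, elapsed_seconds)."""
    start = timeit.default_timer()
    with make_pool(workers) as pool:
        # map() blocks until every result is ready, so leaving the
        # with-block afterwards does not cut any work short.
        results = pool.map(func, inputs)
    return results, timeit.default_timer() - start


if __name__ == "__main__":
    # The guard is needed so that worker processes can import this
    # module without re-running the benchmark.
    inputs = [1, 2, 3, 4]
    for label, make_pool in (("threads", ThreadPool), ("processes", ProcessPool)):
        results, elapsed = timed_map(make_pool, foo, inputs)
        print(label, elapsed, results)
```

If the process pool run is clearly faster than the thread pool run on a multi-core machine, that would point at the GIL rather than at thread start-up overhead. time.sleep releases the GIL while waiting, which would be consistent with the second test scaling well across threads.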