I define a cpu-bound function
def countdown(n):
while n > 0:
n -= 1
running countdown(50000000)
takes 2.16 seconds on my laptop.
First, I test multiprocess
parallelization.
from multiprocess import Pool
with Pool(2) as p:
l=p.map(countdown,[50000000,50000000])
takes 2.46 seconds, which is a good parallelization.
Then, I test dask
processes scheduler parallelization
l=[dask.delayed(countdown)(50000000),dask.delayed(countdown)(50000000)]
dask.compute(l,scheduler='processes',num_workers=2)
however, it takes 4.53 seconds. This is the same speed as
dask.compute(l,scheduler='threads',num_workers=2)
What is wrong with dask processes scheduler? I expected it should be on a par with multiprocess