
Using the map function from the multiprocessing library, I see no decrease in execution time when using more than 2 processes. I'm running the program on a machine with 4 cores.

The actual code is pretty straightforward and calculates the first 4000 Fibonacci numbers 4 times (= the number of cores). It distributes the work evenly across N processes (e.g. with a Pool of 2 processes, each process calculates the first 4000 Fibonacci numbers twice). This whole procedure is repeated for N = 1 up to the number of cores.

The output, with each row showing the number of processes and the corresponding execution time in seconds, is:

  1. 3.147
  2. 1.72
  3. 1.896
  4. 1.899

Does anyone know why the execution time doesn't decrease with more than 2 processes? The actual code is:

import multiprocessing
from time import time


def F(_):
    # Compute the first 4000 Fibonacci numbers. The argument is ignored;
    # it only exists so the function can be used with Pool.map.
    for n in range(4000):
        a, b = 0, 1
        for i in range(n):
            a, b = b, a + b


def pool_fib():
    n_cores = multiprocessing.cpu_count()
    args = list(range(n_cores))  # one task per (logical) core
    for i in range(1, n_cores + 1):
        # Time the same workload with pools of 1..n_cores processes.
        with multiprocessing.Pool(i) as p:
            start = time()
            p.map(F, args)
            print(i, time() - start)


if __name__ == '__main__':
    pool_fib()
  • Off-topic, but note that for small numbers (like `4 * 10**3`) it's probably clearer to write the number (`4000`), and for large numbers (`4 * 10**10` maybe) the "e" notation is a literal in Python (`4e10`), though it does produce a float (`for n in range(int(4e10)): ...`) – Adam Smith Jun 07 '18 at 20:08
  • Noted, I agree. – GroovyP Jun 07 '18 at 20:57

1 Answer


Provided you are using a fairly modern CPU, multiprocessing.cpu_count() won't give you the number of physical cores your machine has, but the number of hyper-threads. In a nutshell, hyper-threading lets a single physical core expose n (most commonly, two) pipelines, which fools your OS into thinking you have n times as many cores as you really do. This helps when a core might otherwise be starved of data (most notably during IO, or RAM lookups caused by cache misses), but your workload is purely arithmetic and unlikely to starve the CPU, so you get little to no gain from hyper-threading. And whatever small gains you might get are overshadowed by multiprocessing overhead, which is quite significant.
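You can check this on your own machine. The stdlib only reports logical CPUs; to count physical cores you'd need something like the third-party psutil package (an assumption here, not part of the question's code), sketched below:

```python
import os

# os.cpu_count() reports *logical* CPUs (hyper-threads) -- the same
# number multiprocessing.cpu_count() returns.
logical = os.cpu_count()
print("logical CPUs:", logical)

# Physical cores are not exposed by the stdlib; psutil can report them
# if it is installed (guarded, since it's a third-party dependency).
try:
    import psutil
    print("physical cores:", psutil.cpu_count(logical=False))
except ImportError:
    pass
```

On a 2-core machine with hyper-threading you'd expect the logical count to be twice the physical one, which matches the speed-up flattening out after 2 processes.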

P.S.

I usually post this sort of thing as a comment, but I've exceeded the comment size limitation. By the way, if you've chosen the Fibonacci series for something more than just an example, you might want to consider a faster algorithm: Fast Fibonacci computation
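For reference, a minimal sketch of the fast-doubling method (one of the algorithms described at that link), which computes F(n) in O(log n) steps instead of the O(n) loop in the question:

```python
def fib_fast(n):
    """Fast-doubling Fibonacci: returns the pair (F(n), F(n+1))."""
    if n == 0:
        return (0, 1)
    a, b = fib_fast(n // 2)          # a = F(k), b = F(k+1), k = n // 2
    c = a * (2 * b - a)              # F(2k)   = F(k) * (2*F(k+1) - F(k))
    d = a * a + b * b                # F(2k+1) = F(k)^2 + F(k+1)^2
    if n % 2 == 0:
        return (c, d)
    return (d, c + d)

print(fib_fast(10)[0])  # F(10) = 55
```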

Eli Korvigo
  • 10,265
  • 6
  • 47
  • 73
    This is almost certainly the right answer. Concurrency carries its own overhead, and spinning off too many concurrent processes will harm execution speed, not help it. – Adam Smith Jun 07 '18 at 20:04
  • Thanks for the answer. This is indeed correct! Also, thanks for the link but this was indeed just an illustration. – GroovyP Jun 07 '18 at 20:56