
I am trying to run 5 parallel processes and have divided the list into 5 blocks, but it takes more time than processing linearly in a single process. I have tried other solutions on Stack Overflow, but the results are the same. If I run the code, this is what I get:

[screenshot: result]

import multiprocessing as mp
import time


def raiseNum(pwr, nums):
    for v in nums:
        x = v**pwr 


if __name__ == '__main__':

    # Let's get the cube of all numbers in the list nums
    nums = [i for i in range(0, 100000000, 1)]
    pwr = 3

    ## linear processing
    start = time.time()

    results_l = raiseNum(pwr, nums)

    end = time.time()

    print('Linear Processing time: ', str(end-start), 'Seconds')
    


    ## Parallel processing: 5 Processes
    
    # divide nums list in 5 parts
    blockSize = len(nums)//5
    
    numsBlocks = [nums[i: i+blockSize] for i in range(0, len(nums), blockSize)]

    processList = []

    p0 = mp.Process(target=raiseNum, args=(pwr, numsBlocks[0]))
    p1 = mp.Process(target=raiseNum, args=(pwr, numsBlocks[1]))
    p2 = mp.Process(target=raiseNum, args=(pwr, numsBlocks[2]))
    p3 = mp.Process(target=raiseNum, args=(pwr, numsBlocks[3]))
    p4 = mp.Process(target=raiseNum, args=(pwr, numsBlocks[4]))

    start = time.time()

    p0.start()
    p1.start()
    p2.start()
    p3.start()
    p4.start()

    p0.join()
    p1.join()
    p2.join()
    p3.join()
    p4.join()

    end = time.time()

    print('5 Parallel Processes time: ', str(end-start), 'Seconds')
  • Does this answer your question? [multiprocessing.Pool() slower than just using ordinary functions](https://stackoverflow.com/questions/20727375/multiprocessing-pool-slower-than-just-using-ordinary-functions) – Cow Mar 16 '22 at 13:16
  • I have also created multiple processes separately, but it's not working – AashishKSahu Mar 16 '22 at 13:28
  • For me the multiprocessing one is faster, on an older 4-core i5, though only by 25% (40s vs 30s). If I put logging into `raiseNum()` too, I can see how far apart they start, and also that their individual runtime is 15s. The delay between launches shows that a considerable time is spent on communicating the numbers themselves. Side note: you need some amount of free memory too; when running the first test, Python uses 4GB, and then the 5 sub-processes use another 4GB combined. The net 37% (15s) runtime for 20%-sized tasks can be partially blamed on cache misses, I think. – tevemadar Mar 16 '22 at 13:42
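One way to sanity-check the communication cost mentioned in the comment above is to send each process only the bounds of its block and let the worker rebuild its numbers locally. The following is just a sketch of that idea, not code from the question; the worker name raiseRange and the per-block timing print are additions for illustration.

import multiprocessing as mp
import time


def raiseRange(pwr, start, stop):
    # Hypothetical worker: rebuilds its numbers locally, so the parent never
    # has to pass a 20-million-element list to the child process.
    t0 = time.time()
    for v in range(start, stop):
        x = v ** pwr
    print('block', start, '-', stop, 'took', time.time() - t0, 'sec')


if __name__ == '__main__':
    n, pwr, nproc = 100000000, 3, 5
    blockSize = n // nproc
    procs = [mp.Process(target=raiseRange, args=(pwr, i * blockSize, (i + 1) * blockSize))
             for i in range(nproc)]

    start = time.time()
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    print('5 processes, bounds only: ', time.time() - start, 'sec')

If the parallel run gets noticeably faster with this change, most of the original overhead was in handing the blocks to the workers rather than in the cubing itself.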

1 Answer


The optimization depends on both your hardware and the computation itself. When I ran your code on my machine, I got the following results:

Linear Processing time:  27.75643253326416 sec
5 Parallel Processes total time:  7.949779510498047 sec
Computation time:  1.7256593704223633 sec
Join time:  6.224120140075684 sec 

Two points to note:

  • The parallel execution is almost 3.5 times faster. (I tested your code on a machine with 24 cores)
  • When we split the parallel timing into two parts, we can see that the join takes considerably longer than launching the processes (as the computation itself is not that intensive).

Here is the code, where I added a couple of extra print statements towards the end:

import multiprocessing as mp
import time


def raiseNum(power, numbers):
    for v in numbers:
        x = v ** power


if __name__ == '__main__':
    # Let's get the cube of all numbers in the list nums
    nums = [i for i in range(0, 100000000, 1)]
    pwr = 3

    # linear processing
    start = time.time()
    raiseNum(pwr, nums)
    end = time.time()
    print('Linear Processing time: ', str(end - start), 'sec')

    # Parallel processing: 5 Processes
    # divide nums list in 5 parts
    blockSize = len(nums) // 5
    numsBlocks = [nums[i: i + blockSize] for i in range(0, len(nums), blockSize)]
    processList = []
    p0 = mp.Process(target=raiseNum, args=(pwr, numsBlocks[0]))
    p1 = mp.Process(target=raiseNum, args=(pwr, numsBlocks[1]))
    p2 = mp.Process(target=raiseNum, args=(pwr, numsBlocks[2]))
    p3 = mp.Process(target=raiseNum, args=(pwr, numsBlocks[3]))
    p4 = mp.Process(target=raiseNum, args=(pwr, numsBlocks[4]))

    start1 = time.time()

    p0.start()
    p1.start()
    p2.start()
    p3.start()
    p4.start()

    start2 = time.time()

    p0.join()
    p1.join()
    p2.join()
    p3.join()
    p4.join()

    end = time.time()

    print('5 Parallel Processes total time: ', str(end - start1), 'sec')
    print('Computation time: ', str(start2 - start1), 'sec')
    print('Join time: ', str(end - start2), 'sec')

Hope this helps with better optimization.
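If the split between launch and join time shown above matters for your use case, the same five-way split can also be expressed with a multiprocessing.Pool, which handles the start/join bookkeeping for you. This is a sketch of that variant rather than part of the original answer; pool.starmap and the pool size of 5 are the only choices added here.

import multiprocessing as mp
import time


def raiseNum(power, numbers):
    for v in numbers:
        x = v ** power


if __name__ == '__main__':
    # Same data and five-way split as in the answer above
    nums = [i for i in range(0, 100000000, 1)]
    pwr = 3
    blockSize = len(nums) // 5
    numsBlocks = [nums[i: i + blockSize] for i in range(0, len(nums), blockSize)]

    start = time.time()
    with mp.Pool(processes=5) as pool:
        # starmap unpacks each (pwr, block) tuple into one raiseNum call per worker
        pool.starmap(raiseNum, [(pwr, block) for block in numsBlocks])
    end = time.time()
    print('Pool of 5 processes: ', str(end - start), 'sec')

Note that the blocks are still pickled and sent to the workers, so the communication cost discussed in the comments is not avoided; the Pool mainly reduces boilerplate and reuses worker processes.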

  • Thanks! This explains it. My CPU is old and probably puts the processes in wait because it's dual-core; also, I am getting a good execution time if I don't consider the join time – AashishKSahu Mar 16 '22 at 13:37