
I have the following code and I want to spread the task across multiple processes. After experimenting, I realized that increasing the number of CPU cores negatively impacts the execution time.

I have 8 cores on my machine:

  • Case 1 (without multiprocessing): execution time 106 minutes
  • Case 2 (multiprocessing, ncores = 4): execution time 37 minutes
  • Case 3 (multiprocessing, ncores = 7): execution time 40 minutes

Here is the code:

import time
import functools  # needed for functools.partial below
import multiprocessing as mp


def _fun(i, args1=10):
    # Sort matrix W
    # For loop 1 on matrix M
    # For loop 2 on matrix Y
    # (actual computation elided)
    return value

def run1(ncores=mp.cpu_count()):
    ncores = ncores - 4  # subtract 4 or 1 to get ncores = 4 or 7 on an 8-core machine
    _f = functools.partial(_fun, args1=x)  # x is defined elsewhere
    with mp.Pool(ncores) as pool:
        result = pool.map(_f, range(n))  # n is defined elsewhere
    return result  # pool.map already returns a list


start = time.time()
list1 = run1()
end = time.time()
print('time {0} minutes'.format((end - start) / 60))

My question: what is the best practice for using multiprocessing? My understanding was that the more CPU cores we use, the faster the code should run.
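For reference, here is a minimal, self-contained sketch of the same timing experiment with a dummy CPU-bound workload, so the effect can be reproduced. The busy-loop _work function and the task count n = 21000 are placeholders, not the real matrix code:

import time
import multiprocessing as mp


def _work(i):
    # Dummy CPU-bound placeholder for the real matrix computation.
    total = 0
    for k in range(200000):
        total += (i + k) % 7
    return total


def timed_run(ncores, n=21000):
    start = time.time()
    with mp.Pool(ncores) as pool:
        pool.map(_work, range(n))
    return (time.time() - start) / 60


if __name__ == '__main__':
    # The __main__ guard matters: with the 'spawn' start method
    # (Windows, and macOS since Python 3.8) each worker re-imports
    # this module, and an unguarded Pool would spawn endlessly.
    for ncores in (1, 4, 7):
        print('ncores={0}: {1:.2f} minutes'.format(ncores, timed_run(ncores)))

Each extra worker adds fixed costs (process startup, pickling arguments and results through inter-process communication), so once those costs dominate the per-task work, adding cores stops helping and can even slow things down, which would match the 4-core vs. 7-core numbers above.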

H.H
  • Multiprocessing always creates additional overhead. It is not always effective, and it really depends on how you do it. For that reason, https://stackoverflow.com/questions/24376462/why-multiprocessing-is-slow and https://stackoverflow.com/questions/20727375/multiprocessing-pool-slower-than-just-using-ordinary-functions are related. – Confused Learner May 18 '21 at 16:10
  • @ConfusedLearner, does pool.map(_f, range(n)) create a new process for each i in range(n)? Or does it initially create ncores processes and then pass each value i in range(n) to one of them? – H.H May 18 '21 at 21:02
  • `Pool.map` takes a list of tasks and splits it into a number of batches equal to the number of cores. However, the splitting can take a very long time if the list is very big, because it tries to find the optimal split. You could try to split your list of tasks manually (see the chunksize sketch after this comment thread). Additionally, `range` is lazy, so it has to be run through to the end before the tasks can actually be split. – RaJa May 19 '21 at 05:18
  • @RaJa So it will only initiate a number of tasks equal to the number of cores once, and pass a batch of the data (from range(n)) to each task? Is that right? Or does it initiate one task per batch, destroy the task when it has finished its batch, and create a new task for the next batch? – H.H May 19 '21 at 07:16
  • Assume you have 4 cores: `Pool` will create 4 threads/processes and split your list into 4 batches. Each process then gets a batch, does the work, and closes itself. – RaJa May 19 '21 at 10:06
  • @RaJa Thanks. So if my data has length 21k and I use 4 cores, why does it take less time than with 8 cores? – H.H May 19 '21 at 10:09
  • Most likely it should not. But if your 21k objects are complex (not just numbers), then splitting them into 8 batches might take longer than splitting into 4 batches. Do you have 8 physical cores, or is this a 4-core CPU with hyper-threading? – RaJa May 19 '21 at 12:46
  • @RaJa print(multiprocessing.cpu_count()) prints 8. – H.H May 19 '21 at 12:48
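Regarding the batching RaJa describes: `Pool.map` already chops the iterable into chunks, and the batch size can be controlled directly through the chunksize argument that Pool.map accepts. Here is a minimal sketch of that manual splitting; the task count and chunk size are illustrative only, not taken from the question:

import multiprocessing as mp


def _work(i):
    # Placeholder for the real per-item computation.
    return i * i


if __name__ == '__main__':
    n = 21000
    with mp.Pool(4) as pool:
        # Default: Pool.map picks a chunksize of roughly
        # len(iterable) / (4 * number_of_workers).
        default_split = pool.map(_work, range(n))

        # Explicit chunksize: each worker receives batches of 1000
        # tasks, which reduces inter-process messaging for cheap
        # tasks at the cost of coarser load balancing.
        manual_split = pool.map(_work, range(n), chunksize=1000)

    assert default_split == manual_split

Larger chunks mean fewer pickling round-trips between the parent and the workers, which helps when individual tasks are cheap; smaller chunks balance load better when task durations vary.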

0 Answers