
I have already asked a question here about multi-threading inside multi-processing; the results were hard to interpret, and it led to one more popular question: Multi-threading V/s Multi-Processing.

I have gone through various posts on this topic, but none of them clearly answers which one to select over the other, nor do they give a method to check which one best suits the need. From most of the posts, I gather that multi-threading suits I/O-bound work and multi-processing suits CPU-bound work, but when I tried both on a CPU-bound task, the results did not support the hypothesis that one can blindly pick multi-threading for I/O and multi-processing for CPU-bound work.

In my case the task is CPU bound, yet the results favor multi-threading. I have observed that sometimes, even for a CPU-bound task, multi-threading beats multi-processing. I am looking for a methodology that helps me pick one of the two.

Below is my analysis, where I ran multi-process and multi-threaded code on my 8th Gen Intel i7 (8 cores, 16 GB RAM) using Python 3.7.2 (also tested on Python 3.8.2).

Defining required functions and variables
import numpy as np
import time
import concurrent.futures

# 100 rows of 1,000,000 elements each; executor.map(f, a) passes one row per task
a = np.arange(100000000).reshape(100, 1000000)

def list_double_value(x):
  y = []
  for elem in x:
    y.append(2 * elem)  # element by element, in pure Python
  return y

def double_value(x):
  return 2 * x  # a single vectorized NumPy multiply over the whole row

Case 1 (using the function that takes a list as input and multiplies every element of it by 2)

Multi-processing using the list_double_value function (took 145 seconds)
t = time.time()

with concurrent.futures.ProcessPoolExecutor() as executor:
  my_results = executor.map(list_double_value, a) # each task doubles one row of a, element by element
print(time.time()-t)
Multi-threading using the list_double_value function (took 28 seconds)
t = time.time()

with concurrent.futures.ThreadPoolExecutor() as executor:
  my_results = executor.map(list_double_value, a)
print(time.time()-t)

Case 2 (using the function that takes a value and multiplies it by 2)

Multi-processing using the double_value function (took 2.73 seconds)
t = time.time()

with concurrent.futures.ProcessPoolExecutor() as executor:
  my_results = executor.map(double_value, a)
print(time.time()-t)
Multi-threading using the double_value function (took 0.2660 seconds)
t = time.time()

with concurrent.futures.ThreadPoolExecutor() as executor:
  my_results = executor.map(double_value, a)
print(time.time()-t)

Going by the above analysis, is it the case that every time, before writing multi-threaded or multi-processing code, we need to check which performs faster and opt for that, or is there a concrete set of rules for selecting one over the other?

Also, let me know whether these results are an artifact of the concurrent.futures library I used (I am not sure about that either).

learner

2 Answers


The performance and scalability of Python are greatly limited by a mechanism deep inside the Python engine called the global interpreter lock, or GIL. This is a complex topic, but stated briefly, the GIL prevents a single Python process from taking full advantage of multiple CPUs. So when you use multi-threading (multiple threads in one process), you’ll see no performance boost from having 2, 4 or 8 CPUs/cores.

Multi-processing is different. In multi-processing, multiple separate Python processes are used (with one thread per process) and each process has its own separate GIL. Each process can run on its own CPU, so your program can effectively use much more of the system’s resources.

Threads are preferable if your tasks are extremely lightweight, since there is some overhead to each process in the multi-process approach: each process is a separate Python interpreter. Threads are also required for certain kinds of inter-task communication. If you don’t have those needs, you will usually be better off with the multi-processing approach.

Doing multi-threading within multi-processing would be an advanced, somewhat odd approach. I think you’re better off not mixing the modes.
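
To see the GIL effect directly, here is a minimal sketch (my own illustration, not code from the question) that times a pure-Python CPU-bound function under both pool types. Because busy never releases the GIL, the thread pool should show no speedup, while the process pool can use all cores:

import time
import concurrent.futures

def busy(n):
  # pure-Python arithmetic: the GIL is held for the entire call
  total = 0
  for i in range(n):
    total += i * i
  return total

if __name__ == '__main__':
  work = [5_000_000] * 8  # one chunk of work per core
  for pool_cls in (concurrent.futures.ThreadPoolExecutor,
                   concurrent.futures.ProcessPoolExecutor):
    t = time.time()
    with pool_cls() as executor:
      list(executor.map(busy, work))  # drain the iterator to collect all results
    print(pool_cls.__name__, time.time() - t)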

Chris Johnson
  • Thank you, but again the question is: if you see https://stackoverflow.com/questions/62469183/multithreading-inside-multiprocessing-in-python?noredirect=1#comment110486100_62469183 I am trying to use multi-threading inside multi-processing. Theoretically there should be a boost in speed (keeping aside the data movement between cores when the data is very large)? Your views on this? – learner Jun 19 '20 at 16:57
  • The same basic principle applies: a single instance of a Python process will not scale effectively across multiple CPUs or cores when using (OS) threads, because of the GIL. Multi-processing doesn't have this problem; neither does a "green" threads approach. Trying to do multi-threading inside a multi-processing design doesn't change the fact that only one (OS) thread per process will be effective. – Chris Johnson Sep 16 '22 at 17:20
  • Specifically the GIL means that only one thread within a process can be executing Python bytecodes at a time. Multiple Python threads can still be simultaneously executing C subroutines (e.g. array-math routines inside Numpy). – Jeremy Friesner Feb 10 '23 at 19:13
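
To illustrate the last comment: NumPy ufuncs release the GIL while their C inner loops run, so a thread pool can get real parallelism out of vectorized work, which is consistent with the question's Case 2 results. A minimal sketch (my own illustration, with arbitrary sizes):

import time
import numpy as np
import concurrent.futures

rows = [np.arange(5_000_000, dtype=np.float64) for _ in range(8)]

def numpy_double(x):
  # the multiply runs in NumPy's C loop, which releases the GIL while it works
  return x * 2.0

t = time.time()
with concurrent.futures.ThreadPoolExecutor() as executor:
  results = list(executor.map(numpy_double, rows))
print('threads:', time.time() - t)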

I would suggest using Dask, which lets you set up your computation in a way that is optimized for parallelism. Dask supports multi-threading and multi-processing (and multiple machines), so you can write the code once and try both ways.

https://docs.dask.org/en/latest/
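
For example, a minimal sketch of the idea (the chunk size here is arbitrary; 'threads' and 'processes' are Dask's standard single-machine schedulers):

import numpy as np
import dask.array as da

a = da.from_array(np.arange(100_000_000).reshape(100, 1_000_000),
                  chunks=(10, 1_000_000))
doubled = 2 * a  # builds a lazy task graph; nothing runs yet

# the same graph can be executed by either backend, so you can benchmark both:
result_threads = doubled.compute(scheduler='threads')
result_processes = doubled.compute(scheduler='processes')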

Itamar Turner-Trauring
  • Thanks for the suggestion. Does this work for a single array (or list in Python) instead of putting things in a dataframe? Apart from that, I am also interested in knowing what is happening behind the scenes. Can you please explain how Dask manages this and decides which one to pick? – learner Jun 19 '20 at 16:46