I am doing heavy image processing in Python 3 on a large batch of images using numpy and opencv. I know Python has a GIL, which prevents two threads from executing Python code at the same time. A quick Google search told me not to use threads in Python for CPU-intensive tasks, and to use them only for I/O such as saving files to disk or database communication. I also read that the GIL is released when working with C extensions. Since both numpy and opencv are C and C++ extensions, I get the feeling that the GIL might be released. I am not sure about it, because image processing is a CPU-intensive task. Is my intuition correct, or am I better off using multiprocessing?
- This [question](https://stackoverflow.com/questions/32775555/how-to-use-python-and-opencv-with-multiprocessing) has some information on opencv and multiprocessing for Python. – Eolmar Apr 25 '18 at 05:10
- That question doesn't seem to resolve my doubt. – Abhijit Balaji Apr 25 '18 at 06:47
- Nobody can answer without more details on what you are doing. People use multiprocessing and it works. Threads may be faster, but all of the Python code will still run single-threaded. If the execution spends most of its time in a few opencv function calls, then threads could be fine. – Eolmar Apr 25 '18 at 07:35
- At least for OpenCV it would appear to be the case, e.g. `imread`s will happily run in parallel on several threads; similarly, offloading `VideoWriter` to another thread increases overall throughput. Looking at the code, they explicitly reacquire the GIL in the few cases where Python objects are manipulated (e.g. highgui callbacks). – Dan Mašek Apr 25 '18 at 11:26
1 Answer
To answer it upfront, it depends on the functions you use.
The most reliable way to find out whether a function releases the GIL is to check the corresponding source. Checking the documentation also helps, but often this is simply not documented. And yes, it is cumbersome.
From the SciPy Cookbook (http://scipy-cookbook.readthedocs.io/items/Multithreading.html):

> [...] numpy code often releases the GIL while it is calculating, so that simple parallelism can speed up the code.
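As a minimal sketch of the "simple parallelism" the quote refers to, a batch of arrays can be handed to a thread pool. The `process` function and the random `images` below are hypothetical stand-ins for real per-image work, and whether the threads actually overlap depends on which numpy routines are called and how large the arrays are.

```python
from concurrent.futures import ThreadPoolExecutor

import numpy as np


def process(image):
    # Hypothetical per-image work: a crude diagonal average followed by
    # an elementwise square root, reduced to a single number.
    averaged = (image[:-2, :-2] + image[1:-1, 1:-1] + image[2:, 2:]) / 3.0
    return float(np.sqrt(averaged).sum())


# Stand-in for a batch of grayscale images.
rng = np.random.default_rng(0)
images = [rng.random((256, 256)) for _ in range(8)]

# Threads share memory, so the arrays are not copied or pickled.
with ThreadPoolExecutor(max_workers=4) as pool:
    threaded = list(pool.map(process, images))

# The plain loop gives the same results; only the wall time may differ.
sequential = [process(image) for image in images]
```

Compared with multiprocessing, this avoids pickling the images between processes, but any time spent in pure Python code between the numpy calls is still serialized by the GIL.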
Each project might use its own macros, so if you are familiar with the default macros like `Py_BEGIN_ALLOW_THREADS` from the CPython C API, you might find them redefined. In NumPy, for instance, it would be `NPY_BEGIN_THREADS_DEF`, etc.
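When reading the source is impractical, a rough empirical check can complement it: time the same calls sequentially and in a thread pool. This is only a sketch. The helper names `time_sequential` and `time_threaded`, the `candidate` function, and the array sizes are all assumptions chosen for illustration, and the measured ratio varies with the machine, the core count, and the workload size.

```python
import time
from concurrent.futures import ThreadPoolExecutor

import numpy as np


def time_sequential(func, inputs):
    # Wall time for calling func on every input, one after another.
    start = time.perf_counter()
    for item in inputs:
        func(item)
    return time.perf_counter() - start


def time_threaded(func, inputs, workers=4):
    # Wall time for the same calls spread over a thread pool.
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        list(pool.map(func, inputs))
    return time.perf_counter() - start


def candidate(array):
    # Hypothetical function under test; swap in the numpy or opencv
    # call you actually care about.
    return np.sort(array, axis=None)


rng = np.random.default_rng(0)
inputs = [rng.random(200_000) for _ in range(4)]

sequential_time = time_sequential(candidate, inputs)
threaded_time = time_threaded(candidate, inputs)
print(f"sequential: {sequential_time:.4f}s")
print(f"threaded:   {threaded_time:.4f}s")
print(f"ratio:      {sequential_time / threaded_time:.2f}")
```

A ratio close to 1 suggests the function holds the GIL (or that the work is too small to measure), while a ratio clearly above 1 on a multi-core machine suggests it is released. Libraries that already use several threads internally can blur the result, so treat it as a hint rather than proof.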

HelloWorld