
I have a Python program that spawns many threads, runs 4 at a time, and each performs an expensive operation. Pseudocode:

for obj in objects:
    t = Thread(target=process, args=(obj,))  # note: args must be a tuple
    # if fewer than 4 threads are currently running, t.start(). Otherwise, add t to queue

But when the program is run, Activity Monitor in OS X shows that 1 of the 4 logical cores is at 100% and the others are at nearly 0. Obviously I can't force the OS to do anything, but I've never had to pay attention to performance in multi-threaded code like this before, so I was wondering if I'm just missing or misunderstanding something.
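For reference, the "at most 4 at a time" bookkeeping can be handed off to a thread pool; a minimal sketch of the pattern above, where `process` and the item list are just placeholders:

```python
from concurrent.futures import ThreadPoolExecutor

def process(obj):
    # stand-in for the expensive operation
    return obj * 2

objects = [1, 2, 3, 4, 5, 6, 7, 8]

# max_workers=4 caps the number of threads running at once;
# remaining tasks wait in the executor's internal queue
with ThreadPoolExecutor(max_workers=4) as executor:
    results = list(executor.map(process, objects))

print(results)  # [2, 4, 6, 8, 10, 12, 14, 16]
```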

Thanks.

Rob Lourens

3 Answers


Note that in many cases (and virtually all cases where your "expensive operation" is a calculation implemented in Python), multiple threads will not actually run concurrently due to Python's Global Interpreter Lock (GIL).

The GIL is an interpreter-level lock. This lock prevents execution of multiple threads at once in the Python interpreter. Each thread that wants to run must wait for the GIL to be released by the other thread, which means your multi-threaded Python application is essentially single threaded, right? Yes. Not exactly. Sort of.

CPython uses what’s called “operating system” threads under the covers, which is to say each time a request to make a new thread is made, the interpreter actually calls into the operating system’s libraries and kernel to generate a new thread. This is the same as Java, for example. So in memory you really do have multiple threads and normally the operating system controls which thread is scheduled to run. On a multiple processor machine, this means you could have many threads spread across multiple processors, all happily chugging away doing work.

However, while CPython does use operating system threads (in theory allowing multiple threads to execute within the interpreter simultaneously), the interpreter also requires that a thread hold the GIL before it can access the interpreter's internals and modify Python objects in memory. The latter point is why the GIL exists: the GIL prevents simultaneous access to Python objects by multiple threads. But this does not save you (as illustrated by the Bank example) from being a lock-sensitive creature; you don't get a free ride. The GIL is there to protect the interpreter's memory, not your sanity.

See the Global Interpreter Lock section of Jesse Noller's post for more details.

To get around this problem, check out Python's multiprocessing module.

multiple processes (with judicious use of IPC) are [...] a much better approach to writing apps for multi-CPU boxes than threads.

-- Guido van Rossum (creator of Python)

Edit based on a comment from @spinkus:

If Python can't run multiple threads simultaneously, then why have threading at all?

Threads can still be very useful in Python for simultaneous operations that do not need to modify the interpreter's state. This includes many (most?) long-running function calls that are not in-Python calculations, such as I/O (file access or network requests) and calculations on NumPy arrays. These operations release the GIL while waiting for a result, allowing the program to continue executing. Once the result is received, the thread must re-acquire the GIL in order to use that result in "Python-land".
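A small illustration of this, using `time.sleep` as a stand-in for a blocking I/O call (like a real blocking read, it releases the GIL while waiting):

```python
import threading
import time

results = {}

def fetch(name):
    # time.sleep releases the GIL while waiting, just as a blocking
    # network or disk read would, so these calls overlap
    time.sleep(0.2)
    results[name] = "done"

start = time.perf_counter()
threads = [threading.Thread(target=fetch, args=(f"req{i}",)) for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
elapsed = time.perf_counter() - start

# four 0.2s "requests" finish in roughly 0.2s total, not 0.8s
print(len(results), round(elapsed, 1))
```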

Gabriel Grant
    Thank you very much for the detailed answer- `multiprocessing` was it. For anyone else interested, `multiprocessing.Pool` also took care of the problem of limiting the number of active worker threads. – Rob Lourens Dec 22 '10 at 06:11
  • What do I do on Windows then? Multiprocessing sucks on Windows because the child processes don't inherit an object from the memory of the parent process. I want to do a multi-threaded map of a function onto a large list. – John Thompson Apr 02 '13 at 16:54
    Great answer. But I'm still not clear about *multithreading*. Let's say my computer has 4 cores, and I create 4 threads in Python code. As I understand it, **because of the GIL**, these threads will run on *only 1* (physical) core, am I right? And in other languages, these threads can run on different cores? I'm not sure how threads are allocated to physical cores. Are threads strictly created on the same core, or is it dependent on something else (e.g., the operating system, the programming language, ...)? Thank you. – Chau Pham Oct 31 '18 at 18:35
    @Catbuilts Python doesn't dictate which physical cores the threads are created on -- that is controlled by the OS. What the GIL does is limit the work that the threads do at the Python layer: only one thread is allowed to modify the state of the Python interpreter at a time, so any additional threads trying to do so will sit idle until it's their turn to operate. – Gabriel Grant Jan 04 '19 at 15:51
  • If having a GIL completely thwarts any ability to utilize threads concurrently in Python, it makes me wonder why there is a "threading" std library at all. Do people want to do event-driven programming like this? Also, re: the van Rossum quote: A. the link is dead, and B. it's misguided, because a key advantage of threads over multiple processes is that you don't need any IPC mechanism. – spinkus Jan 02 '23 at 05:36
  • @spinkus Good question! Threads can still be very useful in Python when doing simultaneous operations that do not need to modify the interpreter's state. This includes most long-running operations that are not in-Python calculations, such as file access or network requests. I've updated the answer to address this (and fixed the link to point at a wayback machine snapshot) – Gabriel Grant Jan 06 '23 at 21:39

Python has a Global Interpreter Lock, which can prevent threads of interpreted code from being processed concurrently.

http://en.wikipedia.org/wiki/Global_Interpreter_Lock

http://wiki.python.org/moin/GlobalInterpreterLock

For ways to get around this, try the multiprocessing module, as advised here:

Does running separate python processes avoid the GIL?

T.R.
    Multiple processes do not suffer from the GIL, because every process has its own GIL and also its own memory. – Sven Apr 28 '17 at 16:17
    @Sven: Thanks for your info. I'm wondering: in other programming languages that don't use a GIL, can threads run on multiple cores? For example, in a program with 4 threads on a 4-core computer, do these threads execute on all four cores? Is the GIL the reason that all threads in Python are spawned on just 1 core? – Chau Pham Oct 31 '18 at 18:49
  • @Catbuilts That's right; most other languages allow threads to run concurrently on multiple cores of the same computer. I can somewhat understand the reasons why Python doesn't allow it "by design", but I don't approve of them; there are quite a few compute-intensive use cases where "real" threads would beat the multi-process counterpart in both time and RAM usage. – Fravadona Mar 25 '23 at 20:57

AFAIK, in CPython the Global Interpreter Lock means that no more than one thread can be executing Python code at any one time. Although this does not really affect anything on a single-processor/single-core machine, on a multicore machine it means you effectively have only one thread running at any one time, leaving all the other cores idle.

MAK