30

To the best of my knowledge, multiple threads can be spawned within the system concurrently, but two different threads cannot access or modify the same resource at the same time. I have tried many things, like creating many threads and putting them in a queue, etc. But I always hear people say that multithreading is not available in Python, and that you should use multiprocessing instead to take advantage of multicore CPUs.

Is this true? Are Python threads only green threads, not real multithreading? Am I right about the resource locking in Python?

dandan78
binu.py
    Depends on which implementation of Python you use. – John Gordon Jun 28 '17 at 04:08
  • @JohnGordon can you be specific please? I am just asking is multithreading feature present in python like it is there in java ? – binu.py Jun 28 '17 at 04:09
  • "Python" and "Java" *are languages*. They are abstract things. You need to ask about concrete Python *implementations*. So, are you talking about CPython, PyPy, Jython, IronPython? – juanpa.arrivillaga Jun 28 '17 at 04:26
  • This comment might be helpful: https://realpython.com/python-gil/#:~:text=But%20it%20effectively%20makes%20any%20CPU%2Dbound%20Python%20program%20single%2Dthreaded. – Vishrant Jan 10 '23 at 18:06

3 Answers

46

Multithreading in Python is sort of a myth.

There's technically nothing forbidding multiple threads from trying to access the same resource at the same time. The result is usually not desirable, so things like locks, mutexes, and resource managers were developed. They're all different ways to ensure that only one thread can access a given resource at a time. In essence, they make threads play nice together. However, if a lot of the threads' time is spent waiting for resources, you're not getting any benefits from multithreading, and you'd be better off writing a single-threaded program instead (or restructuring your program to avoid the waiting).
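As a sketch of the locking idea, here is how a `threading.Lock` keeps several threads from corrupting a shared counter (the function and counts are illustrative):

```python
import threading

counter = 0
lock = threading.Lock()

def increment(n):
    """Add to the shared counter, holding the lock for each update."""
    global counter
    for _ in range(n):
        with lock:  # only one thread may execute this block at a time
            counter += 1

threads = [threading.Thread(target=increment, args=(100_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter)  # 400000 -- without the lock, updates could be lost
```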

That being said, in CPython (the most prevalent Python implementation - the one you get from clicking the download button on https://python.org or via a package manager), there's this evil necessity called the Global Interpreter Lock (GIL). In order to make the dynamic memory management in CPython work correctly, the GIL prevents multiple threads from running Python code at the same time. This is because CPython's dynamic memory management is not thread-safe - it can have those same problems of multiple threads accessing (or worse, disposing) the same resource at the same time. The GIL was a compromise between the two extremes of not allowing multi-threaded code, and having the dynamic memory management be very bulky and slow.
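You can even see one knob of this machinery from Python itself: `sys.getswitchinterval()` reports how often (in seconds) the interpreter offers to hand the GIL to another thread:

```python
import sys

# How long a thread may hold the GIL before CPython asks it to yield
# (the default is 0.005 s; it is tunable via sys.setswitchinterval()).
interval = sys.getswitchinterval()
print(interval)
```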

Other implementations (like Jython and IronPython, but not PyPy) don't have a GIL, because the platforms they are built on (Java for Jython, .NET for IronPython) handle dynamic memory management differently, and so can safely run the Python code in multiple threads at the same time.

If you're using CPython, it's highly recommended to use the multiprocessing module instead. Rather than running multiple threads, it runs multiple processes (each with their own GIL, so they can all run at the same time). It's much more effective than multithreading. The alternative is to write your multithreaded code in C/C++ as an extension, because native code is not subject to the GIL. However, that's usually a lot more work, and the payoff is usually not worth the effort.
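A minimal sketch of the multiprocessing route (the worker function and inputs here are made up for illustration):

```python
import multiprocessing

def cpu_heavy(n):
    """A stand-in for a CPU-bound task: sum of squares up to n."""
    return sum(i * i for i in range(n))

if __name__ == "__main__":
    # Each worker is a separate process with its own interpreter and GIL,
    # so the tasks can genuinely run in parallel on multiple cores.
    with multiprocessing.Pool(processes=4) as pool:
        results = pool.map(cpu_heavy, [10_000] * 4)
    print(results)
```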


Regarding green threads: they don't implement multithreading in the usual sense. Green threads are closer to coroutines, in that they (usually) can't take advantage of multiple processor cores to run in true parallel. Instead, they typically implement cooperative multitasking, where each green thread will manually pass control to another green thread. Stackless Python has built-in support for green threads, and the greenlet extension brings them to CPython. There are probably other libraries/modules out there that implement green threads, but I'm not familiar with any others.
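The cooperative hand-off that green threads rely on can be sketched with plain generators (no Stackless or greenlet required): each "task" runs until it voluntarily yields control back to a tiny round-robin scheduler.

```python
from collections import deque

def task(name, steps, log):
    """A cooperative task: does one step of work, then yields control."""
    for i in range(steps):
        log.append(f"{name}:{i}")
        yield  # hand control back to the scheduler

def run(tasks):
    """Round-robin scheduler: resume each task in turn until all finish."""
    queue = deque(tasks)
    while queue:
        t = queue.popleft()
        try:
            next(t)
            queue.append(t)  # not done yet; requeue it
        except StopIteration:
            pass  # task finished

log = []
run([task("a", 2, log), task("b", 2, log)])
print(log)  # ['a:0', 'b:0', 'a:1', 'b:1']
```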

23

No, Python does have multithreading. In fact, CPython threads are real operating system threads. The problem is that the GIL (Global Interpreter Lock) lets only one of them execute Python bytecode at a time, so a single process can't use more than one of the available cores for pure-Python code. Python threads still work well for I/O-bound tasks, because the GIL is released while a thread waits on I/O; for CPU-bound tasks they give no speedup, and shared state still needs locks to avoid race conditions. Many Python libraries work around this by doing the heavy lifting in C extensions, which can release the GIL. Of course, this is all specific to CPython.
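A small sketch of why threads still pay off for I/O-bound work (`time.sleep` stands in for a blocking network call, and like real I/O it releases the GIL while waiting):

```python
import threading
import time

def fake_io(seconds):
    time.sleep(seconds)  # releases the GIL while blocked, like real I/O

start = time.perf_counter()
threads = [threading.Thread(target=fake_io, args=(0.2,)) for _ in range(5)]
for t in threads:
    t.start()
for t in threads:
    t.join()
elapsed = time.perf_counter() - start

# The five 0.2 s "requests" overlap instead of running back to back,
# so the total is close to 0.2 s rather than 1.0 s.
print(f"{elapsed:.2f}s")
```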

There is a very interesting talk about this by one of the core developers of Python.

Thinking about concurrency, Raymond Hettinger

Now you are right, it is much better to use multiprocessing to get the benefit of all the cores. But processes are much heavier than threads: each one carries its own interpreter and memory, and they can't share state directly. If you don't mind dealing with IPC (Inter-Process Communication), then it is a great solution.
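The `multiprocessing` module ships the IPC plumbing for you; here is a sketch of passing a result back from a child process through a `multiprocessing.Queue` (the worker and its inputs are illustrative):

```python
import multiprocessing

def worker(numbers, queue):
    """Child process: compute a result and send it back over the queue."""
    queue.put(sum(numbers))

if __name__ == "__main__":
    queue = multiprocessing.Queue()
    p = multiprocessing.Process(target=worker, args=([1, 2, 3, 4], queue))
    p.start()
    total = queue.get()  # blocks until the child sends its result
    p.join()
    print(total)  # 10
```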

Jarvis
Vivek Joshy
    There's no need to implement IPC for multiprocessing - [Python does it for you](https://docs.python.org/3/library/multiprocessing.html#exchanging-objects-between-processes). –  Jun 28 '17 at 04:28
  • *nitpick* This is specific to the CPython implementation. IronPython runetime and Jython runtime, for example, support multicore parallelism with `threading` – juanpa.arrivillaga Jun 28 '17 at 04:29
  • If python can not use more than one available cores how multiprocessing works? If i am not wrong multiprocessing work on multiple core right? – binu.py Jun 28 '17 at 04:29
  • @binu.py because it spawns *multiple python processes*. A single CPython process can't. – juanpa.arrivillaga Jun 28 '17 at 04:30
  • @juanpa.arrivillaga I'll correct that in the answer. – Vivek Joshy Jun 28 '17 at 04:45
  • @Mego Thanks for correcting me. I meant to say writing multiprocessing code is not as easy as threading because it's not shared memory. – Vivek Joshy Jun 28 '17 at 04:54
-1

The Global Interpreter Lock (GIL) has to be taken into account to answer your question. When a number of threads (say k) are created, they generally will not increase performance by k times, because the program still effectively runs as a single-threaded application: the GIL is a global lock that allows only one thread at a time to execute Python bytecode, utilizing only a single core.

Performance does improve where C extensions like NumPy are used, or during network and disk I/O, because a lot of the work happens outside the interpreter and the GIL is released while it runs. Note that Python threads are real operating-system-level threads; the GIL simply prevents more than one of them from running Python code at the same moment, with the interpreter preempting between them at regular intervals.

If the CPU runs at maximum capacity, you may want to switch to multiprocessing. For self-contained instances of execution, you can opt for a multiprocessing pool; in the case of overlapping data, where you may want processes communicating, you should use multiprocessing.Process with explicit IPC.

Chitransh Gaurav