21

Python 3.2 ALPHA is out.

From the Change Log, it appears the GIL has been entirely rewritten.

A few questions:

  1. Is having a GIL good or bad? (and why).
  2. Is the new GIL better? If so, how?

UPDATE:

I'm fairly new to Python, so all of this is new to me, but I do at least understand that the existence of a GIL in CPython is a huge deal.

Question though, why does CPython not just clone the interpreter like Perl does in an attempt to remove the need for the GIL?

JerryK
  • 47
    Instead of discussing the GIL, how about something simpler, like the Middle East? :) – Ned Batchelder Aug 02 '10 at 01:09
  • 1
    Part 1 duplicates http://stackoverflow.com/questions/991904/why-is-there-no-gil-in-the-java-virtual-machine-why-does-python-need-one-so-bad/991917#991917 -- Part 2's discussion starts at http://mail.python.org/pipermail/python-dev/2009-October/093321.html . – Alex Martelli Aug 02 '10 at 01:20
  • GIL much more interesting, sorry. – Matt Joiner Aug 02 '10 at 02:19
  • Please delete the first two questions -- they're vague and impossible to answer. Your third question (why doesn't PERL have a GIL) is something that can be answered. – S.Lott Aug 02 '10 at 10:00
  • HOW is this question not flagged and closed!? SO's normally psychotic "close everything" crowd somehow missed this one. Bizarre and impressive! – L0j1k Nov 09 '14 at 08:18

3 Answers

25

The best explanation I've seen as to why the GIL sucks is here:

http://www.dabeaz.com/python/GIL.pdf

And the same guy has a presentation on the new GIL here:

http://www.dabeaz.com/python/NewGIL.pdf

If that's all that's been done, it still sucks, just not as badly. Multiple threads will behave better, but multiple cores will still do nothing for CPU-bound threads inside a single Python process.
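A rough way to see this for yourself, as a sketch rather than a rigorous benchmark: the helpers `count_down` and `run_with` below are hypothetical names of mine (not from the slides). They push the same pure-Python busy loop through a thread pool and a process pool from the standard `concurrent.futures` module. On a regular GIL-enabled CPython build with more than one core, I'd expect the thread version to take roughly as long as running the jobs back to back and the process version to finish sooner, but the numbers depend entirely on your machine.

```python
import time
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def count_down(n):
    """CPU-bound busy loop in pure Python; returns the number of iterations."""
    steps = 0
    while n > 0:
        n -= 1
        steps += 1
    return steps


def run_with(executor_cls, jobs, workers=2):
    """Run count_down over jobs with the given executor class.

    Returns (results, elapsed_seconds).
    """
    start = time.perf_counter()
    with executor_cls(max_workers=workers) as pool:
        results = list(pool.map(count_down, jobs))
    return results, time.perf_counter() - start


if __name__ == "__main__":
    # The job size is arbitrary; raise it if the timings are too noisy.
    jobs = [2_000_000] * 2
    for cls in (ThreadPoolExecutor, ProcessPoolExecutor):
        _, seconds = run_with(cls, jobs)
        print(f"{cls.__name__}: {seconds:.2f}s")
```

The `if __name__ == "__main__"` guard matters for the process pool on platforms that start workers by re-importing the main module.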

phkahler
  • 6
    ...unless you use the multiprocessing module, which is pretty easy to do. – detly Aug 02 '10 at 01:24
  • 10
    ...but multiprocessing is no good for fine-grained parallelism. – Gabe Aug 02 '10 at 04:35
  • 3
    @Gabe: But "fine-grained" parallelism is often over-rated. OS process-level parallelism often works out just fine. – S.Lott Aug 02 '10 at 13:58
  • 1
    @S.Lott Not really. Not if you actually want to do serious work as opposed to being glue for other things doing serious work. Plus multiprocessing has an overhead in terms of system resources. Then again, if you're doing heavy lifting, I suppose you'd avoid Python in the first place. – Basic Nov 19 '14 at 09:59
  • 1
    Fine-grained parallelism is bad coding. You should parallelise tasks that are independent. – Kobor42 May 25 '15 at 07:48
  • @Kobor42 Can you substantiate the claim that fine-grained parallelism is bad coding? Or perhaps you mean that fine-grained parallelism is not very efficient currently? – Paul Apr 01 '16 at 15:15
  • 2
    @Paul Basic mentioned merge sort, and it is a good example of bad fine-grained parallelism. Writing merge sort is easy. Writing a safe and efficient parallelised merge sort is hard and complicated. Fine-grained parallelism always demands a real expert. Maintenance is hard. Bugs arise easily and are hard to fix. Debugging is hard. Threads always die painfully and silently. So thread jobs should be simple and fool-proof. Sorting data, querying data and displaying data are different tasks, so make those able to run in parallel. – Kobor42 Apr 06 '16 at 15:46
  • @Basic I just answered to Paul's question, but it's also an answer for your comment too. See above - SO doesn't allow multiple replies. – Kobor42 Apr 06 '16 at 15:48
  • @Kobor42 [Thanks for letting me know about the response] So your approach is that since writing multi-threaded code requires a minimum level of competence, everyone should avoid it? What would you do instead of a merge sort? Just do it on a single thread and wait longer while the other processors are idle? Or multi-process it and live with significant performance loss? Personally, I'd prefer to hire developers who know how to use resources efficiently – Basic Apr 06 '16 at 16:51
  • 3
    @Basic Please forgive me: I have worked at several large companies on code bases with millions of lines that once started as "just try if it works", alongside a lot of juniors, and with tons of mistakes like the ones I described above. Yes. My approach is that since writing good multithreaded code is HARD (not a minimum level of competence), it should be best practice to keep it easy, IF (!!!) you plan to use the code for longer than a week. – Kobor42 Apr 06 '16 at 17:19
  • @Basic Since Gabe doesn't seem to be active on SO anymore: If my understanding of **fine-grained parallelism** is correct (where threads talk to each other a lot), then why is multiprocessing bad for fine-grained parallelism? Because processes generally don't share memory (by default) and are heavier than threads? – Honinbo Shusaku Aug 11 '16 at 12:19
  • @Abdul Mostly. Ignoring the additional resource overhead of processes on Windows, passing messages between processes involves a step where data is pickled/unpickled. All processes usually need their own copy of the data set in memory (as no memory is shared), etc, etc... The more granular the work, the greater the overhead and wasted resources. Merge sort is a good example of this type of use. – Basic Aug 11 '16 at 13:11
  • Sure you can work around some of this using memory mapped files and the like but frankly, the whole thing is a hack to work around python's threading deficiencies. I can't help but feel "it's hard" is just a rationalisation. Can you think of other situations where developers avoid whole capabilities because they're difficult? (as opposed to "because the language is poorly designed to support this capability") – Basic Aug 11 '16 at 13:12
  • 1
    @Basic Your question is rhetorical to prove a point, but this reminds me also of memory allocation and de-allocation – Honinbo Shusaku Aug 11 '16 at 13:19
4

Is having a GIL good or bad? (and why).

Neither -- or both. It's necessary for thread synchronization.

Is the new GIL better? If so, how?

Have you run any benchmarks? If not, then perhaps you should (1) run a benchmark, (2) post the benchmark in the question and (3) ask specific questions about the benchmark results.

Discussing the GIL in vague, handwaving ways is largely a waste of time.

Discussing the GIL in the specific context of your benchmark, however, can lead to a solution to your performance problem.

Question though, why does CPython not just clone the interpreter like Perl does in an attempt to remove the need for the GIL?

Read this: http://perldoc.perl.org/perlthrtut.html

First, Perl didn't support threads at all. Older Perl interpreters had a buggy module that didn't work correctly.

Second, the newer Perl interpreter has this feature.

The biggest difference between Perl ithreads and the old 5.005 style threading, or for that matter, to most other threading systems out there, is that by default, no data is shared. When a new Perl thread is created, all the data associated with the current thread is copied to the new thread, and is subsequently private to that new thread!

Perl's model (only explicitly shared data is shared) is fundamentally different from Python's model (all data is shared), so cloning the interpreter per thread the way Perl does would break existing Python threaded code.
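To make the "all data is shared" side concrete, here is a minimal sketch using only the standard `threading` module; the names `shared`, `private` and `worker` are just illustrative. An ordinary Python object is visible to every thread, and per-thread private data is something you opt into with `threading.local`, which is the reverse of the Perl ithreads default quoted above.

```python
import threading

shared = []                   # ordinary object: every thread sees the same list
private = threading.local()   # attributes set on this are per-thread


def worker(name):
    shared.append(name)       # visible to all threads, including the main one
    private.name = name       # visible only inside this thread


threads = [threading.Thread(target=worker, args=(f"t{i}",)) for i in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(sorted(shared))            # ['t0', 't1', 't2']
print(hasattr(private, "name"))  # False: the main thread never set it
```

Code written against the first behaviour (and most Python threaded code is) would stop working if each thread silently got its own copy of `shared`.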

S.Lott
  • 3
    I'm fairly new to Python but have read enough to at least understand the GIL is a big deal, which is why I'm asking the question. – JerryK Aug 02 '10 at 01:32
  • 1
    @JerryK: Your original pair of questions were too vague to provide any useful information, which is why I provided a non-answer. Please try to be **specific** in what you need to know. Vague questions are difficult to answer. – S.Lott Aug 02 '10 at 09:59
  • 2
    JerryK: On the contrary, the GIL is generally not a big deal at all. In the general case it's more useful than painful. – Matt Joiner Aug 03 '10 at 00:37
  • It's "Perl" (the language) or "perl" (the interpreter) but never "PERL". – mpeters Mar 23 '11 at 17:05
  • 1
    @mpeters: I started using perl in the 90's when it was often written PERL because we still thought it was an acronym. Old habits die hard. – S.Lott Mar 23 '11 at 18:20
  • 1
    "It's necessary for thread synchronization"? Prove it. It may be how Python chose to handle thread synch but it's not in Java, .Net, C++ or dozens of other languages which multithread perfectly well. Yes, it prevents people who don't know how to use threads from shooting themselves in the foot and keeps the language design simple. It was a design decision, nothing more (and a poor one IMHO) – Basic Nov 19 '14 at 10:04
1

Is the new GIL better? If so, how?

Well, it at least replaces op-count switching with proper time-based switching. This does not increase overall performance (and could even hurt it through more frequent switching), but it makes threads more responsive and eliminates cases where ALL threads get blocked because one of them is executing a single computation-heavy opcode (such as a call to an external function that does not release the GIL).
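The time-based interval is exposed through `sys.getswitchinterval()` and `sys.setswitchinterval()`, both added in Python 3.2. A small sketch of reading and adjusting it follows; the 1 ms value is an arbitrary choice for illustration, not a recommendation.

```python
import sys

# Interval, in seconds, after which a running thread is asked to give up the GIL.
# The documented default is 5 milliseconds.
default_interval = sys.getswitchinterval()
print(default_interval)

# A shorter interval means more frequent switch requests: threads become more
# responsive, at the cost of extra switching overhead.
sys.setswitchinterval(0.001)
print(sys.getswitchinterval())

# Restore the original value so the rest of the program is unaffected.
sys.setswitchinterval(default_interval)
```

This replaces the old `sys.setcheckinterval()`, which counted bytecode instructions rather than time.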

why does CPython not just clone the interpreter like Perl does in an attempt to remove the need for the GIL?

The GIL is a complex issue; it should not be viewed as the Ultimate Evil. It brings us thread safety.

As for Perl, Perl is a) dead, b) too old. The guys at Google are working on bringing LLVM goodies to CPython, which, among other things, will improve GIL behavior (no complete GIL removal yet, though): http://code.google.com/p/unladen-swallow/

Daniel Kluev