
I have seen examples (e.g. here) of using an Event to stop a thread where I think a boolean flag would do the job.

 Event

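import threading

class MyThread(threading.Thread):

    def __init__(self):
        super().__init__()  # Thread.__init__ must run before start()
        self._please_stop = threading.Event()

    def run(self):
        # Loop until another thread calls stop()
        while not self._please_stop.is_set():
            [...]

    def stop(self):
        self._please_stop.set()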

 Flag

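import threading

class MyThread(threading.Thread):

    def __init__(self):
        super().__init__()  # Thread.__init__ must run before start()
        self._please_stop = False

    def run(self):
        # Loop until another thread calls stop()
        while not self._please_stop:
            [...]

    def stop(self):
        self._please_stop = True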

What is the benefit of using an Event, here? Its wait method is not used. What makes it better than a boolean flag?

I can see the point if the same Event is shared among several threads, but otherwise, I don't get it.

This mailing list thread suggests that Event would be safer, but it's unclear to me why.

More precisely, I don't understand those two paragraphs:

If I understand the GIL correctly, it synchronizes all access to Python data structures (such as my boolean 'terminated' flag). If that is the case, why bother using threading.Event for this purpose?

The GIL is an implementation detail and relying on it to synchronize things for you isn't futureproof. You're likely to have lots of warning, but using threading.Event() isn't any harder, and it's more correct and safer in the long term.

I agree that using an Event adds close to no overhead, so I can stick to that, but I'd like to understand the limits of the flag approach.

(I'm using Python 3, so I'm not concerned about Python 2 limitations, if any, although those would be totally worth mentioning here.)

Jérôme
  • Rather than write your own thread loop, why not use ThreadPoolExecutor or something similar? – clay May 09 '17 at 20:31
  • @clay. That has nothing to do with the point of the question. It is useful to understand how abstractions work for those cases when they break down. – Mad Physicist May 09 '17 at 20:32
  • @MadPhysicist Python `ThreadPoolExecutor` has this exact behavior and functionality. I would presume that the official standard library has the best practices way to stop a thread. – clay May 09 '17 at 20:34
  • Actually neither is great. Imagine that your thread code needs to delay (e.g. with `time.sleep()`): how will the thread be terminated in a timely manner? – Dima Tisnek May 12 '17 at 07:33
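
One way to address the delay concern in the last comment is to pause with the Event's own `wait(timeout)` rather than `time.sleep()`, since `wait()` returns as soon as the event is set. The following is a minimal sketch under that assumption; the `Worker` class, its `interval` parameter, and the `iterations` counter are hypothetical names chosen for illustration.

```python
import threading
import time


class Worker(threading.Thread):
    """Hypothetical worker that pauses between iterations."""

    def __init__(self, interval):
        super().__init__()
        self._please_stop = threading.Event()
        self._interval = interval
        self.iterations = 0

    def run(self):
        while not self._please_stop.is_set():
            self.iterations += 1  # stand-in for real work
            # Unlike time.sleep(), wait() returns early once the event is set.
            self._please_stop.wait(self._interval)

    def stop(self):
        self._please_stop.set()


worker = Worker(interval=60.0)  # a long pause between iterations
worker.start()
started = time.monotonic()
worker.stop()   # wakes the worker even in the middle of its pause
worker.join()
elapsed = time.monotonic() - started
```

Here `join()` should return almost immediately even though the pause is 60 seconds. A plain boolean flag offers no equivalent way to interrupt a sleeping thread.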

2 Answers


Programming is often not just about getting the code to work today; it's about keeping it working through changes that will be made in the future.

  • Not every Python implementation has a GIL (Jython and IronPython don't, for example). Will I want to run this on a different interpreter tomorrow?
  • Actually, I need to spread the work across several processes. Swap in multiprocessing, which implements Event(); a plain variable will fail there because it is not shared between processes.
  • Turns out the code should stop only if several other threads think it should. Well, use Semaphore() instead of Event(); this would be easy to implement incorrectly with plain variables.

So, it's likely you can write multithreaded programs perfectly correctly in Python relying on how the bytecode gets interrupted and when the GIL can be released... but if I am reading and changing your code later, I'd be much happier if you used the standard synchronization primitives.
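
The "stop only if several other threads think it should" scenario could be sketched as follows. This is one possible reading, not the answer author's code: the quorum `VOTES_NEEDED` and the `voter`, `coordinator`, and `worker` functions are hypothetical, and the sketch combines a `Semaphore` (to count votes) with an `Event` (to signal the stop).

```python
import threading

VOTES_NEEDED = 3  # hypothetical quorum, not from the answer

votes = threading.Semaphore(0)    # each release() counts as one vote
please_stop = threading.Event()   # set once the quorum is reached


def voter():
    # A thread that thinks the worker should stop casts one vote.
    votes.release()


def coordinator():
    # Block until VOTES_NEEDED votes have arrived, then request the stop.
    for _ in range(VOTES_NEEDED):
        votes.acquire()
    please_stop.set()


def worker():
    while not please_stop.is_set():
        please_stop.wait(0.01)  # stand-in for real work


threads = [threading.Thread(target=worker), threading.Thread(target=coordinator)]
threads += [threading.Thread(target=voter) for _ in range(VOTES_NEEDED)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Counting votes by incrementing a shared integer would need explicit locking to be correct, whereas the semaphore handles the counting and blocking itself.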

gz.
  • Thanks. As I wrote in reply to [Mad Physicist's answer](http://stackoverflow.com/a/43879450/4653485), the problem here was that I always assumed setting a boolean was atomic, so I didn't get the GIL issue. I totally agree relying on a single implementation is not ideal, especially if doing otherwise is that cheap. And I also agree about futureproofing code (hence my question). – Jérôme May 09 '17 at 20:50

I think that the implication in the thread you quote is that setting a boolean is not necessarily an atomic operation in Python. While having a global lock on all Python objects (the GIL) makes all operations that set an attribute appear atomic for the moment, such a lock may not exist in the future. Using Event makes the operation atomic because it uses its own lock for access.

The link for atomic is to a Java question, but it is no less relevant because of that.
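
To make the "uses its own lock" point concrete, here is a simplified sketch of how an Event-like object can guard its flag with a lock and a condition variable. It is an illustration only, not the standard-library source; `SketchEvent` is a hypothetical name, and the real `threading.Event` may differ in detail.

```python
import threading


class SketchEvent:
    """Simplified illustration of an Event-like primitive."""

    def __init__(self):
        # The condition variable wraps a lock that guards the flag.
        self._cond = threading.Condition(threading.Lock())
        self._flag = False

    def is_set(self):
        return self._flag

    def set(self):
        # The flag is changed while holding the lock, and waiters are woken.
        with self._cond:
            self._flag = True
            self._cond.notify_all()

    def wait(self, timeout=None):
        # Block until set() is called or the timeout expires.
        with self._cond:
            if not self._flag:
                self._cond.wait(timeout)
            return self._flag


event = SketchEvent()
before = event.wait(0.01)  # times out because nothing has set the flag
event.set()
after = event.wait(0.01)   # returns at once because the flag is set
```

The explicit lock means the behaviour does not depend on the interpreter serializing attribute writes on the caller's behalf.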

Mad Physicist
  • The missing part was the fact that setting the boolean is not necessarily atomic. I always assumed it was. I shall read more about the GIL. This explains the Lock in Event. Searching with the right terms, I found [another question](http://stackoverflow.com/questions/23130399/python-threads-and-atomic-operations#23130849) that my question duplicates. – Jérôme May 09 '17 at 20:47
  • I believe that the standard operating procedure would be to close your question as a duplicate now that you found that other question. – Mad Physicist May 09 '17 at 20:49
  • It actually is atomic in your context, where you know what Python you're using. So, it's more of a code style issue rather than a code correctness one. – gz. May 09 '17 at 20:50
  • @gz. Given that `__setattr__` has not been reimplemented here, you are technically correct. – Mad Physicist May 09 '17 at 20:52
  • @gz. Although it is a correctness issue since it is not guaranteed that you will always run with the same version of Python. The point is that there is no *contractual* obligation for it to be atomic. That being said, your example with the semaphore is much easier to understand intuitively. – Mad Physicist May 09 '17 at 20:53
  • @mad-physicist But technically correct is the best kind of correct! Until your code implodes. :) – gz. May 09 '17 at 20:56
  • @gz. You can always create a contractual obligation by putting a big notice for your project saying something like "This code has only been verified to run correctly with interpreter X". – Mad Physicist May 09 '17 at 20:59