3

I know that the built-in set type in Python is not generally thread-safe, but this answer claims that it is safe to call pop() from two competing threads. Sure, you might get an exception, but your data isn't corrupted. I can't find any documentation that validates this claim. Is it true? Documentation, please!

Stuart Berg
  • Looking at the Python source code, `set` objects are just dictionaries with some convenient methods. – Blender Mar 28 '12 at 15:49
  • I think that the [answer you're looking for](http://stackoverflow.com/a/2227210/1132524) is exactly under the one you pointed out. Read the comment and check out what the [GIL](http://wiki.python.org/moin/GlobalInterpreterLock) is. – Rik Poggi Mar 28 '12 at 15:50
  • Same question you link says that mutable types are not thread-safe: http://stackoverflow.com/a/2227220/104847 you have to implement locking mechanisms so you don't have race conditions. – Ale Mar 28 '12 at 15:53
  • 4
    It's not thread-safe, but due to the global interpreter lock it will work in CPython (and PyPy, for example). You shouldn't rely on this, though, as it's implementation specific and doesn't hold true in other implementations like IronPython. – Niklas B. Mar 28 '12 at 15:57
  • See also: the Python docs on the [Global Interpreter Lock](http://docs.python.org/c-api/init.html#thread-state-and-the-global-interpreter-lock) and this related [link](http://effbot.org/pyfaq/what-kinds-of-global-value-mutation-are-thread-safe.htm). – jjlin Mar 28 '12 at 17:44

2 Answers

10

If you look at the set.pop method in the CPython source you'll see that it doesn't release the GIL.

That means that only one set.pop will ever be happening at a time within a CPython process.

Since set.pop checks whether the set is empty, the worst you can cause by popping from an empty set is a KeyError.

So no, you can't corrupt the data by popping from a set in multiple threads with CPython.
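A minimal sketch of how you could check this yourself, assuming CPython with the GIL. The helper names `drain` and `run` and the item and thread counts are made up for illustration. Several threads pop from one shared set until it is empty, and afterwards every item should have been popped exactly once:

```python
import threading


def drain(shared, popped):
    """Pop from the shared set until it is empty, recording what this thread got."""
    while True:
        try:
            item = shared.pop()
        except KeyError:  # set.pop() raises KeyError on an empty set
            return
        popped.append(item)


def run(n_items=10_000, n_threads=4):
    """Hypothetical driver: start n_threads competing poppers on one set."""
    shared = set(range(n_items))
    results = [[] for _ in range(n_threads)]  # one private list per thread
    threads = [threading.Thread(target=drain, args=(shared, r)) for r in results]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    # Flatten the per-thread lists so the caller can check for duplicates.
    return shared, [x for r in results for x in r]


leftover, all_popped = run()
```

If the claim holds, `leftover` is empty and `all_popped` contains each of the original items once, in some arbitrary order. This only demonstrates the behaviour on the interpreter you run it on; it is not a guarantee from the language.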

agf
  • 1
    Accepting this answer since it solves my immediate problem. Note to readers from the fffuuuutttuuuurree: Please note the comment from @Niklas (above). – Stuart Berg Mar 30 '12 at 15:27
0

I believe the set `pop` operation is thread-safe because it is atomic, in the sense that two threads can't pop the same value.

I wouldn't rely on its behavior if another thread were, for instance, iterating over that collection.

I couldn't find any concrete documentation either, just some topics that point in this direction. The official Python documentation would indeed benefit from information of this kind.
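If you would rather not depend on implementation details, one option is to guard the set with an explicit lock. The following is only a sketch: `LockedSet` and its method names are hypothetical, not part of the standard library. Iteration works on a copy taken under the lock, so a concurrent `pop` cannot interfere with it:

```python
import threading


class LockedSet:
    """Hypothetical wrapper: every operation on the set takes the same lock."""

    def __init__(self, items=()):
        self._items = set(items)
        self._lock = threading.Lock()

    def add(self, item):
        with self._lock:
            self._items.add(item)

    def pop(self, default=None):
        """Remove and return an arbitrary item, or `default` if the set is empty."""
        with self._lock:
            if not self._items:
                return default
            return self._items.pop()

    def snapshot(self):
        """Return a copy that is safe to iterate while other threads mutate."""
        with self._lock:
            return list(self._items)


jobs = LockedSet(["a", "b", "c"])
for job in jobs.snapshot():  # iterates the copy, not the live set
    pass
first = jobs.pop()
```

Returning a default instead of raising is a design choice for this sketch. It is ambiguous if the default value itself can be a member of the set, in which case raising `KeyError` as the built-in does is the safer choice.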

pcalcao
  • Check out the source link in my answer. In CPython, `set.next` and `set.pop` can't both be happening at the same time, so the worst that can happen is what happens if you remove an item from a set while iterating over it in a `for` loop: the iterator notices the size change and raises `RuntimeError`. – agf Mar 28 '12 at 17:08
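A small single-threaded sketch of what CPython does when a set changes size in the middle of a `for` loop, which is the same situation an iterating thread ends up in when another thread pops between two of its iteration steps. The variable names are illustrative only:

```python
items = {1, 2, 3}
error = None

try:
    for item in items:
        # Shrinking the set mid-iteration stands in for a pop() from another thread.
        items.pop()
except RuntimeError as exc:
    # CPython's set iterator checks the set's size on each step.
    error = exc
```

The set itself stays consistent (one item was removed and the rest remain), but the loop does not finish, so the iterating thread needs to be prepared to catch the exception or to iterate over a copy.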