63

I want to create a non-thread-safe chunk of code for experimentation. These are the functions that two threads are going to call:

c = 0

def increment():
  c += 1

def decrement():
  c -= 1

Is this code thread safe?

If not, could you explain why it is not thread safe, and what kinds of statements usually lead to non-thread-safe operations?

If it is thread-safe, how can I make it explicitly non-thread-safe?

danijar
nubela
  • There should be a `global c` declaration at the start of each function or this doesn't really do anything. – JoshB Sep 01 '16 at 08:04
  • Hi nubela, can you pick the correct answer so that future readers don't get confused? – laike9m Feb 18 '20 at 00:29

8 Answers

128

No, this code is absolutely, demonstrably not threadsafe.

import threading

i = 0

def test():
    global i
    for x in range(100000):
        i += 1

threads = [threading.Thread(target=test) for t in range(10)]
for t in threads:
    t.start()

for t in threads:
    t.join()

assert i == 1000000, i

fails consistently.

i += 1 resolves to four opcodes: load i, load 1, add the two, and store it back to i. The Python interpreter used to switch active threads (by releasing the GIL from one thread so another thread can take it) every 100 opcodes; since CPython 3.2 it switches on a time interval instead, 5 ms by default. (Both of these are implementation details.) The race condition occurs when a preemption happens between the load and the store, allowing another thread to start incrementing the counter. When it gets back to the suspended thread, it continues with the old value of "i" and undoes the increments run by other threads in the meantime.
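As a side note (not part of the original answer): on modern CPython you can inspect and tune the switch interval. Lengthening it makes mid-increment preemption rarer, but it cannot eliminate the race; only a lock does that.

```python
import sys

# CPython 3.2+ releases the GIL on a time interval rather than an
# opcode count; the default interval is 5 ms.
print(sys.getswitchinterval())  # 0.005 by default

# Tuning it only changes how often preemption happens; the race
# between load and store is still there.
sys.setswitchinterval(0.005)
```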

Making it threadsafe is straightforward; add a lock:

#!/usr/bin/python
import threading
i = 0
i_lock = threading.Lock()

def test():
    global i
    i_lock.acquire()
    try:
        for x in range(100000):
            i += 1
    finally:
        i_lock.release()

threads = [threading.Thread(target=test) for t in range(10)]
for t in threads:
    t.start()

for t in threads:
    t.join()

assert i == 1000000, i
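The same lock can also be taken with a `with` statement, which is equivalent to the acquire/try/finally pattern above and harder to get wrong:

```python
import threading

i = 0
i_lock = threading.Lock()

def test():
    global i
    # "with" acquires the lock on entry and releases it on exit,
    # even if the loop raises.
    with i_lock:
        for x in range(100000):
            i += 1

threads = [threading.Thread(target=test) for t in range(10)]
for t in threads:
    t.start()

for t in threads:
    t.join()

assert i == 1000000, i
```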
Glenn Maynard
  • Much more helpful than the accepted answer. Thanks! – RoboCop87 Jul 02 '14 at 14:34
  • Up-voted. Your lock example would be more illustrative if the lock were acquired and released for each increment instead of each 100,000 increments. Why even bother with threads if they are going to execute sequentially with no overlap whatsoever? – MarredCheese Mar 19 '17 at 04:15
  • @MarredCheese because this is just exploring the features of the language. In real workloads there are always other intertwined things happening that interact with the locked thing only at specific points in time. – Liz Av Aug 18 '21 at 13:40
  • Is there a reason why `global i` is declared but not `global i_lock`? – Eric McLachlan Aug 21 '21 at 13:09
  • @EricMcLachlan (from almost a year ago--sorry, I often don't see SO comments) i is assigned to, so it needs to be explicitly declared global or else it would bind as a local. i_lock is only accessed and not assigned to, so there's no need to declare it global. I only declare globals when it's required, not in every place they're accessed. – Glenn Maynard Jun 30 '22 at 06:10
  • @GlennMaynard Well, works fine without lock in python 3.10 at least. – EzR1d3r Nov 16 '22 at 12:58
  • @EzR1d3r Reproduced the assertion error in python3.9. Can't reproduce in Jupyter, but can reproduce (using larger numbers for counter and thread number) when using regular Linux shell python. – max May 28 '23 at 09:24
33

(note: you would need global c in each function to make your code work.)

Is this code thread safe?

No. Only a single bytecode instruction is ‘atomic’ in CPython, and a += may not result in a single opcode, even when the values involved are simple integers:

>>> c = 0
>>> def inc():
...     global c
...     c += 1

>>> import dis
>>> dis.dis(inc)

  3           0 LOAD_GLOBAL              0 (c)
              3 LOAD_CONST               1 (1)
              6 INPLACE_ADD         
              7 STORE_GLOBAL             0 (c)
             10 LOAD_CONST               0 (None)
             13 RETURN_VALUE        

So one thread could get to index 6 with c and 1 loaded, give up the GIL and let another thread in, which executes an inc and sleeps, returning the GIL to the first thread, which now has the wrong value.

In any case, what's atomic is an implementation detail which you shouldn't rely on. Bytecodes may change in future versions of CPython, and the results will be totally different in other implementations of Python that do not rely on a GIL. If you need thread safety, you need a locking mechanism.
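The listing above is from Python 2; on Python 3 the opcode names differ (INPLACE_ADD became BINARY_OP on 3.11+), but the separate load and store of the global are always there. A quick way to check on whatever interpreter you have (a sketch, not part of the original answer):

```python
import dis
import io

c = 0

def inc():
    global c
    c += 1

# Capture the disassembly as text; the exact opcode names vary by
# version, but the split between loading and storing the global
# (the window for the race) is present in all of them.
out = io.StringIO()
dis.dis(inc, file=out)
listing = out.getvalue()
assert "LOAD_GLOBAL" in listing and "STORE_GLOBAL" in listing
print(listing)
```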

bobince
18

To be sure, I recommend using a lock:

import threading

class ThreadSafeCounter:
    def __init__(self):
        self.lock = threading.Lock()
        self.counter = 0

    def increment(self):
        with self.lock:
            self.counter += 1

    def decrement(self):
        with self.lock:
            self.counter -= 1

A `synchronized` decorator can also help keep the code easy to read.
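The standard library does not ship a `synchronized` decorator; a minimal sketch might look like this (the name and the `self.lock` convention are assumptions, not stdlib API):

```python
import functools
import threading

def synchronized(method):
    # Serialize calls through the instance's own lock. Assumes the
    # instance creates ``self.lock`` before any decorated method runs.
    @functools.wraps(method)
    def wrapper(self, *args, **kwargs):
        with self.lock:
            return method(self, *args, **kwargs)
    return wrapper

class ThreadSafeCounter:
    def __init__(self):
        self.lock = threading.Lock()
        self.counter = 0

    @synchronized
    def increment(self):
        self.counter += 1

    @synchronized
    def decrement(self):
        self.counter -= 1
```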

gillesv
15

It's easy to prove that your code is not thread safe. You can increase the likelihood of seeing the race condition by using a sleep in the critical parts (this simply simulates a slow CPU). However, if you run the code for long enough, you should see the race condition eventually regardless.

from time import sleep
c = 0

def increment():
  global c
  c_ = c
  sleep(0.1)
  c = c_ + 1

def decrement():
  global c
  c_ = c
  sleep(0.1)
  c = c_ - 1
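A small driver (not part of the original answer) makes the lost update visible: both threads read `c` before either writes, so two increments collapse into one.

```python
import threading
from time import sleep

c = 0

def increment():
    global c
    c_ = c        # read
    sleep(0.2)    # window in which the other thread also reads the old value
    c = c_ + 1    # write back a now-stale value

t1 = threading.Thread(target=increment)
t2 = threading.Thread(target=increment)
t1.start(); t2.start()
t1.join(); t2.join()

print(c)  # 1, not 2: one increment was lost
```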
John La Rooy
  • Using sleep for this kind of stuff is very wrong. How did you come up with the 0.1 value? would a faster processor need longer sleep? Using sleep for solving problems is almost always wrong. – omribahumi Mar 26 '15 at 19:52
  • @omribahumi, what? I think you're confused by the purpose of my answer. This code is an _example_ of how easy it is to _prove_ a particular piece of code isn't thread safe. The sleep is merely there as a placeholder to _simulate_ extra processing that would normally be there. If you meant that using sleep is the wrong way to avoid race conditions, I certainly agree, but that's not what my answer claims. – John La Rooy Mar 27 '15 at 04:07
  • @jacmkno, The answer isn't wrong, but has confused people for some reason. It proves that the OP's code is __not__ thread safe. Or are you suggesting otherwise? – John La Rooy Nov 04 '15 at 04:13
  • Up-voted this purely because you seem to have been punished for other people not reading your answer... Makes sense to me – Hayden Crocker Aug 28 '16 at 10:29
5

Short answer: no.

Long answer: generally not.

While CPython's GIL makes single opcodes thread-safe, this is not general behaviour. You may not assume that even a simple operation like an addition is an atomic instruction. The addition may be only half done when another thread runs.

And as soon as your functions access a variable in more than one opcode, thread safety is gone. You can achieve thread safety by wrapping your function bodies in locks. But be aware that locks may be computationally costly and may lead to deadlocks.

ebo
2

If you actually want to make your code not thread-safe, and have a good chance of "bad" stuff actually happening without trying like ten thousand times (or of it happening the one time you really don't want "bad" stuff to happen), you can 'jitter' your code with explicit sleeps:

from time import sleep

def increment():
    global c
    x = c
    sleep(0.1)
    c = x + 1
Rasmus Kaj
0

Single opcodes are thread-safe because of the GIL, but nothing else is:

import time

class something(object):
    def __init__(self, c):
        self.c = c

    def inc(self):
        new = self.c + 1
        # if the thread is interrupted by another inc() call its result is wrong
        time.sleep(0.001)  # sleep makes the os continue another thread
        self.c = new


x = something(0)
import threading

for _ in range(10000):
    threading.Thread(target=x.inc).start()

print(x.c)  # ~900 here, instead of 10000

Every resource shared by multiple threads must have a lock.
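A fixed variant of the sketch above (fewer threads so it runs quickly, plus a lock and `join`s; these changes are assumptions, not the original code) reliably reaches the full count:

```python
import threading
import time

class Something:
    def __init__(self, c):
        self.c = c
        self.lock = threading.Lock()

    def inc(self):
        with self.lock:
            new = self.c + 1
            time.sleep(0.001)  # the forced context switch is now harmless
            self.c = new

x = Something(0)
threads = [threading.Thread(target=x.inc) for _ in range(100)]
for t in threads:
    t.start()
for t in threads:
    t.join()  # wait for every increment before reading the counter

print(x.c)  # 100
```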

Jochen Ritzel
  • This does not answer the question, which is about `+=` – Marcin Owsiany Dec 08 '16 at 14:53
  • Also, and correct me if I'm wrong, `print x.c` doesn't wait for the threads to finish. So most of them are still running when you print the output. – Lowie Huyghe Dec 14 '16 at 09:56
  • Do you want to update the answer mentioning that thread safety is an issue only when dealing with shared/global variables. In your example, x is a global variable. – variable Nov 06 '19 at 07:59
0

Are you sure that the functions increment and decrement execute without any error?

I think it should raise an UnboundLocalError because you have to explicitly tell Python that you want to use the global variable named 'c'.

So change increment (and likewise decrement) to the following:

def increment():
    global c
    c += 1

I think your code, as it is, is not thread safe. This article about thread synchronisation mechanisms in Python may be helpful.

ardsrk