Python GIL/Race conditions

Question

I'm trying to understand the Python GIL. My understanding is that since Python is compiled down into Python bytecode, when I have two threads decrementing a variable at the same time, theoretically, a race condition could occur. I'm wondering if this is the case, because I have been running the following code:

from threading import Thread
BIG_NUMBER = 500000000
count = BIG_NUMBER

def dec(n):
    global count
    for _ in range(n):
        count -= 1
t1 = Thread(target=dec, args=(BIG_NUMBER // 2,))
t2 = Thread(target=dec, args=(BIG_NUMBER // 2,))
t1.start()
t2.start()
t1.join()
t2.join()
print(count) # I have been getting 0 multiple times

Since BIG_NUMBER is set to 500,000,000, I would assume that a race condition could occur at least once and count would be indeterminate. But I keep getting 0?

GIL would indeed prevent that as threads won't be running in parallel. — MYousefi, Apr 30 '22 at 20:18
@MYousefi Although some people can't wait for Python to “drop” the GIL, the GIL makes multi-threading programming as simple as it can get. The GIL will ensure that there are no race conditions as long as operations are atomic i.e., thread-safe. --> Why does Python provide locking mechanisms if it's subject to a GIL : https://stackoverflow.com/questions/26873512/why-does-python-provide-locking-mechanisms-if-its-subject-to-a-gil — pippo1980, Nov 13 '22 at 19:29

score 1 · Answer 1 · answered Nov 09 '22 at 20:48

Among other things, the GIL ensures that only one thread at a time is executing Python bytecode. So operations that take one bytecode cannot be interrupted.

Let's use the dis module to look at your function:

In [1]: import dis

In [2]: def dec(n):
   ...:     global count
   ...:     for _ in range(n):
   ...:         count -= 1
   ...:         

In [3]: dis.dis(dec)
  3           0 LOAD_GLOBAL              0 (range)
              2 LOAD_FAST                0 (n)
              4 CALL_FUNCTION            1
              6 GET_ITER
        >>    8 FOR_ITER                12 (to 22)
             10 STORE_FAST               1 (_)

  4          12 LOAD_GLOBAL              1 (count)
             14 LOAD_CONST               1 (1)
             16 INPLACE_SUBTRACT
             18 STORE_GLOBAL             1 (count)
             20 JUMP_ABSOLUTE            8
        >>   22 LOAD_CONST               0 (None)
             24 RETURN_VALUE

From LOAD_GLOBAL at 12 to STORE_GLOBAL at 18 is four bytecode instructions. One complete loop iteration takes seven bytecode instructions. So technically, it is not an atomic operation.

The question is how often would a thread be interrupted in these four instructions? For that we have to consider thread scheduling.

From a talk by David Beazley ("Embracing the Global Interpreter Lock"), I recall that a thread that wants the GIL in Python 3 must wait for at least 5 ms. How many loop iterations you can run on a machine will obviously vary a lot. This is what I get on my machine.

In [1]: BIG_NUMBER = 500000000
Out[1]: 500000000

In [2]: count = BIG_NUMBER
Out[2]: 500000000

In [3]: def dec(n):
   ...:     global count
   ...:     for _ in range(n):
   ...:         count -= 1
   ...:         

In [4]: %timeit dec(1000)
68.8 µs ± 1.34 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)

So on my machine 5 ms would decrease the counter by about 100.

But still, four out of seven instructions in the loop need to run to maintain consistency. So a race condition should technically be possible, if the thread is stopped at a random instruction.

Ömer Faruk Özdemir · Answer 2 · 2022-11-09T21:38:48.883

0

I am also thinking about this nowadays. I don't know whether if decremention is single byte-code instruction or not. But if it is not, it is certainly possible to arrive at a nonzero result. Though we don't know the possibility of it. Probably would happen more regularly with high number of threads.

Also, a useful answer about consistency in Python: https://stackoverflow.com/a/23855229/15282482

edited Nov 09 '22 at 21:38

answered Nov 09 '22 at 20:14

Ömer Faruk Özdemir

9
2

1

Why would more threads increase the probability? I don't think this is the case, since the switching interval would determine the probability, and that is independent of the number of threads. – wovano Nov 14 '22 at 10:21
what if lets say I have 2 cores assigned to my script via `os.sched_setaffinity(0, {0,1})` but script uses 8 threads ? would that increase the probability ? – pippo1980 Nov 14 '22 at 13:46
in any case won't P(A or B) = P(A) + P(B) - P(A and B) > P(A) or P(B) ? – pippo1980 Nov 14 '22 at 13:59
@pippo1980, `sched_setaffinity` does not assign 2 cores to Python, it restricts the Python process to run on that 2 cores only. It will still run on one core at the same time, so I don't think it will make any difference (other than that _any_ change to the code or system could potentially change the timing behavior, so _could_ lead to different results). – wovano Nov 14 '22 at 14:52
1

If you have a new question, please ask it by clicking the [Ask Question](https://stackoverflow.com/questions/ask) button. Include a link to this question if it helps provide context. - [From Review](/review/late-answers/33145438) – cafce25 Nov 15 '22 at 08:41

pippo1980 · Answer 3 · 2022-11-14T17:05:35.020

this is not an answer, but was difficult to me to understand the question , so I slighty modified the code here; what I ended up with trying to understand what

print(count) # I have been getting 0 multiple times

means, in the OP question. Hope I got it right:

from threading import Thread

# BIG_NUMBER = 500000000

BIG_NUMBER = 5  #messo da me se no troppo lento

count = BIG_NUMBER

def dec(n):
    global count
    for _ in range(n):
        count -= 1


results =  []

for i in range(10):
    
    BIG_NUMBER *= i+1
    
    count = BIG_NUMBER
    print(BIG_NUMBER, count)
    
    t1 = Thread(target=dec, args=(BIG_NUMBER // 2,))
    t2 = Thread(target=dec, args=(BIG_NUMBER // 2,))
    t1.start()
    t2.start()
    t1.join()
    t2.join()
    # print(count) # I have been getting 0 multiple times
    
    results.append(count)
    
print(results)

output:

5 5
10 10
30 30
120 120
600 600
3600 3600
25200 25200
201600 201600
1814400 1814400
18144000 18144000
[1, 0, 0, 0, 0, 0, 0, 46670, 557348, 6385685]

adding :

import sis

sys.setswitchinterval(0.5)
print(' sys.getswitchinterval() : ', sys.getswitchinterval())

at the beginnig of code, I got results more in the likenesss of:

output:

5 5
10 10
30 30
120 120
600 600
3600 3600
25200 25200
201600 201600
1814400 1814400
18144000 18144000
[1, 0, 0, 0, 0, 0, 0, 0, 0, 2626417]

see sys.setswitchinterval(interval)

sys.setswitchinterval(interval)

Set the interpreter’s thread switch interval (in seconds). This floating-point value determines the ideal duration of the “timeslices” allocated to concurrently running Python threads. Please note that the actual value can be higher, especially if long-running internal functions or methods are used. Also, which thread becomes scheduled at the end of the interval is the operating system’s decision. The interpreter doesn’t have its own scheduler.

New in version 3.2.

got from Can you race condition in Python while there is a GIL?

as explained in post below https://stackoverflow.com/a/74380851/9877065

ADDENDUM

following discussion about answer below : https://stackoverflow.com/a/74380851/9877065

I added 2 threads to the ones in the original script and seems like

result are conparable to the previous ones

##########################################################################################

# https://stackoverflow.com/questions/23855138/can-you-race-condition-in-python-while-there-is-a-gil/23855229#23855229
import sys
 
print(' sys.getswitchinterval() : ', sys.getswitchinterval())



######################################################################################



sys.setswitchinterval(0.5)
print(' sys.getswitchinterval() : ', sys.getswitchinterval())

from threading import Thread

# BIG_NUMBER = 500000000

BIG_NUMBER = 5  #messo da me se no troppo lento

count = BIG_NUMBER

def dec(n):
    global count
    for _ in range(n):
        count -= 1


results =  []

for i in range(10):
    
    BIG_NUMBER *= i+1
    
    # print(BIG_NUMBER)
    count = BIG_NUMBER
    print(BIG_NUMBER, count)
    
     t1 = Thread(target=dec, args=(BIG_NUMBER // 4,))
    t2 = Thread(target=dec, args=(BIG_NUMBER // 4,))
    t3 = Thread(target=dec, args=(BIG_NUMBER // 4,))
    t4 = Thread(target=dec, args=(BIG_NUMBER // 4,))
    t1.start()
    t2.start()
    t3.start()
    t4.start()
    t1.join()
    t2.join()
    t3.join()
    t4.join()
    # print(count) # I have been getting 0 multiple times
    
    results.append(count)
    
print(results)

typical output:

sys.getswitchinterval() :  0.005
 sys.getswitchinterval() :  0.005
5 5
10 10
30 30
120 120
600 600
3600 3600
25200 25200
201600 201600
1814400 1814400
18144000 18144000
[1, 2, 2, 0, 0, 0, 0, 40218, 1005081, 11183482]

and

 sys.getswitchinterval() :  0.005
 sys.getswitchinterval() :  0.5
5 5
10 10
30 30
120 120
600 600
3600 3600
25200 25200
201600 201600
1814400 1814400
18144000 18144000
[1, 2, 2, 0, 0, 0, 0, 0, 0, 2628116]

any comment on this will be highly appreciated

Re "_this is not an answer_". I think you've proven that for large enough values of `BIG_NUMBER`, it will be very likely that `count != 0`, which was the question indeed. You might explain it a bit better, but I think this is a valid answer. — wovano, Nov 13 '22 at 21:34
thanks a lot, I think a lot depends on the computing power. Funny though decreasing `switchinterval` [default value for me is 0.005 --> `sys.getswitchinterval() : 0.005` , the 5ms in @RolandSmith answer] I cannot increase the wrong results : i.e. != 0. Any hint on that ? — pippo1980, Nov 13 '22 at 21:42
I tried to reproduce these results with your script, but at my machine the results are all 0 (except for the first one of course). What Python interpreter did you use? Maybe interesting to include PC details as well. — wovano, Nov 14 '22 at 05:55
Think its python 3.10.4 on Debian 11 VMware VM hosted by Ubuntu 22, the guest uses just 1 of the host core. Adjust the code using exponential range ? https://stackoverflow.com/questions/11443737/how-to-generate-exponentially-increasing-range-in-python — pippo1980, Nov 14 '22 at 07:10
I did update the code a bit and let it run for about half an hour without any race conditions. I was using Python 3.10.2 on Windows 10 (multi-core processor). Not sure if the OS or the fact that you were using only 1 core made the difference... — wovano, Nov 14 '22 at 08:28
Mmh, forgot that I am running script in IDEwith own kernels with python binaries in virtualenv — pippo1980, Nov 14 '22 at 09:40
"_that I am running script in IDEwith own kernels_" That is interesting information to add to your answer. The GIL is specific to CPython, so if you're using a different interpreter, that might explain why you get different results. — wovano, Nov 14 '22 at 10:19
checked, scripts works exactly same behaviour running from terminal (IDE probably uses machine python when specifying it on preferences tab). — pippo1980, Nov 14 '22 at 13:36

Python GIL/Race conditions

3 Answers3