1

I'm currently trying to understand threading in Python. I wrote a program in which two threads should ideally alternate between incrementing and decrementing a global variable, but no matter how I arrange the locking, the output inevitably falls out of sync.

import threading
import time

number = 0
lock = threading.Lock()
def func1():
    global number
    global lock
    while True:
        try:
            lock.acquire()
            number += 1
        finally:
            lock.release()
        print(f"number 1 is: {number}")
        time.sleep(0.1)

def func2():
    global number
    global lock
    while True:
        try:
            lock.acquire()
            number -= 1
        finally:
            lock.release()
        print(f"number 2 is: {number}")
        time.sleep(0.1)

t1 = threading.Thread(target=func1)
t1.start()

t2 = threading.Thread(target=func2)
t2.start()

t1.join()
t2.join()

The output should look something like this:

number 1 is: 1
number 2 is: 0
number 1 is: 1
number 2 is: 0
number 1 is: 1
number 2 is: 0
number 1 is: 1
number 2 is: 0

But right now it looks like this:

number 1 is: 1
number 2 is: 0
number 1 is: 1
number 2 is: 0
number 2 is: -1number 1 is: 0

number 2 is: -1number 1 is: 0

number 1 is: 1number 2 is: 0

Any idea how to do this without falling out of sync?

electricnapkin
  • FYI, you don't need `global lock` here. `global` is only needed when you are assigning a new value to the name. And they're never going to alternate perfectly, because you can't predict how long each thread will get until it has to release the CPU. – Tim Roberts Aug 18 '22 at 22:22
  • In practice, neither thread will sleep for _exactly_ `0.1` seconds, so they'll eventually drift. It's not realistic to expect them to perfectly alternate forever. – Kache Aug 18 '22 at 22:30
  • 1
    Separate advice: multi-threaded communication is often simpler when done purely through messaging and ADTs like [`queue`](https://docs.python.org/3/library/queue.html). – Kache Aug 18 '22 at 22:33

3 Answers

1

First, avoid using global variables with threads in Python. Use a queue to share the value instead.

Second, lock acquisition is non-deterministic. When a lock is released, there is no guarantee that the other thread will grab it: the thread that just released the lock may well acquire it again before the other thread gets a chance.

But in your case you can avoid the problem, because you know which state the variable must be in for each thread to modify it. So you can guard each modification by first checking that the variable is in the right state.

Something like:

from threading import Thread
import time
from queue import Queue

def func1(threadname, q):
    while True:
        number = q.get()
        
        if number == 0:
            number += 1
            print(f"number 1 is: {number}")

        q.put(number)
        time.sleep(0.1)

def func2(threadname, q):
    while True:
        number = q.get()

        if number == 1:
            number -= 1
            print(f"number 2 is: {number}")

        q.put(number)
        time.sleep(0.1)

queue = Queue()
queue.put(0)
t1 = Thread(target=func1, args=("Thread-1", queue))
t2 = Thread(target=func2, args=("Thread-2", queue))

t1.start()
t2.start()
t1.join()
t2.join()
joaopfg
  • Polling is a bad idea. Especially when primitives for proper synchronization are easily available. –  Aug 18 '22 at 23:19
  • @Paul I think it depends on the use case. If you know the state that the variable needs to be to accept modifications by one thread or the other (like in this case) it can be an optimization letting it be accessible randomly by the two threads. – joaopfg Aug 19 '22 at 07:23
  • Keep in mind that your threads will quite regularly wake up just to find out that they should've remained asleep and go back to sleep. Also we're dealing with python here, so this is most likely not a considerable optimization anyways. And actually this code works correctly purely by chance - though with a greater likelihood than OPs code. There's nothing stopping the threads of interlacing their operations in such a way that a "wrong" result is printed. –  Aug 19 '22 at 13:33
  • @Paul I think it depends if the total time they spent waking up to find out they should've remained asleep is greater than the total time taken by the lock operations. We need to measure it to be sure. Also, I don't get why you say this works by chance. At what point can it go wrong exactly ? – joaopfg Aug 19 '22 at 13:45
  • sorry, my bad. Actually the code works fine. I missed the fine detail that pythons queue-implementation is synchronized. This basically makes your entire argument void though, because you're actually using locks anyways. They're simply hidden away in your queue instead of being openly visible. And you can still encounter the variable in an invalid state. So you're definitely worse off than simply using a lock. –  Aug 19 '22 at 14:43
  • @Paul Indeed, they use a lock internally. But you are using two locks. So, my argument remains. Also, isn't the code protecting itself against invalid states ? I don't see a case where an invalid state can be prejudicial here. Can you elaborate more on that, please ? – joaopfg Aug 19 '22 at 14:48
  • the problem about locks is the overhead of acquiring and releasing them, not about keeping them around. So whether you're using a single lock or two doesn't exactly matter. "Invalid state" was probably a bad choice of word. It can happen that the same thread accesses the variable twice without the other thread having a chance of modifying it in between. In that case your thread blocks the other thread and wastes resources on lock-acquisition without doing any work. –  Aug 19 '22 at 17:18
  • @Paul I think there is also an overhead for keeping them around. Each time a thread wants to act, it needs to access a certain location in memory to verify if one of the locks is released and then access another location in memory to release the other lock. Here, since there is only one lock, there is only one location in memory that is accessed. Only experimental results would allow to check what is faster. There are too many factors. Also, although the += and -= operations are not atomic, they are protected by the if clause. So, a thread never gets a corrupted variable. It's not luck. – joaopfg Aug 19 '22 at 20:42
  • That's hardly relevant. Getting a few bytes for a lock is certainly cheaper than context-switches between threads. Your if-clause doesn't protect anything. The only reason why your code works is because the queue blocks your threads from executing anything concurrently. Use a normal list instead of a synchronized container and watch your code fall apart within a few iterations. –  Aug 19 '22 at 21:05
  • @Paul I'm not talking about the memory occupied by the locks. I'm talking about the time to access them in different locations of memory and checking if they are released. Also, the if-clause does protect stuff. Remove it and see the code fall apart within a few iterations (even with a queue). – joaopfg Aug 19 '22 at 21:09
  • @Paul But I have to agree that your pattern with two locks is maybe unavoidable in certain cases. – joaopfg Aug 19 '22 at 21:14
  • It doesn't protect from any errors due to concurrent execution is what I meant. It obviously does prevent the variable from being modified incorrectly, if the critical section is protected to prevent concurrent execution, like in your code. If that protection is dropped however - like if your code was truly lockless - the if-clause won't be any help at all, since execution of the two threads can interlace in quite nasty patterns. I do realize this is about time required to process the locks. But thread-switching is far more effort than checking two locks. –  Aug 19 '22 at 21:48
  • @Paul I think it's the same time for thread-switching and checking two locks. Both operations are done with pointers to the memory and not with the actual memory content (at least in cpython). – joaopfg Aug 19 '22 at 22:02
  • @Paul I guess what really matters here is the frequency with which locks are checked or threads are switched. Only an experiment can tell what is faster. As I said before, there are too many things to take into account. – joaopfg Aug 19 '22 at 22:12
1

Thanks for all your answers. I remember seeing someone in the comments mention using events or something like that, and that solved the issue. Here's the code:

import threading
import time

number = 0
event_number = threading.Event()
event_number.clear()

def func1():
    global number
    global event_number
    while True:
        if not event_number.is_set():
            number += 1
            print(f"func 1 is {number}")
            event_number.set()
        else:
            pass
        time.sleep(2)

def func2():
    global number
    global event_number
    while True:
        if event_number.is_set():
            number -= 1
            print(f"func 2 is {number}")
            event_number.clear()
        else:
            pass
        time.sleep(2)

t1 = threading.Thread(target=func1)
t2 = threading.Thread(target=func2)

t1.start()
t2.start()

t1.join()
t2.join()

Now I notice that sometimes one of the loops will either not wait its allotted time and print right away, or wait double the time, but at least the number only stays within those two values.
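
The irregular waits come from polling: each thread only checks the event once per sleep, so it can just miss its turn and wait a whole extra interval. As a rough sketch (not the code from the comments, and using names I made up such as `alternate`, `inc_turn` and `dec_turn`), each thread could instead block on its own `threading.Event` and hand the turn to the other thread when it is done. The loop is bounded to a few rounds here only so that the example terminates:

```python
import threading


def alternate(rounds=4):
    """Run two threads that strictly alternate, `rounds` turns each."""
    state = {"number": 0}
    log = []
    inc_turn = threading.Event()  # set when the incrementing thread may run
    dec_turn = threading.Event()  # set when the decrementing thread may run

    def worker(label, delta, my_turn, other_turn):
        for _ in range(rounds):
            my_turn.wait()    # block until it is this thread's turn
            my_turn.clear()   # consume the turn
            state["number"] += delta
            log.append(f"{label} is {state['number']}")
            other_turn.set()  # hand the turn to the other thread

    t1 = threading.Thread(target=worker, args=("func 1", 1, inc_turn, dec_turn))
    t2 = threading.Thread(target=worker, args=("func 2", -1, dec_turn, inc_turn))
    t1.start()
    t2.start()
    inc_turn.set()  # let the incrementing thread go first
    t1.join()
    t2.join()
    return log


for line in alternate():
    print(line)
```

Because a thread only proceeds once the other one has set its event, no `time.sleep` is needed for correctness; a sleep could still be added after `other_turn.set()` purely to slow the output down.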

electricnapkin
0

For starters, `time.sleep` is not exactly accurate. And depending on the Python implementation you're using (most likely CPython), multithreading might not work quite the way you expect. These two factors allow the initially correct timing of your threads to drift out of sync within a fairly short time.
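
To get a feel for the inaccuracy, here is a small sketch that times a handful of `time.sleep` calls with `time.perf_counter`. The helper name `measure_sleep` and the 10 ms interval are just my choices for illustration; the actual overshoot depends on the OS and on system load, so I'm deliberately not quoting any numbers:

```python
import time


def measure_sleep(requested=0.01, samples=5):
    """Return how long each time.sleep(requested) call actually took."""
    elapsed = []
    for _ in range(samples):
        start = time.perf_counter()
        time.sleep(requested)
        elapsed.append(time.perf_counter() - start)
    return elapsed


for took in measure_sleep():
    # The difference is typically small but non-zero, and it adds up over many iterations.
    print(f"requested 0.010000 s, slept {took:.6f} s")
```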

The solution to this problem is to enforce alternating operations on the variable by the two threads via two locks:

import time
import threading

var = 0


def runner(op, waitfor, release):
    global var

    while True:
        try:
            # wait for resource to free up
            waitfor.acquire()

            # operation
            var = op(var)
            print(f"var={var}")
        finally:
            # notify other thread
            release.release()

        time.sleep(0.1)


# init locks for thread-synchronization
lock_a = threading.Lock()
lock_b = threading.Lock()
lock_a.acquire()
lock_b.acquire()

# create and start threads (they'll wait for their lock to be freed)
thread_a = threading.Thread(target=runner, args=(lambda v: v - 1, lock_a, lock_b))
thread_b = threading.Thread(target=runner, args=(lambda v: v + 1, lock_b, lock_a))
thread_a.start()
thread_b.start()

# let thread_b start the first operation by releasing the lock
lock_b.release()

In the above code, each thread has a lock that the other thread releases to notify it that it may now use the resource. In this way the threads hand control over the global variable to each other.
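
If the hand-off with two locks feels hard to follow, the same idea can probably also be written with a single `threading.Condition` and an explicit "whose turn is it" value. This is only a sketch of that alternative, not a replacement for the code above; `run_alternating`, `shared` and the turn names `"inc"`/`"dec"` are made up for the example, and the loop is bounded so that it terminates:

```python
import threading


def run_alternating(rounds=3):
    """Alternate +1 / -1 on a shared value, `rounds` times per thread."""
    cond = threading.Condition()
    shared = {"var": 0, "turn": "inc"}  # "inc" goes first
    seen = []

    def runner(name, other, op):
        for _ in range(rounds):
            with cond:
                # Sleep until it is this thread's turn (no busy polling).
                cond.wait_for(lambda: shared["turn"] == name)
                shared["var"] = op(shared["var"])
                seen.append(shared["var"])
                shared["turn"] = other  # pass the turn on
                cond.notify_all()       # wake the waiting thread

    threads = [
        threading.Thread(target=runner, args=("inc", "dec", lambda v: v + 1)),
        threading.Thread(target=runner, args=("dec", "inc", lambda v: v - 1)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return seen


print(run_alternating())
```

The condition's internal lock protects both the variable and the turn marker, so the state check and the modification happen atomically with respect to the other thread.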