I'm fairly new to threading, and I am trying to adapt some code to make it threadsafe.

My problem is that multiple threads access multiple methods of a single instance at the same time. Because the methods both use and change the instance's state, bad things obviously happen...

I want each instance to be accessed by only one thread at a time (i.e. only one method of that instance runs at any time, and any other threads wait).

As a note, the threads are spawned from a Dash app, so I include that keyword in case others end up here, but I don't think it makes a difference to the problem overall.

I'll start with a very simple case and then add in the complexity that I want to achieve (although it may not affect the solution).

Simple Case

Let's say I have a class like this:

import threading
import time


class BaseClass:
    pass


class Simple(BaseClass):
    def __init__(self, a=1):
        self.a = a
        self.b = None

    def print(self):
        print(f'a = {self.a}, b = {self.b}')
        time.sleep(1)
        self.b = 'b_set'

I want to be able to run the following tests:

if __name__ == '__main__':
    test1 = Simple(a=1)
    test2 = Simple(a=2)

    t1 = threading.Thread(target=test1.print)
    t2 = threading.Thread(target=test2.print)
    t3 = threading.Thread(target=test1.print)  # Another call to test1.print

    threads = (t1, t2, t3)

    print('Starting all threads')
    start_time = time.time()
    for t in threads:
        t.start()

    for t in threads:
        t.join()

    print(f'Aiming for 2s of execution, took {time.time()-start_time:.1f}s')

I am aiming to see this output:

Starting all threads
a = 1, b = None
a = 2, b = None
a = 1, b = b_set
Aiming for 2s of execution, took 2.0s

but what I actually see is:

Starting all threads
a = 1, b = None
a = 2, b = None
a = 1, b = None
Aiming for 2s of execution, took 1.0s

Crucially, the execution time should be 2.0s (it is currently 1.0s): I want t1 and t2 to run concurrently and t3 to run only afterwards, because t3 must not run at the same time as t1 (both call a method of test1).

I want to modify BaseClass so that this works.

I found this (Synchronizing All Methods in an Object), which I believe is a solution. However, it is written for Python 2, and I believe a much cleaner solution is possible in Python 3 based on this (How to synchronize all methods in class python3?). The second solution is very close: it does make the Simple class threadsafe, but it also prevents t1 and t2 from running at the same time, because it applies to the whole subclass rather than to each instance of the subclass. I don't understand the ins and outs of __init_subclass__ well enough to know how to modify this behaviour in a nice way.
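To make the per-instance idea concrete, here is a minimal sketch of the behaviour I am after, not a verified solution. It assumes each instance can carry its own threading.RLock (created in __new__, so subclasses do not have to call super().__init__()) and that __init_subclass__ wraps every plain method defined on a subclass. The names _synchronized, _lock, run_demo and the delay parameter are hypothetical and do not come from either linked answer.

```python
import functools
import inspect
import threading
import time


def _synchronized(method):
    """Wrap a method so it runs while holding the instance's own lock."""
    @functools.wraps(method)
    def wrapper(self, *args, **kwargs):
        with self._lock:  # per-instance lock, so other instances are unaffected
            return method(self, *args, **kwargs)
    return wrapper


class BaseClass:
    def __new__(cls, *args, **kwargs):
        instance = super().__new__(cls)
        # RLock (not Lock) so a method may call other methods of self.
        instance._lock = threading.RLock()
        return instance

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        # Wrap plain functions defined directly on the subclass; skip dunders.
        for name, attr in list(vars(cls).items()):
            if inspect.isfunction(attr) and not name.startswith('__'):
                setattr(cls, name, _synchronized(attr))


class Simple(BaseClass):
    def __init__(self, a=1, delay=1.0):
        self.a = a
        self.b = None
        self.delay = delay  # hypothetical knob so the demo can run faster

    def print(self):
        seen = self.b
        print(f'a = {self.a}, b = {seen}')
        time.sleep(self.delay)
        self.b = 'b_set'
        return seen


def run_demo(delay=1.0):
    """Run two calls on test1 and one on test2; return (elapsed, observations)."""
    test1, test2 = Simple(1, delay), Simple(2, delay)
    seen = []

    def call(obj):
        seen.append((obj.a, obj.print()))

    threads = [threading.Thread(target=call, args=(obj,))
               for obj in (test1, test2, test1)]
    start = time.monotonic()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.monotonic() - start, seen


if __name__ == '__main__':
    elapsed, _ = run_demo()
    print(f'Aiming for 2s of execution, took {elapsed:.1f}s')
```

One limitation of this sketch: __init_subclass__ only runs for subclasses, so methods defined on BaseClass itself (as in the more complex example below) would not be wrapped without extra work, and I have not checked how it interacts with abc.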

Any help would be greatly appreciated!

More complex examples to further illustrate intention and more difficult cases

import abc
import threading
import time


class BaseClass(abc.ABC):
    def __init__(self, a=1):
        self.a = a

    def print(self):
        print(f'a = {self.a}')
        time.sleep(1)

    def print_get(self):
        print(f'get_a = {self.get_a()}')
        time.sleep(1)

    def get_a(self):
        """An example of some methods having to call other methods of self"""
        return self.a

    @abc.abstractmethod
    def reentrant_print(self, i=0):
        """Something which has to reenter itself (or other methods of self)"""
        pass


class SubClass(BaseClass):
    def reentrant_print(self, i=0):
        """Should print four times in a row (i = 0 to 3)"""
        print(f'a = {self.a}: i = {i}\n')
        time.sleep(0.5)
        if i < 3:
            self.reentrant_print(i+1)


if __name__ == '__main__':
    test1 = SubClass(a=1)
    test2 = SubClass(a=2)

    methods = ('print', 'print_get', 'reentrant_print')

    for method in methods:
        print(f'\n\nStarting tests for method = {method}')
        t1 = threading.Thread(target=getattr(test1, method))
        t2 = threading.Thread(target=getattr(test2, method))
        t3 = threading.Thread(target=getattr(test1, method))  # Another call to test1
        t4 = threading.Thread(target=test1.print)  # Another call to test1 on a different method

        threads = (t1, t2, t3, t4)

        print('Starting all threads')
        start_time = time.time()
        for t in threads:
            t.start()

        for t in threads:
            t.join()

        print(f'All threads finished in {time.time()-start_time:.1f}s')

I am aiming for the output to be something like:

Starting tests for method = print
Starting all threads
a = 1
a = 2
a = 1
a = 1
All threads finished in 3.0s     <<<<< Note: Should be 3.0s because it should run t1 and t2, then t3, then t4


Starting tests for method = print_get
Starting all threads
get_a = 1
get_a = 2
get_a = 1
a = 1
All threads finished in 3.0s      <<<<< Note: Should be 3.0s because it should run t1 and t2, then t3, then t4 (or could allow t4 to run at same time as t1 and t2)


Starting tests for method = reentrant_print
Starting all threads
a = 1: i = 0
a = 2: i = 0
a = 1: i = 1
a = 2: i = 1
a = 1: i = 2
a = 2: i = 2
a = 1: i = 3
a = 2: i = 3
a = 1: i = 0     <<< Note: From ~here it should be only t3 running
a = 1: i = 1
a = 1: i = 2
a = 1: i = 3

All threads finished in 5.0s    <<<<< Note: Should be 5.0s because it should run t1 and t2 (2.0s: four prints at 0.5s each), then t3 (2.0s), then t4 (1.0s)

Even just shedding some light on the Simple Case would be very helpful, although I include the more complex cases in case it is easy to extend behaviour if you know what you're doing!
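On the re-entrant cases above (print_get calling get_a, and reentrant_print calling itself): as far as I understand, whatever lock guards an instance would need to be re-entrant, otherwise a thread would block on a lock it already holds. A minimal sketch of the difference between threading.RLock and threading.Lock, with hypothetical variable names:

```python
import threading

# An RLock can be acquired again by the thread that already holds it.
rlock = threading.RLock()
nested_ok = False
with rlock:
    with rlock:  # same thread re-enters without blocking
        nested_ok = True

# A plain Lock cannot: a second acquire by the same thread fails
# (or would block forever if blocking=True).
lock = threading.Lock()
with lock:
    second_try = lock.acquire(blocking=False)

print(f'RLock nested acquire succeeded: {nested_ok}')
print(f'Lock second acquire succeeded: {second_try}')
```

So a method that calls self.get_a() while holding a plain Lock on the same instance would deadlock, whereas with an RLock it would simply continue.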

Tim Child
  • Are you aware that global variables are discouraged in Dash? It's due to the backend being stateless, which is a key design principle of Dash. I ask because I suspect that even if you get your examples to work, your classes might not be applicable in a Dash context. https://dash.plotly.com/sharing-data-between-callbacks – emher Jan 30 '21 at 13:09
  • @emher, thank you for the comment. I had read something about that a while ago, and I think in general you are absolutely right, but I think I have an unusual use case which means that I can get away with a stateful backend. I am basically using Dash as a way to interact with and analyse data for my research, so I only ever need a state for one user, myself. For others reaching this question/comment, you are right though. – Tim Child Feb 02 '21 at 04:56
