Thread blocks in an RLock

Question

I have this implementation:

import threading


def mlock(f):
    '''Method lock. Uses a class lock to execute the method'''
    def wrapper(self, *args, **kwargs):
        with self._lock:
            res = f(self, *args, **kwargs)
        return res
    return wrapper


class Lockable(object):

    def __init__(self):
        self._lock = threading.RLock()

Which I use in several places, for example:

class Fifo(Lockable):

    '''Implementation of a Fifo. It will grow until the given maxsize; then it will drop the head to add new elements'''

    def __init__(self, maxsize, name='FIFO', data=None, inserted=0, dropped=0):
        self.maxsize = maxsize
        self.name = name
        self.inserted = inserted
        self.dropped = dropped
        self._fifo = []
        self._cnt = None
        Lockable.__init__(self)
        if data:
            for d in data:
                self.put(d)

    @mlock
    def __len__(self):
        length = len(self._fifo)
        return length

    ...

The application is quite complex, but it works well. Just to make sure, I have been doing stress tests of the running service, and I find that it sometimes (rarely) deadlocks in the mlock. I assume another thread is holding the lock and not releasing it. How can I debug this? Please note that:

it is very difficult to reproduce: I need hours of testing to deadlock
the application is running in the background
once it deadlocks, I cannot interact with it anymore

I would like to know:

what thread is holding the lock?
why is it not being released? I am using a context manager to acquire the lock, so it should always be released. Where is the bug?!

What options do I have to further debug this?

I have been checking whether there is any way of knowing which thread is holding an RLock, but it seems there is no API for this.

score 1 · Answer 1 · edited May 23 '17 at 11:45

I don't think there's an easy solution for this, but it can be done with some work.

Personally, I've found the following useful (albeit in C++).

Start by creating a Lockable base that tracks threads' interactions with it. A Lockable object will use an additional (non-recursive) lock to protect a dictionary mapping thread ids to their interactions with it:

When a thread tries to lock, it (locks and) creates an entry.
When it acquires the lock, it (locks and) modifies the entry.
When it releases the lock, it (locks and) removes the entry.

Additionally, a Lockable object will have a low-priority thread, waking up very rarely (once every several minutes), and seeing if there's indication of a deadlock (approximated by the event that a thread has been holding the lock for a long time, while at least one other thread has waited for it).
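A rough sketch of that periodic check in Python 3 might look like the following. The entry layout (thread id mapped to a `(state, timestamp)` pair), the names `find_suspect` and `start_watchdog`, and the `report` callback are all my own assumptions for illustration. Python threads have no priorities, so "low priority" is approximated here by a daemon thread that sleeps almost all the time.

```python
import threading
import time

# Assumed entry layout: thread id -> (state, timestamp), where state is
# "waiting" or "holding" and timestamp comes from time.monotonic().


def find_suspect(entries, now, max_hold):
    """Return the id of a thread that looks stuck, or None.

    A holder is suspicious when it has held the lock for longer than
    max_hold seconds while at least one thread is waiting for it.
    """
    if not any(state == "waiting" for state, _ in entries.values()):
        return None
    for tid, (state, since) in entries.items():
        if state == "holding" and now - since > max_hold:
            return tid
    return None


def start_watchdog(entries, entries_lock, report, interval=300.0, max_hold=60.0):
    """Start a daemon thread that checks `entries` every `interval` seconds.

    `report(suspect_tid, snapshot)` is a hypothetical callback, e.g. one
    that writes the snapshot to a log file. Returns an Event that stops
    the watchdog when set.
    """
    stop = threading.Event()

    def run():
        # Event.wait returns False on timeout and True once stop is set.
        while not stop.wait(interval):
            with entries_lock:
                snapshot = dict(entries)
            suspect = find_suspect(snapshot, time.monotonic(), max_hold)
            if suspect is not None:
                report(suspect, snapshot)

    threading.Thread(target=run, name="lock-watchdog", daemon=True).start()
    return stop
```

Since the stuck service cannot be interacted with, `report` should write somewhere you can read afterwards, such as a log file.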

The entry for a thread should therefore include:

the operation's time
the stacktrace info leading to the operation.
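Translated to Python 3, the bookkeeping described above could be sketched roughly as follows. `TrackedRLock` and its `snapshot` method are hypothetical names of mine, not an existing API, and the recursion-depth counter is an addition I am assuming is needed so that a re-entrant acquire does not overwrite the "holding" entry.

```python
import threading
import time
import traceback


class TrackedRLock(object):
    """Wraps an RLock and records which threads wait for or hold it."""

    def __init__(self):
        self._lock = threading.RLock()
        self._meta = threading.Lock()  # non-recursive, guards the dicts below
        self._entries = {}             # thread id -> (state, time, stack)
        self._depth = {}               # thread id -> recursion depth

    def _record(self, tid, state):
        # The captured stack also includes this helper and acquire() itself.
        stack = traceback.format_stack()
        with self._meta:
            self._entries[tid] = (state, time.monotonic(), stack)

    def acquire(self):
        tid = threading.get_ident()
        with self._meta:
            depth = self._depth.get(tid, 0)
        if depth == 0:
            self._record(tid, "waiting")   # about to block (possibly)
        self._lock.acquire()
        with self._meta:
            self._depth[tid] = depth + 1
        if depth == 0:
            self._record(tid, "holding")   # we now own the lock
        return True

    def release(self):
        tid = threading.get_ident()
        with self._meta:
            depth = self._depth[tid] - 1
            if depth == 0:
                # Outermost release: this thread no longer interacts with us.
                del self._depth[tid]
                del self._entries[tid]
            else:
                self._depth[tid] = depth
        self._lock.release()

    def __enter__(self):
        self.acquire()
        return self

    def __exit__(self, exc_type, exc, tb):
        self.release()

    def snapshot(self):
        """Return a copy of the current entries for inspection."""
        with self._meta:
            return dict(self._entries)
```

Because it is a context manager, assigning `self._lock = TrackedRLock()` in `Lockable.__init__` should leave `mlock` unchanged.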

The problem is that this can alter the relative timing of threads, which might cause your program to go into different execution paths than it normally does.

Here you need to get creative. You might need to also induce (random) time lapses in these (and possibly other) operations.
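As one possible way to induce such time lapses, here is a small Python sketch that wraps a lock and sleeps for a random short interval around acquire and release. `JitteryLock`, `max_delay` and `rng` are names I made up for illustration, and whether this makes your particular deadlock more likely is something you would have to find out by experiment.

```python
import random
import threading
import time


class JitteryLock(object):
    """Context manager that adds random short delays around a lock."""

    def __init__(self, lock, max_delay=0.005, rng=None):
        self._lock = lock
        self._max_delay = max_delay
        # A private Random instance keeps the jitter reproducible if seeded.
        self._rng = rng or random.Random()

    def _jitter(self):
        time.sleep(self._rng.uniform(0.0, self._max_delay))

    def __enter__(self):
        self._jitter()            # perturb the order in which threads arrive
        self._lock.acquire()
        return self

    def __exit__(self, exc_type, exc, tb):
        self._jitter()            # hold the lock a bit longer to widen races
        self._lock.release()
```

Usage would be along the lines of `self._lock = JitteryLock(threading.RLock())` in the stress-test build only, keeping the delays small enough that the service still behaves realistically.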
