2

If a deadlock between Python threads is suspected at run time, is there any way to resolve it without killing the entire process?

For example, if a few threads take far longer than they should, a resource manager might suspect that some of them are deadlocked. While of course it should be debugged and fixed in the code in the future, is there a clean solution that can be used immediately (at run time), perhaps to kill specific threads so that the others can resume?

Edit: I was thinking to add some "deadlock detection" loop (in its own thread) that sleeps for a bit, then checks all running threads, and if a few of them look suspiciously slow, it kills the least important one among them. At what point the thread is suspected of deadlocking, and which is the least important of them, is of course defined by the deadlock detection loop programmer.

Clearly, it won't catch all problems (most obviously if the deadlock detection thread itself is deadlocked). The idea is not to find a mathematically perfect solution (which is, of course, not to write code that can deadlock). Rather, I wanted to partially solve the problem in some realistic cases.
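To make the idea concrete, here is a minimal sketch of the detection half only, under stated assumptions: the `Watchdog` class, its `heartbeat()` method, and the timeout value are hypothetical names invented for illustration, and workers are assumed to call `heartbeat()` whenever they make progress. It only reports suspects; it does not attempt to kill anything.

```python
import threading
import time


class Watchdog:
    """Hypothetical helper: remembers when each thread last reported progress."""

    def __init__(self, timeout):
        self.timeout = timeout          # seconds of silence before a thread is suspect
        self._last_seen = {}            # thread ident -> time of last heartbeat
        self._guard = threading.Lock()  # protects _last_seen

    def heartbeat(self):
        """Called by a worker thread each time it makes progress."""
        with self._guard:
            self._last_seen[threading.get_ident()] = time.monotonic()

    def suspects(self):
        """Return names of live threads that have been silent for too long."""
        now = time.monotonic()
        alive = {t.ident: t for t in threading.enumerate()}
        with self._guard:
            return [
                alive[ident].name
                for ident, seen in self._last_seen.items()
                if ident in alive and now - seen > self.timeout
            ]
```

A detection loop in its own thread would then sleep, call `suspects()`, and decide what to do with the result; which thread counts as "least important" is left to whoever writes that loop.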

max
  • 1
    Write code that can't be deadlocked. – Fredrik Sep 18 '16 at 20:12
  • @Fredrik of course, I completely agree. But it's hard to achieve that quickly in a large codebase written by multiple people that is suspected to deadlock sometimes. It may be worthwhile to combine reading/fixing/redesigning the code over time, with some immediate automatic deadlock resolution that may improve the behavior of the application. – max Sep 18 '16 at 20:15
  • What is your strategy if say the main thread is deadlocked? – Natecat Sep 18 '16 at 20:18
  • @Natecat I tried to answer your question in my edit. – max Sep 18 '16 at 21:13

2 Answers

2

You could install this code snippet ahead of time, but once the program is stuck during execution there is not much you can do about it.
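The linked snippet is not reproduced here. As one possible illustration of the "ahead of time" idea, the standard library's `faulthandler` module can be set up at startup to dump every thread's stack, either after a timeout or on a signal. The function name `install_stack_dumpers` and the default timeout are assumptions made for this sketch.

```python
import faulthandler
import signal
import sys


def install_stack_dumpers(timeout_seconds=60.0, stream=sys.stderr):
    """Arrange for all thread stacks to be written to `stream` (hypothetical helper).

    `stream` must be a real file with a file descriptor, because faulthandler
    writes to it at the OS level.
    """
    # Dump the tracebacks of all threads every `timeout_seconds` seconds
    # until faulthandler.cancel_dump_traceback_later() is called.
    faulthandler.dump_traceback_later(timeout_seconds, repeat=True, file=stream)

    # Where available (not on Windows), also dump on demand: kill -USR1 <pid>
    if hasattr(faulthandler, "register") and hasattr(signal, "SIGUSR1"):
        faulthandler.register(signal.SIGUSR1, file=stream, all_threads=True)
```

This only shows where each thread is stuck; it does nothing to release the locks involved.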

Tools like WinDbg or strace might help, but since Python is an interpreted language I doubt they would be practical to use.

Bharel
  • This is awesome, better than I expected possible as far as detection. But can anything be done to "un-deadlock" the threads, say by killing them? – max Sep 18 '16 at 21:15
  • @max You may kill the threads by using undocumented and external API calls. **It has a high chance to make the memory go "Kaboom" and corrupt the whole stack**. See [this](http://stackoverflow.com/a/323993/1658617). You may as well just write `2/0` all over your code and `exec()` it. Will be safer. – Bharel Sep 18 '16 at 21:23
  • According to that link, it doesn't seem it can corrupt the stack, it just will leave a bunch of uncollectable garbage on the heap. The bigger problem is killing a thread won't automatically release the locks it holds (whether GIL or locks on individual resources acquired through python `threading` API). So the entire point of killing the thread is defeated in my case. – max Sep 18 '16 at 23:36
0

There are two approaches to deadlock detection. The first is static analysis of the code. It is the preferred approach, but it is limited by the halting problem: potential deadlocks can be found only in certain cases, not in general. The second approach is tracking locks at run time using a wait-for graph. It is the most reliable, but also more expensive, because parallelism is interrupted while the graph is being checked.
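As a rough illustration of the wait-for graph idea (a sketch only, not the implementation of any particular library): with plain mutexes each thread waits on at most one lock, and each lock has at most one holder, so the graph can be stored as a mapping from a waiting thread to the thread it waits on. A deadlock is then a cycle in that mapping. The function name `find_cycle` and the string thread labels are made up for this example.

```python
def find_cycle(wait_for):
    """Return the threads forming a cycle in the wait-for graph, or None.

    `wait_for` maps each blocked thread to the thread holding the lock it wants.
    """
    for start in wait_for:
        path = []      # threads visited on this walk, in order
        seen = set()   # same threads, for fast membership tests
        node = start
        # Follow "waits for" edges until we leave the graph or revisit a thread.
        while node in wait_for and node not in seen:
            seen.add(node)
            path.append(node)
            node = wait_for[node]
        if node in seen:
            # We came back to a thread on the current walk: that is a cycle.
            return path[path.index(node):]
    return None


# Thread A waits for B while B waits for A: a deadlock.
print(find_cycle({"A": "B", "B": "A"}))   # ['A', 'B']
# A waits for B, B waits for C, C is running: no deadlock.
print(find_cycle({"A": "B", "B": "C"}))   # None
```

A lock that checks before blocking would add the prospective edge, run a check like this, and refuse to wait if a cycle appears.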

For the second approach I wrote a library that checks the graph before a lock is taken. If taking the lock would result in a deadlock, an exception is raised.

You can install it with pip:

$ pip install locklib

And use it as a usual lock from the standard library:

from threading import Thread
from locklib import SmartLock


lock_1 = SmartLock()
lock_2 = SmartLock()

def function_1():
  while True:
    with lock_1:
      with lock_2:
        pass

def function_2():
  while True:
    with lock_2:
      with lock_1:
        pass

thread_1 = Thread(target=function_1)
thread_2 = Thread(target=function_2)
thread_1.start()
thread_2.start()

This code contains a potential deadlock, but the thread that locks second raises an exception instead of blocking, so deadlock is impossible in this case.

Evgeniy Blinov