Debugging a Python application that just sort of "hangs"

Question

I have an event-driven application, written in Python. After a while (usually >1 week) it appears to just stop responding to events. When this happens, I just Ctrl-C and re-run, and all is well again. However, it's kind of annoying that this keeps happening and I have no idea what's causing it. Is there a way to run my application so that, when this occurs and it is no longer accepting connections, I can drop into a debugger and see what it's doing and why it's not taking connections?

I've used pdb before, but the way I've used it (if condition: pdb.set_trace()) doesn't really apply here, because I have no idea what it's doing in the code when it fails. My ideal situation would be instead of Ctrl-C maybe I hit Ctrl-somethingelse and that causes it to stop and drop into the debugger. Is such a thing easily done?

You could have a signal handler set a flag, then do `if flag: pdb.set_trace()`. Whether this would work depends on where the freeze is. — Tom Hunt, Feb 20 '15 at 20:46
How big is the application? How many lines of code we talking about? — Vor, Feb 20 '15 at 21:04
Pretty small -- about 800 lines of my code, sitting on top of a 600 line websockets library. TomHunt: thanks, I will try that! — Mala, Feb 20 '15 at 23:31
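
For reference, the flag idea from the first comment might look roughly like the sketch below. The names `debug_requested`, `request_debugger`, `maybe_break` and `install` are made up for illustration, and the choice of SIGQUIT (what Ctrl-\ sends on most Unix terminals) is an assumption, not something the comment specifies.

```python
import pdb
import signal

debug_requested = False  # hypothetical module-level flag


def request_debugger(signum, frame):
    """Signal handler: only record the request, do no real work here."""
    global debug_requested
    debug_requested = True


def maybe_break(enter_debugger=pdb.set_trace):
    """Call from the event loop; enters the debugger if the flag was set."""
    global debug_requested
    if debug_requested:
        debug_requested = False
        enter_debugger()
        return True
    return False


def install():
    """Bind the handler to SIGQUIT (assumed choice; Unix only, main thread only)."""
    if hasattr(signal, "SIGQUIT"):
        signal.signal(signal.SIGQUIT, request_debugger)
```

The event loop would call `install()` once at startup and `maybe_break()` on each iteration. As the comment says, this only helps if the loop is still iterating; if the code is stuck somewhere that never reaches `maybe_break()`, the flag is set but never checked.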

score 3 · Accepted Answer · edited May 23 '17 at 12:06

Triggering pdb in your case is probably not simple. However, whenever I need to debug such hangs, I inspect a "snapshot" of tracebacks of all the threads in the process, using the dumpstacks() function.
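
The dumpstacks() function itself is not reproduced in this answer. A minimal sketch of what such a function might look like, assuming it is built on `sys._current_frames()` and the `traceback` module, is:

```python
import sys
import threading
import traceback


def dumpstacks():
    """Return the current stack trace of every thread as one string."""
    # Map thread ids to names so the report is easier to read.
    names = {t.ident: t.name for t in threading.enumerate()}
    lines = []
    # sys._current_frames() maps each thread id to that thread's topmost frame.
    for thread_id, frame in sys._current_frames().items():
        lines.append("# Thread: %s (%d)" % (names.get(thread_id, "?"), thread_id))
        lines.extend(entry.rstrip() for entry in traceback.format_stack(frame))
    return "\n".join(lines)
```

Printing or logging the returned string shows, for each thread, the file, line and function it was executing at the moment of the call, which is usually enough to see where a hung process is waiting.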

You can either use a timer to call it periodically, print the output to a log file, and refer to that log when you notice the hang, or harness some RPC mechanism (e.g. signals) to trigger the function call in your process on demand. I usually do the latter, because the processes in my system already listen for such RPC requests (using rpyc).
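
Both triggers could be wired up roughly as follows. This is a sketch under assumptions: it uses a plain Unix signal rather than rpyc, `format_all_stacks` is a stand-in for dumpstacks(), and the logger name, the choice of SIGUSR1 and the 60-second default are all illustrative.

```python
import logging
import signal
import sys
import threading
import traceback

log = logging.getLogger("hangdebug")  # hypothetical logger name


def format_all_stacks():
    """Stand-in for dumpstacks(): format the stack of every thread."""
    parts = []
    for thread_id, frame in sys._current_frames().items():
        parts.append("# Thread id: %d" % thread_id)
        parts.append("".join(traceback.format_stack(frame)))
    return "\n".join(parts)


def on_dump_signal(signum, frame):
    """Signal handler: log a snapshot when the process receives the signal."""
    log.warning("stack snapshot (signal %s):\n%s", signum, format_all_stacks())


def install_signal_trigger():
    """Register the handler on SIGUSR1 (Unix only); returns True if installed."""
    if not hasattr(signal, "SIGUSR1"):
        return False
    signal.signal(signal.SIGUSR1, on_dump_signal)  # must run in the main thread
    return True


def start_periodic_dump(interval_seconds=60.0):
    """Log a snapshot every interval_seconds from a daemon thread."""
    stop = threading.Event()

    def loop():
        # Event.wait() returns False on timeout and True once stop is set.
        while not stop.wait(interval_seconds):
            log.info("periodic stack snapshot:\n%s", format_all_stacks())

    threading.Thread(target=loop, name="stack-dumper", daemon=True).start()
    return stop  # call stop.set() to end the loop
```

With the signal trigger installed, `kill -USR1 <pid>` from another terminal requests a snapshot. Python runs signal handlers in the main thread once the interpreter regains control, so if the main thread is blocked inside a call that never returns, the handler may not fire; the periodic variant runs in its own thread and can still report in that case.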
