3

I'm having an issue with some of my worker threads. I've added a catchall exception statement in the thread's run method like so:

    try:
        # Runs the worker process, which is a state machine
        while self._set_exitcode is None:
            assert self._state in Worker.STATES
            state_methodname = "_state_%s" % self._state
            assert hasattr(self, state_methodname)
            state_method = getattr(self, state_methodname)
            self._state = state_method()  # execute method for current state

        self._stop_heartbeat()
        sys.exit(self._set_exitcode)
    except:
        self.log.debug(sys.exc_info())

I read that a bare `except:` is the de facto way to catch everything that may be causing an issue, as opposed to `except Exception, e`. I've found some great little bugs thanks to this method, but my problem is that the workers are still dying, and I'm not sure how to record more of what's going on or how to troubleshoot further.

Any thoughts would be greatly appreciated.

Thanks!

Jon Cage
deecodameeko
  • What do you mean by 'instead of using Exception'? – Jon Cage May 04 '11 at 15:34
  • @Jon - He means instead of specifically catching Exception-derived objects. With a bare except, you will also catch BaseException objects not derived from Exception (example - KeyboardInterrupt). You will also catch exceptions raised with non-exception objects (yuck). – Jeremy Brown May 04 '11 at 15:47
  • Ah, fair enough. I missed the `, e` on the end of the sentence when I first read it. – Jon Cage May 04 '11 at 15:52
  • What makes you think a thread is dying? Maybe it is just blocked, or it exited normally? – SanityIO May 04 '11 at 17:08
  • I've taken over a position where we have a rendering farm in which all the workers are state machines and should never die, but sit idle waiting for an admin process to assign them more work. @Turnaev – deecodameeko May 04 '11 at 20:24
  • So they're not taking up more work when you send it to them? – Jon Cage May 04 '11 at 20:26
  • It was just a consequence of dealing with prototype code that was still buggy. I ended up simply catching all exceptions and adding specific handlers for those which could cause issues. – deecodameeko Apr 10 '12 at 19:58
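
The distinction drawn in the comments, between a bare `except:` and catching `Exception`, can be checked directly. The following is a minimal sketch, not the asker's code: it uses Python 3 syntax (the post itself is Python 2), and the names `classify` and `run_and_swallow` are made up for illustration. It shows that `sys.exit()` raises `SystemExit`, which a bare `except:` catches even though it is not derived from `Exception`.

```python
import sys


def classify(exc_type):
    """Report which kinds of handler would catch an exception of this type."""
    return {
        "except Exception": issubclass(exc_type, Exception),
        "bare except": issubclass(exc_type, BaseException),
    }


def run_and_swallow():
    """Mimic the pattern in the question: sys.exit() inside a bare except."""
    try:
        sys.exit(3)
    except:  # deliberately bare, as in the question
        # SystemExit is caught here instead of propagating.
        return sys.exc_info()[0]


caught_type = run_and_swallow()
print(caught_type.__name__)
print(classify(KeyboardInterrupt))
```

So with the `sys.exit(...)` call inside the `try` block, a normal exit of the state machine would also pass through the `except:` branch and be logged like a failure.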

2 Answers

11

You could try examining the execution trace of your program using the trace module. For example:

% python -m trace -c -t -C ./coverage test_exit.py

Source:

import sys
import threading

class Worker(object):
    def run(self):
        try:
            sys.exit(1)
        except:
            print sys.exc_info()

threading.Thread(target=Worker().run).start()

It will dump out each line as it is executed, and you should get a coverage report in the coverage directory:

...
threading.py(482):         try:
threading.py(483):             if self.__target:
threading.py(484):                 self.__target(*self.__args, **self.__kwargs)
 --- modulename: test_exit, funcname: run
test_exit.py(7):         try:
test_exit.py(8):             sys.exit(1)
test_exit.py(9):         except:
test_exit.py(10):             print sys.exc_info()
(<type 'exceptions.SystemExit'>, SystemExit(1,), <traceback object at 0x7f23098822d8>)
threading.py(488):             del self.__target, self.__args, self.__kwargs
...
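
The trace above shows the bare `except:` catching `SystemExit`. One possible way to separate a deliberate exit from a real crash, and to record a full traceback rather than the bare `sys.exc_info()` tuple, is sketched below. This is only an illustration under assumptions: it uses Python 3 syntax, and `SketchWorker`, `steps`, `exit_code`, `failure` and `boom` are hypothetical names, not part of the original worker.

```python
import logging
import sys
import threading

log = logging.getLogger("worker_sketch")


class SketchWorker:
    """Hypothetical stand-in for the state-machine worker in the question."""

    def __init__(self, steps):
        self._steps = list(steps)  # callables to run in order
        self.exit_code = None      # set when the worker exits on purpose
        self.failure = None        # set when the worker crashes

    def run(self):
        try:
            for step in self._steps:
                step()
            sys.exit(0)
        except SystemExit as exc:
            # A deliberate exit: record the code instead of treating it as a bug.
            self.exit_code = exc.code
        except BaseException as exc:
            # Anything else: keep the exception and log the full traceback.
            self.failure = exc
            log.exception("worker died unexpectedly")


def boom():
    raise ValueError("bad frame")  # made-up failure for the demonstration


crashed = SketchWorker([boom])
clean = SketchWorker([])
for worker in (crashed, clean):
    thread = threading.Thread(target=worker.run)
    thread.start()
    thread.join()
```

`log.exception` logs at ERROR level and appends the traceback of the exception currently being handled, which usually says more about where a worker died than the type and value alone.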
samplebias
1

What makes you think that some threads are exiting prematurely? Is it possible they're exiting cleanly but your logging method isn't thread-safe?
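
If the workers log through the standard `logging` module, its handlers serialize writes with a lock, so that is one way to rule out interleaved or lost messages. The sketch below is an assumption about how this could be checked, not a description of the asker's setup: it uses Python 3 syntax, and the logger name `farm_sketch`, the `work` function and the thread names are invented. It tags every record with the thread name and records a final message when a thread returns normally.

```python
import io
import logging
import threading

# Shared in-memory stream so the result can be inspected afterwards.
buffer = io.StringIO()
handler = logging.StreamHandler(buffer)
handler.setFormatter(logging.Formatter("%(threadName)s %(message)s"))

log = logging.getLogger("farm_sketch")
log.setLevel(logging.DEBUG)
log.addHandler(handler)
log.propagate = False  # keep the demo output out of the root logger


def work(step_count):
    for i in range(step_count):
        log.debug("step %d", i)
    log.debug("exiting cleanly")  # marks a normal return from the thread


threads = [
    threading.Thread(target=work, args=(50,), name="worker-%d" % k)
    for k in range(4)
]
for t in threads:
    t.start()
for t in threads:
    t.join()

lines = buffer.getvalue().splitlines()
clean_exits = [line for line in lines if line.endswith("exiting cleanly")]
```

A thread whose last logged line is the "exiting cleanly" marker returned normally; one that stops logging without it either crashed or is blocked.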

Jon Cage