Proper care and safety when dealing with traceback objects from sys.exc_info()

Question

I'm aware that the sys.exc_info documentation says to take care when dealing with traceback objects, but I'm still uncertain how safe or unsafe some cases are. Additionally, the documentation says "Warning: Don't do this!", followed immediately by "Note: Actually, it's OK", which confuses me further.

In any case, the docs and Alex Martelli's answer to "Why is there a need to explicitly delete the sys.exc_info() traceback in Python?" seem to imply that the problem is caused only by local variables that have the traceback value assigned to them.

This leaves me with a few questions:

What, exactly, does "local variable" mean in this context? I'm struggling for terminology, but: does this mean only variables created in the function, or also variables created by the function parameters? What about other variables in scope, e.g., globals or self?
How do closures affect the potential circular references of the traceback? The general thought being: a closure can reference everything its enclosing function can, so a traceback with a reference to a closure can end up referencing quite a bit. I'm struggling to come up with a more concrete example, but some combination of: an inner function, code that returns sys.exc_info(), with expensive-short-lived-objects in scope somewhere.

Feel free to tell me where my conclusions or assumptions are wrong, as I've reasoned myself into belief and non-belief of my own statements several times as I've written this :).

While I'd like answers to my specific examples, I'm also asking for general advice, knowledge, or war stories on how to safely deal with tracebacks in more esoteric situations (e.g., you have to run a loop and want to accumulate any raised exceptions, you have to spawn a new thread and need to report any of its raised exceptions, you have to create closures and callbacks and have to communicate back raised exceptions, etc.).

Example 1: an inner function that does error handling

import Queue
import sys
import threading

def DoWebRequest():
  thread, error_queue = CreateThread(ErrorRaisingFunc)
  thread.start()
  thread.join()
  if not error_queue.empty():
    # Purposefully not calling error_queue.get() for illustrative purposes
    print 'error!'

def CreateThread(func):
  error_queue = Queue.Queue()
  def Handled():
    try:
      func()
    except Exception:
      error_queue.put(sys.exc_info())
  thread = threading.Thread(target=Handled)
  return thread, error_queue

Does the Handled() closure cause any raised exception to reference error_queue and result in a circular reference because error_queue also contains the traceback? Is removing the traceback from error_queue (i.e., calling .get()) enough to eliminate the circular reference?

Example 2: a long lived object in scope of exc_info, or returning exc_info

import sys

long_lived_cache = {}

def Alpha(key):
  expensive_object = long_lived_cache.get(key)
  if not expensive_object:
    expensive_object = ComputeExpensiveObject()
    long_lived_cache[key] = expensive_object

  exc_info = AlphaSub(expensive_object)
  if exc_info:
    print 'error!', exc_info

def AlphaSub(expensive_object):
  try:
    ErrorRaisingFunc(expensive_object)
    return None
  except Exception:
    return sys.exc_info()

Does the raised exception of AlphaSub() have a reference to expensive_object, and, because expensive_object is cached, the traceback never goes away? If so, how does one break such a cycle?

Alternatively, does exc_info contain the Alpha stack frame while the Alpha stack frame contains the reference to exc_info, resulting in a circular reference? If so, how does one break such a cycle?

score 4 · Accepted Answer · answered Sep 11 '11 at 07:27

What, exactly, does "local variable" mean in this context? I'm struggling for terminology, but: does this mean only variables created in the function, or also variables created by the function parameters? What about other variables in scope, e.g., globals or self?

"Local variables" are all name bindings created inside a function. This includes any function parameters, and any variables assigned. E.g. in the following:

def func(fruwappah, qitzy=None):
    if fruwappah:
        fruit_cake = 'plain'
    else:
        fruit_cake = qitzy
    frosting = 'orange'

the variables fruwappah, qitzy, fruit_cake, and frosting are all local. Oh, and since self is in the function header (when it is, just not in my example ;), it's also local.
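One way to check this is to look at the frame a traceback holds on to. The following is only a sketch: names_seen_by_traceback is a made-up helper, func is a trimmed variant of the function above that raises, and it assumes CPython, where a frame's f_locals exposes its local names. Parameters and assigned variables show up side by side:

```python
import sys


def func(fruwappah, qitzy=None):
    frosting = 'orange'
    raise ValueError('boom')


def names_seen_by_traceback():
    try:
        func('yes')
    except ValueError:
        tb = sys.exc_info()[2]
        try:
            # tb is the entry for this function's frame;
            # tb.tb_next is the entry for func's frame.
            return sorted(tb.tb_next.tb_frame.f_locals)
        finally:
            del tb  # do not leave the traceback in a local


local_names = names_seen_by_traceback()
print(local_names)
```

Both parameters (fruwappah, qitzy) and the assigned name (frosting) are reported, which is why anything passed into a function is kept alive for as long as a traceback through that function is.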

How do closures affect the potential circular references of the traceback? The general thought being: a closure can reference everything its enclosing function can, so a traceback with a reference to a closure can end up referencing quite a bit. I'm struggling to come up with a more concrete example, but some combination of: an inner function, code that returns sys.exc_info(), with expensive-short-lived-objects in scope somewhere.

As the answer you linked to states, a traceback references the stack frame (and therefore the variables) of every function that was active at the time the exception occurred. In other words, whether there's a closure involved or not is irrelevant -- assigning the traceback to a closure (non-local) variable, or for that matter a global one, will also create a circular reference.
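To make the loop concrete, here is a small sketch (not from the linked answer) of the plain local-variable case. The function name traceback_points_back_here is invented for illustration, and sys._getframe() is a CPython-specific call used only to show that the traceback's first frame is the very frame whose local holds the traceback:

```python
import sys


def traceback_points_back_here():
    try:
        raise ValueError("boom")
    except ValueError:
        exc_info = sys.exc_info()  # a local now holds the traceback
        try:
            # The first traceback entry is the frame that caught the
            # exception, i.e. this function's own frame. So we have:
            # frame -> exc_info -> traceback -> frame.
            return exc_info[2].tb_frame is sys._getframe()
        finally:
            del exc_info  # drop the local so the loop does not persist


points_back = traceback_points_back_here()
print(points_back)
```

A closure cell or a global simply replaces the "local" link in that chain; the frame can still reach it, so the loop closes the same way.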

There are two basic ways to deal with this:

1. Define a function that will be called after the exception is raised -- it won't have a stack frame in the traceback, so when it ends all its variables, including the traceback, will go away; or
2. Make sure to del traceback_object when you are done with it.
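Both options can be sketched roughly as follows. The function names are hypothetical, and the code only formats the traceback as text; treat it as an illustration of the two patterns rather than a complete recipe:

```python
import sys
import traceback


def describe_current_exception():
    """Option 1: a helper called from inside an except block.

    Its frame is created after the exception was raised, so it is not
    part of the traceback; its locals go away when it returns.
    """
    exc_type, exc_value, tb = sys.exc_info()
    return "".join(traceback.format_exception(exc_type, exc_value, tb))


def handle_with_helper():
    try:
        raise ValueError("boom")
    except ValueError:
        return describe_current_exception()


def handle_inline():
    """Option 2: hold the traceback locally, then del it explicitly."""
    try:
        raise ValueError("boom")
    except ValueError:
        tb = sys.exc_info()[2]
        try:
            return traceback.format_tb(tb)  # one string per frame
        finally:
            del tb  # remove the frame -> tb -> frame loop


report = handle_with_helper()
entries = handle_inline()
```

The try/finally in the second option means the del still runs if the code using the traceback itself raises.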

Having said all that, I have yet to need the traceback object in my own code -- the Exception, along with its various attributes, has been sufficient so far.

Ethan, what about the situation where you want to capture exceptions raised in another thread? The paradox I run into is that the thread has to store the exception (and hence the traceback has a reference to wherever it is stored), and it is up to the caller of the thread (or anyone besides ThreadWithException) to break the cycle. — Richard Levasseur, Sep 12 '11 at 20:34
Actually, I think I figured out a way. You should add a 3rd point: `del` any object that contains a transitive reference to the traceback, e.g., `try: func(); except: obj.exc = sys.exc_info(); del obj`. As far as I can tell from inspecting `gc.get_objects()`, the refcount drops to 0 for the objects in question (`obj`, exc_info), and that breaks the cycle — Richard Levasseur, Sep 12 '11 at 20:48
er, I should clarify: from inspecting the traceback frames, there is no longer a reference to `obj`, hence `obj` references the traceback, but the traceback does not reference `obj` — Richard Levasseur, Sep 12 '11 at 20:55
The trick is to extract whatever information you need from the traceback object, then get rid of the traceback objects. BTW, your 'transitive reference' is really just another name for (in this case) the tuple returned by `sys.exc_info()` -- which is why you have to delete it as well. — Ethan Furman, Sep 12 '11 at 20:55
In my specific case, I'm trying to re-raise an exception thrown in another thread. The sub-thread communicates the return values or raised exception via a queue (much like example 1). Also, it's Python 2.6. So I'm not extracting anything, just trying to preserve the stack trace. — Richard Levasseur, Sep 12 '11 at 22:01
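For the threading case discussed in these comments, one possible shape of the "extract what you need, then drop the traceback" advice is sketched below. It is an assumption-laden illustration, not code from the thread: run_in_thread and failing are invented names, it is written for Python 3 (the module is named Queue in Python 2), and it passes only the formatted stack trace text through the queue, so the caller can report the original stack but cannot re-raise with it.

```python
import queue
import threading
import traceback


def run_in_thread(func):
    """Run func in a thread; return (result, error_text), one being None."""
    results = queue.Queue()

    def handled():
        try:
            results.put((func(), None))
        except Exception:
            # Store only the formatted text, never the traceback object,
            # so the queue holds no reference back to any frame.
            results.put((None, traceback.format_exc()))

    thread = threading.Thread(target=handled)
    thread.start()
    thread.join()
    return results.get()


def failing():
    raise RuntimeError("worker failed")


result, error_text = run_in_thread(failing)
print(error_text)
```

Because the queue only ever contains strings and return values, it does not matter whether the caller remembers to call get() or to del anything.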
