
In Python, should with-statements be used inside a generator? To be clear, I am not asking about using a decorator to create a context manager from a generator function. I am asking whether there is an inherent issue with using a with-statement inside a generator, since it will catch StopIteration and GeneratorExit exceptions in at least some cases. Two examples follow.

A good example of the issue is raised by Beazley's example (page 106). I have modified it to use a with statement so that the files are explicitly closed after the yield in the opener method. I have also added two ways that an exception can be thrown while iterating the results.

import os
import fnmatch

def find_files(topdir, pattern):
    for path, dirname, filelist in os.walk(topdir):
        for name in filelist:
            if fnmatch.fnmatch(name, pattern):
                yield os.path.join(path,name)
def opener(filenames):
    f = None
    for name in filenames:
        print "F before open: '%s'" % f
        #f = open(name,'r')
        with open(name,'r') as f:
            print "Fname: %s, F#: %d" % (name, f.fileno())
            yield f
            print "F after yield: '%s'" % f
def cat(filelist):
    for i,f in enumerate(filelist):
        if i == 20:
            # Cause an exception
            f.write('foobar')
        for line in f:
            yield line
def grep(pattern,lines):
    for line in lines:
        if pattern in line:
            yield line

pylogs = find_files("/var/log","*.log*")
files = opener(pylogs)
lines = cat(files)
pylines = grep("python", lines)
i = 0
for line in pylines:
    i +=1
    if i == 10:
        raise RuntimeError("You're hosed!")

print 'Counted %d lines\n' % i

In this example, the context manager successfully closes the files in the opener function. When an exception is raised, I see the traceback from the exception, but the generator stops silently. If the with-statement catches the exception, why doesn't the generator continue?

When I define my own context managers for use inside a generator, I get runtime errors saying that I have ignored a GeneratorExit. For example:

class CManager(object):
    def __enter__(self):
        print "  __enter__"
        return self
    def __exit__(self, exctype, value, tb):
        print "  __exit__; excptype: '%s'; value: '%s'" % (exctype, value)
        return True

def foo(n):
    for i in xrange(n):
        with CManager() as cman:
            cman.val = i
            yield cman
# Case1 
for item in foo(10):
    print 'Pass - val: %d' % item.val
# Case2
for item in foo(10):
    print 'Fail - val: %d' % item.val
    item.not_an_attribute

This little demo works fine in case 1, where no exceptions are raised, but fails in case 2, where an AttributeError is raised. Here I see a RuntimeError raised because the with-statement has caught and ignored a GeneratorExit exception.

Can someone help clarify the rules for this tricky use case? I suspect it is something I am doing, or not doing, in my __exit__ method. I tried adding code to re-raise GeneratorExit, but that did not help.

John La Rooy
David
2 Answers


From the data model entry for object.__exit__:

If an exception is supplied, and the method wishes to suppress the exception (i.e., prevent it from being propagated), it should return a true value. Otherwise, the exception will be processed normally upon exit from this method.

In your __exit__ function, you're returning True, which will suppress all exceptions. If you change it to return False, the exceptions will continue to be raised as normal (the only difference being that your __exit__ function is guaranteed to be called, so you can make sure to clean up after yourself).
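To see concretely what a blanket `return True` does inside a generator, here is a minimal sketch. It is written in Python 3 syntax (the code in this thread is Python 2), and the names `SuppressAll`, `PropagateAll`, `numbers`, and `close_after_first` are made up for illustration. It closes a generator that is suspended inside a with-block and reports what happens.

```python
class SuppressAll:
    """Hypothetical context manager: __exit__ swallows every exception."""
    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        return True  # suppresses everything, including GeneratorExit


class PropagateAll:
    """Hypothetical context manager: __exit__ never suppresses anything."""
    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        return False  # let the exception continue to propagate


def numbers(manager_cls, n):
    for i in range(n):
        with manager_cls():
            yield i


def close_after_first(manager_cls):
    gen = numbers(manager_cls, 2)
    next(gen)        # suspend at the first yield, inside the with-block
    try:
        gen.close()  # raises GeneratorExit at the suspended yield
    except RuntimeError as err:
        return "RuntimeError: %s" % err
    return "closed cleanly"


suppress_result = close_after_first(SuppressAll)
propagate_result = close_after_first(PropagateAll)
print(suppress_result)
print(propagate_result)
```

With `SuppressAll`, the GeneratorExit is swallowed, the loop moves on and yields again, and `close()` complains that the generator ignored GeneratorExit. With `PropagateAll`, the GeneratorExit leaves the generator and `close()` returns normally.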

For example, changing the code to:

def __exit__(self, exctype, value, tb):
    print "  __exit__; excptype: '%s'; value: '%s'" % (exctype, value)
    if exctype is GeneratorExit:
        return False
    return True

allows you to do the right thing and not suppress the GeneratorExit. Now you only see the AttributeError. Maybe the rule of thumb should be the same as with any exception handling: only intercept exceptions if you know how to handle them. Having an __exit__ that always returns True is on par with (maybe slightly worse than!) having a bare except:

try:
   something()
except: #Uh-Oh
   pass
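One way to apply that rule of thumb is to have `__exit__` suppress only an explicit list of exception types. The sketch below uses Python 3 syntax, and `SuppressOnly` and `items` are hypothetical names, not something from the question. Python 3.4+ also ships `contextlib.suppress`, which works along similar lines.

```python
class SuppressOnly:
    """Hypothetical context manager that suppresses only the listed types."""
    def __init__(self, *exc_types):
        self.exc_types = exc_types

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        # True only for exceptions we explicitly agreed to handle.
        return exc_type is not None and issubclass(exc_type, self.exc_types)


def items(n):
    for i in range(n):
        with SuppressOnly(KeyError):
            yield i


gen = items(3)
first = next(gen)                 # 0
second = gen.throw(KeyError("x")) # suppressed, so the loop continues and yields 1
gen.close()                       # GeneratorExit is not suppressed: closes cleanly

gen2 = items(3)
next(gen2)
try:
    gen2.throw(ValueError("boom"))  # not in the list, so it propagates
    value_error_propagated = False
except ValueError:
    value_error_propagated = True

print(first, second, value_error_propagated)
```

Because GeneratorExit derives from BaseException rather than Exception, even `SuppressOnly(Exception)` would leave it alone.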

Note that when the AttributeError is raised (and not caught), I believe that causes the reference count on your generator object to drop to 0, which then triggers a GeneratorExit exception within the generator so that it can clean itself up. Using my __exit__, play around with the following two cases and hopefully you'll see what I mean:

try:
    for item in foo(10):
        print 'Fail - val: %d' % item.val
        item.not_an_attribute
except AttributeError:
    pass

print "Here"  #No reference to the generator left.  
              #Should see __exit__ before "Here"

and

g = foo(10)
try:
    for item in g:
        print 'Fail - val: %d' % item.val
        item.not_an_attribute
except AttributeError:
    pass

print "Here"
b = g  #keep a reference to prevent the reference counter from cleaning this up.
       #Now we see __exit__ *after* "Here"
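If you would rather not depend on when the reference count happens to reach zero, the generator can be closed explicitly. The following is a minimal sketch in Python 3 syntax using `contextlib.closing`. `Recorder`, `numbered`, and the `events` list are illustrative names only.

```python
from contextlib import closing

events = []


class Recorder:
    """Hypothetical context manager that logs what __exit__ receives."""
    def __enter__(self):
        events.append("enter")
        return self

    def __exit__(self, exc_type, exc, tb):
        events.append("exit:%s" % (exc_type.__name__ if exc_type else None))
        return False  # never suppress


def numbered(n):
    for i in range(n):
        with Recorder():
            yield i


try:
    # closing() calls g.close() when the with-block exits, whether or not
    # other references to the generator still exist.
    with closing(numbered(10)) as g:
        for item in g:
            raise AttributeError("simulated failure in the consumer")
except AttributeError:
    events.append("caught")

events.append("here")
print(events)
```

The context manager inside the generator is handed GeneratorExit, not the AttributeError: an exception raised in the consuming loop is never sent into the generator, so a with-statement inside the generator cannot handle it.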
mgilson
  • @mgilson Thanks for your great answer. It seems that the generator function is not what catches the attribute error. That is the behavior that I wanted, but it seems it is not possible. I want to use one with-statement to centralize the exception handling in a sequence of generators. – David Mar 12 '13 at 13:41
  • @David -- context managers are something that I've been working with a decent amount as of late, so they're fresh on my mind :). It was a good question. I wish all new users asked such nice first questions :). – mgilson Mar 12 '13 at 13:45
  • @lukecampbell -- I'm glad you liked it :). It was an interesting question for sure. – mgilson Mar 12 '13 at 13:46
  • @mgilson Between Beazley and Stack Overflow I have spent three years finding existing answers without ever having to ask a question. – David Mar 12 '13 at 13:54
  • With-statements are really great - even if they can't fit this use case. I did some work with @lukecampbell last summer using with-statements and redis to coordinate distributed processes: [gist](https://gist.github.com/dstuebe/5143032) – David Mar 12 '13 at 13:56
  • @David -- That's neat :). I've asked a few. Occasionally I'll have a question, and a Google search doesn't show anything. Even if I have an idea how to go about solving the problem, I sometimes ask in hopes that it will be helpful to someone else. – mgilson Mar 12 '13 at 13:57
1
import sys

class CManager(object):
    def __enter__(self):
        print "  __enter__"
        return self
    def __exit__(self, exctype, value, tb):
        print "  __exit__; excptype: '%s'; value: '%s'" % (exctype, value)
        if exctype is None:
            return

        # only re-raise if it's *not* the exception that was
        # passed to throw(), because __exit__() must not raise
        # an exception unless __exit__() itself failed.  But throw()
        # has to raise the exception to signal propagation, so this
        # fixes the impedance mismatch between the throw() protocol
        # and the __exit__() protocol.
        #
        if sys.exc_info()[1] is not (value or exctype()):
            raise 
John La Rooy