12

Python generators are good replacements for lists in most cases, except where I need to check for emptiness, which is not possible with plain generators. I am trying to write a wrapper that allows checking for emptiness but is still lazy and keeps the benefits of generators.

class mygen:
  def __init__(self,iterable):
    self.iterable = (x for x in iterable)
    self.peeked = False
    self.peek = None
  def __iter__(self):
    if self.peeked:
      yield self.peek
      self.peeked = False
    for val in self.iterable:
      if self.peeked:
        yield self.peek
        self.peeked = False
      yield val
    if self.peeked:
      yield self.peek
      self.peeked = False
  def __nonzero__(self):
    if self.peeked:
      return True
    try:
      self.peek = next(self.iterable)
      self.peeked = True
      return True
    except StopIteration:
      return False
  1. I think it behaves correctly like a plain generator. Is there any corner case I'm missing?
  2. This doesn't look elegant. Is there a better more pythonic way of doing the same?

Sample usage:

def get_odd(l):
    return mygen(x for x in l if x%2)

def print_odd(odd_nums):
  if odd_nums:
      print "odd numbers found",list(odd_nums)
  else:
      print "No odd numbers found"

print_odd(get_odd([2,4,6,8]))
print_odd(get_odd([2,4,6,8,7]))
balki

2 Answers

12

I would not usually implement this kind of generator. There is an idiomatic way to test whether an iterator is exhausted:

try:
    next_item = next(it)
except StopIteration:
    # exhausted, handle this case
    pass

Replacing this EAFP ("easier to ask forgiveness than permission") idiom with some project-specific LBYL ("look before you leap") idiom seems confusing and not beneficial at all.
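Applied to the question's use case, the idiom above might look roughly like the following. This is only a sketch in Python 3 syntax: the function name `describe_odds` is hypothetical, and using `itertools.chain` to put the consumed item back in front of the remaining ones is an assumption of this example, not something the idiom itself requires.

```python
from itertools import chain

def describe_odds(numbers):
    """Return a message about the odd numbers in `numbers` (hypothetical helper)."""
    odds = (x for x in numbers if x % 2)   # still lazy at this point
    try:
        first = next(odds)                 # EAFP: just try to take one item
    except StopIteration:
        return "No odd numbers found"      # exhausted before yielding anything
    # Re-attach the consumed item so the full sequence is reported.
    return "odd numbers found %s" % list(chain([first], odds))

print(describe_odds([2, 4, 6, 8]))
print(describe_odds([2, 4, 6, 8, 7]))
```

No wrapper class is needed here; the emptiness check and the iteration share the same plain generator.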

That said, here is how I would implement this if I really wanted to:

class MyIterator(object):
    def __init__(self, iterable):
        self._iterable = iter(iterable)
        self._exhausted = False
        self._cache_next_item()
    def _cache_next_item(self):
        try:
            self._next_item = next(self._iterable)
        except StopIteration:
            self._exhausted = True
    def __iter__(self):
        return self
    def next(self):
        if self._exhausted:
            raise StopIteration
        next_item = self._next_item
        self._cache_next_item()
        return next_item
    def __nonzero__(self):
        return not self._exhausted
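The class above uses the Python 2 method names. A Python 3 port could look something like this sketch, where `next()` becomes `__next__()` and `__nonzero__()` becomes `__bool__()`; the class name `LookAheadIterator` is made up for this example.

```python
class LookAheadIterator:
    """Python 3 sketch of the look-ahead iterator (class name is hypothetical)."""

    def __init__(self, iterable):
        self._iterator = iter(iterable)
        self._exhausted = False
        self._cache_next_item()

    def _cache_next_item(self):
        # Fetch one item ahead so truthiness can be answered without consuming.
        try:
            self._next_item = next(self._iterator)
        except StopIteration:
            self._exhausted = True

    def __iter__(self):
        return self

    def __next__(self):            # Python 3 name for next()
        if self._exhausted:
            raise StopIteration
        item = self._next_item
        self._cache_next_item()
        return item

    def __bool__(self):            # Python 3 name for __nonzero__()
        return not self._exhausted


odds = LookAheadIterator(x for x in [2, 4, 6, 8, 7] if x % 2)
if odds:
    print("odd numbers found", list(odds))
else:
    print("No odd numbers found")
```

Note that the constructor already pulls the first item from the underlying iterator, so the wrapper always runs one item ahead of its consumer.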
Sven Marnach
  • 1
    I do see the point in checking for emptiness; this can be very convenient if you want to *either* loop over the elements of an iterator, *or* do something special when there are none. Still, +1 for the simple look-ahead iterator. – Fred Foo Jul 13 '12 at 09:46
  • 2
    @larsmans: I do see the point of checking for an empty iterator too, so I would use the idiomatic way of doing this. :) – Sven Marnach Jul 13 '12 at 09:48
  • 1
    I just wanted to mention that __nonzero__() becomes __bool__() in Python 3, in case anyone reads this – madtyn Sep 14 '17 at 10:11
4

Use itertools.tee to implement the nonzero test, and simply cache it on creation:

from itertools import tee

class NonZeroIterable(object):
    def __init__(self, iterable):
        self.__iterable, test = tee(iter(iterable))
        try:
            next(test)
            self.__nonzero = True
        except StopIteration:
            self.__nonzero = False

    def __nonzero__(self):
        return self.__nonzero

    def __iter__(self):
        return self.__iterable

Little demo:

>>> nz = NonZeroIterable('foobar')
>>> if nz: print list(nz)
... 
['f', 'o', 'o', 'b', 'a', 'r']
>>> nz2 = NonZeroIterable([])
>>> if not nz2: print 'empty'
... 
empty

This version of the NonZeroIterable caches the flag; it thus only tells you whether the iterator was non-empty at the start. If you need to be able to test the iterable at other points in its lifecycle, use Sven's version instead; there the __nonzero__ flag will tell you after every iteration whether more items are still to come.
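The effect of caching can be seen in a small sketch. This is written in Python 3 syntax, so `__bool__` stands in for `__nonzero__`; the class name `CachedFlagIterable` and the use of `next()` with a sentinel default instead of try/except are choices made for this example only.

```python
from itertools import tee

class CachedFlagIterable:
    """Python 3 sketch of the tee-based wrapper (class name is hypothetical)."""

    def __init__(self, iterable):
        self._iterable, probe = tee(iter(iterable))
        # next() with a default avoids the try/except; a unique sentinel
        # distinguishes "no items" from falsy items such as 0 or None.
        sentinel = object()
        self._nonempty = next(probe, sentinel) is not sentinel

    def __bool__(self):
        return self._nonempty

    def __iter__(self):
        return self._iterable


nz = CachedFlagIterable("ab")
print(bool(nz))   # flag computed once, at construction
print(list(nz))   # consumes every item
print(bool(nz))   # flag is unchanged even though nothing is left
```

The last call still reports a truthy value after the items have been consumed, which is the "non-empty at the start" behaviour described above.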

Side note on your example

Your sample code is way too simple and not a good argument for your use case; you first test for non-emptiness (which potentially iterates over the input list to search for an odd number) but then exhaust the whole iterator anyway. The following code would be just as efficient and wouldn't require you to invent ways to break Python idioms:

def print_odd(odd_nums):
    odd_nums = list(odd_nums)
    if odd_nums:
        print "odd numbers found", odd_nums
    else:
        print "No odd numbers found"
Martijn Pieters
  • This does something different from the original code. In the original code, `__nonzero__()` returns whether the iterator is exhausted or not. – Sven Marnach Jul 13 '12 at 09:29
  • @SvenMarnach: but he never uses it in that fashion. Updated with a variation that does toggle it on exhaustion. – Martijn Pieters Jul 13 '12 at 09:30
  • From my understanding, the OP's idea is that `__nonzero__` should test if the underlying generator yielded at least once, no matter if it's currently exhausted or not. So, the first snippet is correct. – georg Jul 13 '12 at 09:36
  • @SvenMarnach: See, this is why this whole thing is a bad idea in the first place.. Also, we are stretching the concept of `__nonzero__` to breaking point in any case. – Martijn Pieters Jul 13 '12 at 09:37
  • @thg435: Well, this is not what the original code does, and the post isn't very clear. Let's wait for what the OP says. – Sven Marnach Jul 13 '12 at 09:38
  • My last comment (now deleted) about the second version was hard to understand, so I'll try again. I think the second version is nonsensical, because the `__nonzero__()` method does not have well-defined semantics. It returns *neither* whether the iterable was empty right from the beginning, *nor* whether there are still items left, but does something in between. – Sven Marnach Jul 13 '12 at 10:00
  • @SvenMarnach: unfortunately, so is the OP's intent. I'll remove the example anyway, your solution in that case is better. – Martijn Pieters Jul 13 '12 at 10:01