32

As a complete Python newbie, it certainly looks that way. Running the following...

x = enumerate(['fee', 'fie', 'foe'])
x.next()
# Out[1]: (0, 'fee')

list(x)
# Out[2]: [(1, 'fie'), (2, 'foe')]

list(x)
# Out[3]: []

... I notice that: (a) x does have a next method, as seems to be required for generators, and (b) x can only be iterated over once, a characteristic of generators emphasized in this famous python-tag answer.

On the other hand, the two most highly-upvoted answers to this question about how to determine whether an object is a generator would seem to indicate that enumerate() does not return a generator.

import types
import inspect

x = enumerate(['fee', 'fie', 'foe'])

isinstance(x, types.GeneratorType)
# Out[4]: False

inspect.isgenerator(x)
# Out[5]: False

... while a third poorly-upvoted answer to that question would seem to indicate that enumerate() does in fact return a generator:

def isgenerator(iterable):
    return hasattr(iterable,'__iter__') and not hasattr(iterable,'__len__')

isgenerator(x)
# Out[8]: True

So what's going on? Is x a generator or not? Is it in some sense "generator-like", but not an actual generator? Does Python's use of duck-typing mean that the test outlined in the final code block above is actually the best one?

Rather than continue to write down the possibilities running through my head, I'll just throw this out to those of you who will immediately know the answer.

Josh O'Brien
  • 7
    does it quack like a duck? – 6502 May 14 '14 at 19:25
  • 1
    kinda similar to how `xrange()` is not a `GeneratorType` either, but it sure behaves like a generator – acushner May 14 '14 at 19:28
  • 4
    This might be useful: http://stackoverflow.com/questions/2776829/difference-between-python-generators-vs-iterators -- it looks like Python distinguishes between iterators and generators – Michael0x2a May 14 '14 at 19:29
  • 2
    "As a complete Python newbie" it certainly looks like you're overly concerned with exact types. ;) – John Y May 14 '14 at 19:49

5 Answers

35

While the Python documentation says that enumerate is functionally equivalent to:

def enumerate(sequence, start=0):
    n = start
    for elem in sequence:
        yield n, elem
        n += 1

the real enumerate function returns an iterator, not an actual generator. You can see this by calling help(x) after creating an enumerate object:

>>> x = enumerate([1,2])
>>> help(x)
class enumerate(object)
 |  enumerate(iterable[, start]) -> iterator for index, value of iterable
 |  
 |  Return an enumerate object.  iterable must be another object that supports
 |  iteration.  The enumerate object yields pairs containing a count (from
 |  start, which defaults to zero) and a value yielded by the iterable argument.
 |  enumerate is useful for obtaining an indexed list:
 |      (0, seq[0]), (1, seq[1]), (2, seq[2]), ...
 |  
 |  Methods defined here:
 |  
 |  __getattribute__(...)
 |      x.__getattribute__('name') <==> x.name
 |  
 |  __iter__(...)
 |      x.__iter__() <==> iter(x)
 |  
 |  next(...)
 |      x.next() -> the next value, or raise StopIteration
 |  
 |  ----------------------------------------------------------------------
 |  Data and other attributes defined here:
 |  
 |  __new__ = <built-in method __new__ of type object>
 |      T.__new__(S, ...) -> a new object with type S, a subtype of T

In Python, generators are basically a specific type of iterator that's implemented by using a yield to return data from a function. However, enumerate is actually implemented in C, not pure Python, so there's no yield involved. You can find the source here: http://hg.python.org/cpython/file/2.7/Objects/enumobject.c
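To make the distinction concrete, here is a minimal sketch, assuming Python 3 (where the method is spelled `__next__` and is normally called as `next(x)`). The name `gen_enumerate` is hypothetical; it is just the pure-Python equivalent from the documentation, written as a generator function so it can be compared with the built-in.

```python
import types
from collections.abc import Iterator


def gen_enumerate(iterable, start=0):
    """Pure-Python stand-in for enumerate, written as a generator function."""
    n = start
    for elem in iterable:
        yield n, elem
        n += 1


words = ['fee', 'fie', 'foe']
builtin = enumerate(words)       # implemented in C
handmade = gen_enumerate(words)  # implemented with yield

# Both objects are iterators...
both_iterators = isinstance(builtin, Iterator) and isinstance(handmade, Iterator)

# ...but only the object produced by a function containing yield is a generator.
builtin_is_generator = isinstance(builtin, types.GeneratorType)
handmade_is_generator = isinstance(handmade, types.GeneratorType)

# They still produce identical pairs.
same_output = list(builtin) == list(handmade)

print(both_iterators, builtin_is_generator, handmade_is_generator, same_output)
# True False True True
```

The two objects behave the same from the caller's side; the only difference the type check picks up is how each one was implemented.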

dano
  • Very interesting. Just to be sure I've got this straight, iterators (or iterables) are defined by their behavior, whereas there are actual "generator" and "enumerate" classes that define those objects. "enumerate"-class objects share many behaviors with "generator"-class objects, but so do objects of many other classes. Is there a simple way to find out whether any one of these "generalized generator" classes generates its elements on the fly (like a true generator), rather than storing them in memory? – Josh O'Brien May 14 '14 at 19:47
  • 4
    @JoshO'Brien A little nitpick: an "iterable" is any object that can be iterated over, e.g. list, dict, str, file. An "iterator" is the object that's actually created to iterate over an iterable. For most Python containers, you get it by calling `iter(obj)`. This happens implicitly when you do `for x in obj`. Edit: I see John Y. beat me to this point :) – dano May 14 '14 at 20:02
  • 1
    @dano: Your comment is better. I just deleted mine as you were editing yours to mention mine. Here's the [glossary link](https://docs.python.org/2/glossary.html) again though, as that is useful regardless. – John Y May 14 '14 at 20:06
  • 2
    @JoshO'Brien I'm not aware of any way to determine if the data being returned by a given iterator is obtained lazily or if it's loaded completely into memory. An iterator just provides a way to iterate over an object once (and only once) one step at a time, by exposing a `next()` method. Exactly what goes on inside `next()` is unknown to the caller. – dano May 14 '14 at 20:09
  • @JohnY -- Thanks for leaving the glossary link in place. Both of your comments help a lot. Was just off comparing `iter(range(9))`, `iter(xrange(9))`, `iter([2,1])`, `iter((2,1))`, and `iter(enumerate([2,1]))`, which was quite enlightening. Still not totally sure why `enumerate` and `generator` objects get "used up" by being iterated across whereas the other types don't, but I suppose that's just a matter of their implementation and of decisions on the part of their authors that that behavior was desirable. Edit: OK -- scratch that final bit. I just figured it out. Now it all makes good sense – Josh O'Brien May 14 '14 at 20:11
  • For anyone coming here looking for the answer to the same question I had: the __getitem__ method of the object you are iterating over needs to raise IndexError when out of bounds for enumerate to work on it. I had a dict lookup in my __getitem__ and it of course raised KeyError, which made enumerate(my_obj) fail. – ellonde Mar 04 '21 at 10:42
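The point in the last comment can be sketched as follows, assuming Python 3. The class `IndexBacked` is hypothetical. It defines only `__getitem__` (no `__iter__`), so Python falls back to calling `__getitem__` with 0, 1, 2, ... and stops only when it sees IndexError.

```python
class IndexBacked:
    """Hypothetical container that defines __getitem__ but not __iter__."""

    def __init__(self, items):
        # Store the items in a dict keyed by position, as in the comment.
        self._items = dict(enumerate(items))

    def __getitem__(self, index):
        try:
            return self._items[index]
        except KeyError:
            # Fallback iteration ends only on IndexError, so translate it.
            raise IndexError(index) from None


pairs = list(enumerate(IndexBacked(['a', 'b', 'c'])))
print(pairs)
# [(0, 'a'), (1, 'b'), (2, 'c')]
```

Without the translation to IndexError, the KeyError would propagate out of `enumerate` once the lookup ran past the last stored position.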
13

Testing for enumerate types:

I would include this important test in an exploration of the enumerate type and how it fits into the Python language:

>>> import collections
>>> e = enumerate('abc')
>>> isinstance(e, enumerate)
True
>>> isinstance(e, collections.Iterable)
True
>>> isinstance(e, collections.Iterator)
True

But we see that:

>>> import types
>>> isinstance(e, types.GeneratorType)
False

So we know that enumerate objects are not generators.

The Source:

In the source, we can see that the enumerate object (PyEnum_Type) iteratively returns the tuple, and in the ABC module we can see that any object with next and __iter__ methods (actually, attributes) is defined to be an iterator. (The method is named __next__ in Python 3.)

The Standard Library Test

So the Abstract Base Class library uses the following test:

>>> hasattr(e, 'next') and hasattr(e, '__iter__')
True

So we know that enumerate types are iterators. But we see in the documentation that a generator is created either by a function containing yield or by a generator expression. So generators are iterators, because they have the next and __iter__ methods, but not all iterators are generators (an interface that additionally requires send, close, and throw), as we've seen with this enumerate object.
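As a rough illustration of that difference, here is a sketch assuming Python 3 and the ABCs in `collections.abc`. The class `Countdown` is a hypothetical hand-written iterator; it implements only `__iter__` and `__next__`, so it passes the iterator check but not the generator check.

```python
from collections.abc import Generator, Iterator


class Countdown:
    """Hypothetical iterator: defines __iter__ and __next__, nothing else."""

    def __init__(self, start):
        self.current = start

    def __iter__(self):
        return self

    def __next__(self):
        if self.current <= 0:
            raise StopIteration
        value = self.current
        self.current -= 1
        return value


c = Countdown(3)
is_iterator = isinstance(c, Iterator)    # needs __iter__ and __next__
is_generator = isinstance(c, Generator)  # also needs send, throw, and close
values = list(c)

# A generator expression satisfies both interfaces.
gen = (n for n in range(3))
gen_is_both = isinstance(gen, Iterator) and isinstance(gen, Generator)

print(is_iterator, is_generator, values, gen_is_both)
# True False [3, 2, 1] True
```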

So what do we know about enumerate?

From the docs and the source, we know that enumerate returns an enumerate object, and we know by definition that it is an iterator, even if our testing states that it is explicitly not a generator.

We also know from the documentation that generator types simply "provide a convenient way to implement the iterator protocol." Therefore, generators are a subset of iterators. Furthermore, this allows us to derive the following generalization:

All generators are iterators, but not all iterators are generators.

So while we can make our enumerate object into a generator:

>>> g = (i for i in e)
>>> isinstance(g, types.GeneratorType)
True

We can't expect that it is a generator itself, so this would be the wrong test.

So What to Test?

What this means is that you should not be testing for a generator. You should probably use the first of the tests I provided rather than reimplement the Standard Library (which I hope I can be excused for doing today):

If you require an enumerate type, you'll probably want to allow for iterables or iterators of tuples with integer indexes, and the following will return True:

isinstance(g, collections.Iterable)

If you only want specifically an enumerate type:

isinstance(e, enumerate)
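A minimal sketch of the more permissive test, assuming Python 3 (where the ABC lives in `collections.abc`). The function `label_items` is hypothetical; it only illustrates accepting any iterable of (index, value) pairs instead of insisting on an enumerate object or a generator.

```python
from collections.abc import Iterable


def label_items(pairs):
    """Accept any iterable of (index, value) pairs, not just enumerate objects."""
    if not isinstance(pairs, Iterable):
        raise TypeError("expected an iterable of (index, value) pairs")
    return ["%d: %s" % (i, v) for i, v in pairs]


from_enumerate = label_items(enumerate('ab'))
from_generator = label_items((i, c) for i, c in enumerate('ab'))
from_list = label_items([(0, 'a'), (1, 'b')])

print(from_enumerate)
# ['0: a', '1: b']
```

All three calls give the same result, which is the practical argument for testing the interface rather than the concrete type.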

PS In case you're interested, here's the source implementation of generators: https://github.com/python/cpython/blob/master/Objects/genobject.c
And here's the Generator Abstract Base Class (ABC): https://github.com/python/cpython/blob/master/Lib/_collections_abc.py#L309

Russia Must Remove Putin
4

Is it in some sense "generator-like", but not an actual generator?

Yes, it is. You shouldn't really care whether it is a duck, only whether it walks, talks, and smells like one. It might just as well be a generator; it shouldn't make a real difference.

It is typical to have generator-like types instead of actual generators, when you want to extend the functionality. E.g. range is also generator-like, but it also supports things like y in range(x) and len(range(x)) (xrange in python2.x).
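The extra functionality can be seen in a short sketch, assuming Python 3. Note that in Python 3 a range object is a re-iterable sequence rather than an iterator, so it differs from enumerate in more than `len` and `in` support.

```python
r = range(5)
# Membership test, length, and repeated iteration all work on a range.
range_facts = (3 in r, len(r), list(r) == list(r))

e = enumerate('abc')
first_pass = list(e)
second_pass = list(e)  # the enumerate object is already exhausted

try:
    len(e)
    enumerate_has_len = True
except TypeError:
    enumerate_has_len = False

print(range_facts, first_pass, second_pass, enumerate_has_len)
# (True, 5, True) [(0, 'a'), (1, 'b'), (2, 'c')] [] False
```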

shx2
3

You can try a few things out to prove to yourself that it's neither a generator nor a subclass of a generator:

>>> x = enumerate(["a","b","c"])
>>> type(x)
<type 'enumerate'>
>>> import types
>>> issubclass(type(x), types.GeneratorType)
False

As Daniel points out, it is its own type, enumerate. That type happens to be iterable. Generators are also iterable. That second, down-voted answer you reference basically just points that out somewhat indirectly by talking about the __iter__ method.

So they implement some of the same methods by virtue of both being iterable. Just like lists and generators are both iterable, but are not the same thing.

So rather than say that something of type enumerate is "generator-like", it makes more sense to simply say that both the enumerate and GeneratorType classes are iterable (along with lists, etc.). How they iterate over data (and shape of the data they store) might be quite different, but the interface is the same.
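One way to see where the shared interface ends is to compare what `iter()` returns for each kind of object. This is a small sketch assuming Python 3; the variable names are only illustrative.

```python
items = ['a', 'b']
e = enumerate(items)

# An enumerate object is its own iterator, so it can be consumed only once.
enumerate_is_own_iterator = iter(e) is e

# A list is iterable but not an iterator: iter() builds a new iterator
# each time, which is why a list can be looped over repeatedly.
list_is_own_iterator = iter(items) is items
list_iterators_are_fresh = iter(items) is not iter(items)

print(enumerate_is_own_iterator, list_is_own_iterator, list_iterators_are_fresh)
# True False True
```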

Hope that helps!

Nacho
2

enumerate generates an enumerate object. It is an iterator, like a generator.

Daniel