
I stumbled upon a question yesterday that involved enumerating an iterable and yielding descending indices alongside the items in their original (ascending) order.

In:

letters = ['a', 'b', 'c']
for i, letter in revenumerate(letters):
    print('{}, {}'.format(i, letter))

Out:

2, a
1, b
0, c 

Instead of writing a quick and reliable answer that applies the built-in reversed twice, or simply computes i = len(letters) - i - 1, I decided to try creating a child class of enumerate that redefines the __iter__ and __next__ methods.
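
For reference, the two simpler approaches mentioned above might look like the following sketch. The variable names are illustrative only, and both versions assume the input is a sequence that supports len() and reversed().

```python
letters = ['a', 'b', 'c']

# Approach 1: reverse the input, enumerate it, then reverse the pairs again.
twice_reversed = list(reversed(list(enumerate(reversed(letters)))))

# Approach 2: enumerate normally and flip each index arithmetically.
flipped = [(len(letters) - i - 1, letter) for i, letter in enumerate(letters)]

print(twice_reversed)  # [(2, 'a'), (1, 'b'), (0, 'c')]
print(flipped)         # [(2, 'a'), (1, 'b'), (0, 'c')]
```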

The code for my original working solution was as follows:

class revenumerate(enumerate):
    def __init__(self, iterable, start=0):
        self._len = len(iterable)
        self._start = start
        if isinstance(iterable, dict):
            self._data = iterable.keys()
        else:
            self._data = iterable

    def __iter__(self):
        _i = self._len
        for _item in self._data:
            _i -= 1
            _ind = _i + self._start
            yield _ind, _item

    def __next__(self):
        _i, _item = super().__next__()
        _ind = self._len + 2 * self._start - _i - 1
        return _ind, _item

However, I now realize this code is redundant: enumerate.__iter__ appears to simply yield the results of __next__, which makes sense. After deleting the redefined __iter__, I realized that self._data was no longer used anywhere, so I removed the last four lines of __init__, leaving the following code, which still provides the desired behavior.

class revenumerate(enumerate):
    def __init__(self, iterable, start=0):
        self._len = len(iterable)
        self._start = start

    def __next__(self):
        _i, _item = super().__next__()
        _ind = self._len + 2 * self._start - _i - 1
        return _ind, _item

Now it appears that the iterable argument passed into revenumerate is not used for anything except determining the integer self._len.

My question is - where is iterable stored and how does super().__next__ access it?

A quick look at builtins.py with the PyCharm debugger does not help much in figuring this out (or so it seems to me at this stage), and I am not well versed in the Python source code repository. My guess is that it has something to do with the __new__ or __init__ method of the parent class enumerate, or of its parent, object.

4 Answers

builtins.py is a lie. PyCharm made it up. If you want to look at the real source code for the builtins module, that's Python/bltinmodule.c in the Python Git repository. enumerate itself is implemented in Objects/enumobject.c.

enumerate iterators store an iterator over their underlying object in a C-level en_sit struct slot:

typedef struct {
    PyObject_HEAD
    Py_ssize_t en_index;           /* current index of enumeration */
    PyObject* en_sit;          /* secondary iterator of enumeration */
    PyObject* en_result;           /* result tuple  */
    PyObject* en_longindex;        /* index for sequences >= PY_SSIZE_T_MAX */
} enumobject;

set in enumerate.__new__:

static PyObject *
enum_new(PyTypeObject *type, PyObject *args, PyObject *kwds)
{
    ...
    en->en_sit = PyObject_GetIter(seq);

The fact that it's set in __new__ is why it still worked even though you forgot to call super().__init__.
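
A minimal sketch of that behaviour, using a hypothetical subclass name (NoSuperInit) that is not part of the question: __init__ never calls the parent's __init__, yet the enumeration still works, because enumerate.__new__ has already received the same arguments and stored the iterator.

```python
class NoSuperInit(enumerate):
    def __init__(self, iterable, start=0):
        # Deliberately skip super().__init__(); enumerate.__new__ has
        # already received (iterable, start) and stored the iterator.
        self.init_ran = True


pairs = list(NoSuperInit('ab', 5))
print(pairs)  # [(5, 'a'), (6, 'b')]
```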


Subclassing enumerate for this doesn't make a lot of sense. enumerate is only documented as a callable; the fact that it's a class and supports subclassing is an implementation detail. Also, you're not getting a lot of use out of enumerate, and the relationship between your iterators and enumerate iterators doesn't really sound like "is-a". Implementing your functionality as a generator, like zvone did, is cleaner and clearer.

user2357112
  • Thank you for the concise explanation and direction. I am not familiar with the Python source code repository on Github, so this is exactly what I was after. – Campbell McDiarmid May 18 '18 at 02:51

What enumerate does is more-or-less* this:

def enumerate(iterable):
    counter = 0
    for item in iterable:
        # yield first, then increment, so counting starts at 0
        yield counter, item
        counter += 1

One thing you can notice is that it does not know how long the iterable is. It can even be infinitely long, but enumerate will still work.
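
As a small illustration of that point, the built-in enumerate can be applied to the endless itertools.count iterator, as long as only a finite slice of the result is consumed (the starting value 10 is arbitrary):

```python
from itertools import count, islice

# count(10) never ends, but enumerate only pulls items on demand,
# so taking the first three pairs terminates immediately.
first_three = list(islice(enumerate(count(10)), 3))
print(first_three)  # [(0, 10), (1, 11), (2, 12)]
```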

The problem with revenumerate is that you first have to count how many items there are before being able to yield the first one, so you actually have to create a list of all enumerated items and then yield them backwards (at least if you want your revenumerate to work with any iterable, like enumerate).

Once you accept that limitation as unavoidable, the rest is simple:

def revenumerate(iterable):
    all_items = list(iterable)
    counter = len(all_items)
    for item in reversed(all_items):
        counter -= 1
        yield counter, item
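
Note that the generator above yields the items themselves in reverse order as well. If the items should stay in their original order while only the indices count down, as in the question's sample output, a variant could look like this sketch (revenumerate_in_order and its start parameter are hypothetical names, not from the question):

```python
def revenumerate_in_order(iterable, start=0):
    # Materialise the iterable so its length is known up front.
    all_items = list(iterable)
    counter = len(all_items) + start
    for item in all_items:
        counter -= 1
        yield counter, item


print(list(revenumerate_in_order(['a', 'b', 'c'])))
# [(2, 'a'), (1, 'b'), (0, 'c')]
```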

(*) enumerate is actually a class, but this is its behaviour. See my other answer about how that works and what __next__ does.

zvone

In my previous answer I wrote how I would do it, but here are some answers to what was actually asked about __iter__ and __next__...

Iterable

For an object to be iterable, it has to implement the __iter__ method, which has to return an iterator.

Here are some simple examples:

class A:
    def __iter__(self):
        return iter([1, 2, 3])

class B:
    def __iter__(self):
        yield 'a'
        yield 'b'

These can be iterated:

>>> A().__iter__()
<list_iterator object at 0x00000000029EFD30>

>>> iter(A())  # calls A().__iter__()
<list_iterator object at 0x00000000029EFF28>

>>> list(A())  # calls iter(A()) and iterates over it
[1, 2, 3]

>>> list(B())  # calls iter(B()) and iterates over it
['a', 'b']

Iterator

The object returned from __iter__ is an iterator. An iterator must implement the __next__ method.

For example:

>>> it = iter(B())  # iterator

>>> it.__next__()
'a'

>>> next(it)  # calls it.__next__()
'b'

>>> next(it)  # raises StopIteration because there is nothing more
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration

Custom iterator

class MyIterator:
    def __init__(self):
        self.value = 5
    def __next__(self):
        if self.value > 0:
            self.value -= 1
            return self.value
        else:
            raise StopIteration()

class MyIterable:
    def __iter__(self):
        return MyIterator()

>>> list(MyIterable())
[4, 3, 2, 1, 0]

EDIT: As others mentioned in the comments, an iterator should always implement an __iter__ method that returns self (as I did in the examples below). This requirement is stated in PEP 234 and in the Python docs:

A class that wants to be an iterator should implement two methods: a next() method that behaves as described above, and an __iter__() method that returns self.
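
A short sketch of what goes wrong without __iter__, using a hypothetical NoIter class: calling next() directly works, but anything that calls iter() on the object, such as list() or a for loop, fails with a TypeError.

```python
class NoIter:
    """Has __next__ but no __iter__, so it breaks the iterator protocol."""

    def __init__(self):
        self.value = 2

    def __next__(self):
        if self.value > 0:
            self.value -= 1
            return self.value
        raise StopIteration


it = NoIter()
first = next(it)  # works, because next() only needs __next__

try:
    list(it)  # list() calls iter(it), which needs __iter__
    failed = False
except TypeError:
    failed = True

print(first, failed)  # 1 True
```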

Iterable iterator

An iterable iterator? Well if a class implements both __iter__ and __next__, then it is both:

class IterableIterator:
    def __init__(self):
        self.value = 11

    def __next__(self):
        if self.value < 17:
            self.value += 1
            return self.value
        else:
            raise StopIteration()

    def __iter__(self):
        return self

>>> list(IterableIterator())
[12, 13, 14, 15, 16, 17]

enumerate

enumerate actually does something like this:

class enumerate:
    def __init__(self, iterable, start=0):
        self.iterator = iter(iterable)
        self.n = start - 1

    def __iter__(self):
        return self

    def __next__(self):
        self.n += 1
        next_item = next(self.iterator)
        return self.n, next_item

So, to answer your question, in your super().__next__(), you are calling this __next__ here, which uses the iterator which it stored in the constructor.

zvone
  • It's worth noting that all iterators are supposed to be iterables too: they should always have an `__iter__` method that returns `self`. Without that behavior, you wouldn't be able to use an iterator you had already created in most places (not in a loop, nor by passing it to a function or class constructor that expects an iterable, like `list`). Those places all call `iter` on the object they're iterating on, so you need an `__iter__` method, even if it's a trivial one. – Blckknght May 17 '18 at 23:50
  • Iterators are [supposed to](https://docs.python.org/3/library/stdtypes.html#iterator.__iter__) have an `__iter__` that returns `self`, and any iterator that doesn't do so is violating the iterator protocol. This usually isn't checked, which results in broken iterators that mostly work until they end up in some code that actually depends on this part of the iterator protocol. – user2357112 May 17 '18 at 23:56
  • @user2357112 Thanks, I added that to the answer. – zvone May 18 '18 at 05:48

Others have answered your specific question about how your code works, so here's another way to implement a reverse enumerator using zip():

def revenumerate(iterable, start=None):
    if start is None:
        start = len(iterable) - 1
    return zip(range(start, -1, -1), iterable)

>>> revenumerate('abcdefg')
<zip object at 0x7f9a5746ec48>
>>> list(revenumerate('abcdefg'))
[(6, 'a'), (5, 'b'), (4, 'c'), (3, 'd'), (2, 'e'), (1, 'f'), (0, 'g')]
>>> list(revenumerate('abcdefg', 100))
[(100, 'a'), (99, 'b'), (98, 'c'), (97, 'd'), (96, 'e'), (95, 'f'), (94, 'g')]

revenumerate() returns a zip object that is very similar to the enumerate object returned by enumerate().

By default the items will be enumerated starting at the length of the iterable less one, which requires that the length be finite. You can supply a start value from which to count down which would be useful if you just wanted to start counting from an arbitrary value, or to sort of handle infinite iterables.

>>> from itertools import count
>>> g = revenumerate(count(), 1000)
>>> next(g)
(1000, 0)
>>> next(g)
(999, 1)
>>> next(g)
(998, 2)
>>> next(g)
(997, 3)
>>> next(g)
(996, 4)

If you tried to work on an infinite iterable without specifying the start value:

>>> revenumerate(count())
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 3, in revenumerate
TypeError: object of type 'itertools.count' has no len()

This prevents the interpreter from entering an infinite loop. You could handle the exception and raise one of your own if that suited your application.
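
One way that could look, as a sketch only: the name revenumerate_checked, the choice of ValueError, and the message text are all assumptions rather than part of the answer above.

```python
from itertools import count


def revenumerate_checked(iterable, start=None):
    if start is None:
        try:
            start = len(iterable) - 1
        except TypeError:
            # Replace the generic "has no len()" error with a clearer one.
            raise ValueError(
                "start is required for iterables without a length"
            ) from None
    return zip(range(start, -1, -1), iterable)


try:
    revenumerate_checked(count())
except ValueError as exc:
    print(exc)  # start is required for iterables without a length
```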

mhawke
  • Thanks for the input! I asked this question more because I didn't understand why my code worked and didn't know my way around the Python source code on Github, but I really appreciate seeing new and unique approaches to the problem! – Campbell McDiarmid May 18 '18 at 02:48