Creating a non-iterator iterable

Question

I was reading What exactly are iterator, iterable, and iteration? and Build a basic Python iterator when I realized I don't understand in practice how an iterable class must be implemented.

Say that I have the following class:

class MyClass():
    def __init__(self, num):
        self.num = num
        self.count = 0

    def __len__(self):
        return self.num

    def __iter__(self):
        return self

    def __next__(self):
        if self.count < self.num:
            v = self.count
            self.count += 1
            return v
        else:
            self.count = 0
            raise StopIteration

That class is iterable because it "has an __iter__ method which returns an iterator"*1. Objects of MyClass are also iterators because "an iterator is an object with a next (Python 2) or __next__ (Python 3) method."*1 So far so good.

What's confusing me is a comment that stated "iterators are only supposed to be iterated once"*2. I don't understand why the following snippet gets stuck forever:

>>> y = MyClass(5)
>>> print([[i for i in y] for i in y])

The fix, of course, is to not reset the count member:

    def __next__(self):
        if self.count < self.num:
            v = self.count
            self.count += 1
            return v
        else:
            raise StopIteration

But now the list comprehension has to create new objects in the inner loop:

>>> y = MyClass(5)
>>> print([[i for i in MyClass(5)] for i in y])
[[0, 1, 2, 3, 4], [0, 1, 2, 3, 4], [0, 1, 2, 3, 4], [0, 1, 2, 3, 4], [0, 1, 2, 3, 4]]

Now, let's say that I want to be able to iterate over my object many times. I tried to implement a non-iterator iterable class with:

class MyIterator():
    def __init__(self, num):
        self.num = num
        self.count = 0

    def __len__(self):
        return self.num

    def __iter__(self):
        return self.my_iterator()

    def my_iterator(self):
        while self.count < self.num:
            yield self.count
            self.count += 1
        self.count = 0

This works perfectly:

>>> x = MyIterator(5)
>>> print(list(x))
[0, 1, 2, 3, 4]
>>> print(list(x))
[0, 1, 2, 3, 4]

But the nested comprehension gets stuck:

>>> x = MyIterator(5)
>>> print([[i for i in x] for i in x])

And again the fix is to remove the line that resets the internal counter:

    def my_iterator(self):
        while self.count < self.num:
            yield self.count
            self.count += 1

And change the comprehension to create new objects in the inner loop:

>>> print([[i for i in MyIterator(5)] for i in x])
[[0, 1, 2, 3, 4], [0, 1, 2, 3, 4], [0, 1, 2, 3, 4], [0, 1, 2, 3, 4], [0, 1, 2, 3, 4]]

But the "fixed" class can't be iterated over more than once:

>>> x = MyIterator(5)
>>> print(list(x))
[0, 1, 2, 3, 4]
>>> print(list(x))
[]

What's the correct way to implement a non-iterator iterable (note that I think I followed the last comment in this answer to the letter)? Or is this use case explicitly not supported by Python?

Edit:

Classic case of rubber duck debugging. I changed the last class to:

class MyIteratorFixed():
    def __init__(self, num):
        self.num = num

    def __len__(self):
        return self.num

    def __iter__(self):
        return self.my_iterator_fixed()

    def my_iterator_fixed(self):
        count = 0
        while count < self.num:
            yield count
            count += 1

What I had wrong is that I didn't need a count member: each call to the generator method creates a generator object that already holds its own state (in this particular case, the value of the local variable count).

>>> x = MyIteratorFixed(5)
>>> print(list(x))
[0, 1, 2, 3, 4]
>>> print(list(x))
[0, 1, 2, 3, 4]
>>> print([[i for i in x] for i in x])
[[0, 1, 2, 3, 4], [0, 1, 2, 3, 4], [0, 1, 2, 3, 4], [0, 1, 2, 3, 4], [0, 1, 2, 3, 4]]

My questions are now:

Is this the correct way to implement a non-iterator iterable?
When should I use an iterator and when should I use a non-iterator iterable? Is the only distinction that one of them can be iterated just once?
What are the drawbacks of a non-iterator iterable compared to an iterator?

Thanks!!

The problem is that `next` is not re-entrant: you are trying to use a single attribute `self.count` to track the state of multiple, independent iterators. Your final attempt is correct because the `generator` object returned by `my_iterator_fixed` correctly implements `__iter__` by returning itself. — chepner, Aug 11 '21 at 21:46
"What are the drawbacks of a non-iterator iterable compared to an iterator?" The problem is that you are thinking of these as separate things altogether, but in reality the whole point is for *non-iterator iterables* to *return an iterator which maintains its own state*. This is exactly the problem you are running into. An iterator *encapsulates the state necessary to implement the logic of iterating*. Your iterable is using *internal state which ends up shared by all the iterators*. — juanpa.arrivillaga, Aug 11 '21 at 22:18

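To make the shared-state point in the comments concrete, here is a small sketch that reuses the question's first class (the version that resets `count`). Bounding the outer loop with `itertools.islice` is an addition for illustration only, so that the otherwise endless nested comprehension terminates:

```python
from itertools import islice


class MyClass:
    """The question's first class: it is its own iterator and resets on exhaustion."""

    def __init__(self, num):
        self.num = num
        self.count = 0

    def __iter__(self):
        # Every call to iter() hands back the very same object.
        return self

    def __next__(self):
        if self.count < self.num:
            v = self.count
            self.count += 1
            return v
        # The reset that lets the outer loop start over again.
        self.count = 0
        raise StopIteration


y = MyClass(5)

# Inner and outer loops receive the same iterator, so they share self.count.
same_object = iter(y) is iter(y)

# Outer loop is capped at 3 steps (illustration only); without the cap it never ends.
rows = [[i for i in y] for _ in islice(y, 3)]

print(same_object)  # True
print(rows)         # [[1, 2, 3, 4], [1, 2, 3, 4], [1, 2, 3, 4]]
```

Each outer step consumes 0, the inner loop then drains 1 to 4 and triggers the reset, and the outer loop sees 0 again. That cycle is why the unbounded version appears stuck.
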
score 4 · Answer 1 · answered Aug 11 '21 at 21:57

Yes, this is correct.
Usually, you want your iterator to be separate from the thing being iterated: it makes for a nice separation of concerns.
There are few, if any, drawbacks. Most iterable classes in Python do not act as their own iterators. File-like objects (which wrap file descriptors that already maintain their own file pointer) are the only exceptions that come to mind. For example,
```
>>> type(iter([]))
<class 'list_iterator'>
>>> type(iter(()))
<class 'tuple_iterator'>
>>> type(iter({}))
<class 'dict_keyiterator'>
>>> type(iter(set()))
<class 'set_iterator'>
```
None of the four types considered implement __iter__ by returning the object itself; they all return instances of a separate class.
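
The same distinction can be checked programmatically. The following is a minimal sketch using the abstract base classes in `collections.abc`; `io.StringIO` stands in for a file-like object here, which is my choice of example rather than something taken from the answer:

```python
import io
from collections.abc import Iterable, Iterator

numbers = [1, 2, 3]

# A list is iterable but is not its own iterator.
list_is_iterable = isinstance(numbers, Iterable)
list_is_iterator = isinstance(numbers, Iterator)
list_iter_is_self = iter(numbers) is numbers

# The iterator produced by a list returns itself from __iter__.
it = iter(numbers)
iterator_iter_is_self = iter(it) is it

# A file-like object acts as its own iterator.
stream = io.StringIO("a\nb\n")
stream_is_iterator = isinstance(stream, Iterator)
stream_iter_is_self = iter(stream) is stream

print(list_is_iterable, list_is_iterator, list_iter_is_self)  # True False False
print(iterator_iter_is_self)                                  # True
print(stream_is_iterator, stream_iter_is_self)                # True True
```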

score 0 · Answer 2 · answered Aug 11 '21 at 22:02

I figured a real life example of a non-iterator iterable might be helpful: I usually work with language data and often implement some kind of container class for documents that holds the words, sentences, parts-of-speech tags, syntactic information etc., but the central structure is usually some list of tokens:

class Document:
    def __init__(self, wordlist):
        self.tokens = wordlist

doc = Document(['Hello', 'World', '!'])

Whenever I need to iterate over the tokens, I could do for w in doc.tokens, but that's too cumbersome. So I would normally add an __iter__ method that returns an iterator over the stored tokens:

class Document:
    def __init__(self):
        self.tokens = ['Hello', 'World', '!']

    def __iter__(self):
        # Return a fresh iterator over the token list on every call.
        return iter(self.tokens)

Now I can do for w in doc: any number of times, and if a loop is broken off partway, the next loop starts from the first word again, a behavior that seems quite natural to work with. But the object itself is not an iterator (because __next__ isn't implemented).
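
A short sketch of that restart behavior, assuming the constructor variant that takes a word list (as in the first `Document` definition above):

```python
class Document:
    def __init__(self, wordlist):
        self.tokens = wordlist

    def __iter__(self):
        # Delegate to the list: each call yields a new, independent iterator.
        return iter(self.tokens)


doc = Document(['Hello', 'World', '!'])

# Abandon the first loop partway through.
first_pass = []
for w in doc:
    first_pass.append(w)
    if w == 'World':
        break

# A later loop starts from the beginning again.
second_pass = list(doc)

# The document itself is only iterable, not an iterator.
has_next = hasattr(doc, '__next__')

print(first_pass)   # ['Hello', 'World']
print(second_pass)  # ['Hello', 'World', '!']
print(has_next)     # False
```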

score 0 · Answer 3 · answered Aug 12 '21 at 03:01

My last iteration takes the hint from this answer:

class MyIterator():
    def __init__(self, num):
        self.num = num

    def __iter__(self):
        count = 0
        while count < self.num:
            yield count
            count += 1
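
For comparison, roughly the same behavior can be written without a generator by returning an explicit iterator object. This is only a sketch: the names `CountIterator` and `MyIterable` are made up for illustration and do not appear in the question.

```python
class CountIterator:
    """Hypothetical helper: holds the state of one pass over the range."""

    def __init__(self, num):
        self._num = num
        self._count = 0

    def __iter__(self):
        # Iterators return themselves from __iter__.
        return self

    def __next__(self):
        if self._count >= self._num:
            raise StopIteration
        value = self._count
        self._count += 1
        return value


class MyIterable:
    """Hypothetical iterable: keeps no iteration state of its own."""

    def __init__(self, num):
        self.num = num

    def __len__(self):
        return self.num

    def __iter__(self):
        # A new iterator per call, so nested loops do not interfere.
        return CountIterator(self.num)


x = MyIterable(3)
once = list(x)
twice = list(x)
nested = [[i for i in x] for _ in x]

print(once, twice)  # [0, 1, 2] [0, 1, 2]
print(nested)       # [[0, 1, 2], [0, 1, 2], [0, 1, 2]]
```

The generator version is shorter, but both follow the same design: per-iteration state lives in the object returned by `__iter__`, not on the iterable.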

Answered by Leonardo.
