I was reading "What exactly are iterator, iterable, and iteration?" and "Build a basic Python iterator" when I realized I don't understand how an iterable class should be implemented in practice.
Say that I have the following class:
class MyClass():
    def __init__(self, num):
        self.num = num
        self.count = 0

    def __len__(self):
        return self.num

    def __iter__(self):
        return self

    def __next__(self):
        if self.count < self.num:
            v = self.count
            self.count += 1
            return v
        else:
            self.count = 0
            raise StopIteration
That class is iterable because it "has an __iter__ method which returns an iterator"*1. Objects of MyClass are also iterators because "an iterator is an object with a next (Python 2) or __next__ (Python 3) method"*1. So far so good.
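As a quick sanity check of the two definitions quoted above, here is a minimal sketch using the abstract base classes from collections.abc (bringing in that module is my own addition, it is not mentioned in the linked answers). The class is repeated so the snippet runs on its own.

```python
from collections.abc import Iterable, Iterator


class MyClass:
    """Same class as above: it defines both __iter__ and __next__."""

    def __init__(self, num):
        self.num = num
        self.count = 0

    def __len__(self):
        return self.num

    def __iter__(self):
        return self

    def __next__(self):
        if self.count < self.num:
            v = self.count
            self.count += 1
            return v
        else:
            self.count = 0
            raise StopIteration


y = MyClass(5)
is_iterable = isinstance(y, Iterable)  # True when the class defines __iter__
is_iterator = isinstance(y, Iterator)  # True when it defines __iter__ and __next__
returns_itself = iter(y) is y          # __iter__ hands back the same object
print(is_iterable, is_iterator, returns_itself)
```

All three checks should come out True, which matches the reading that an object of MyClass is both an iterable and its own iterator.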
What's confusing me is a comment that stated "iterators are only supposed to be iterated once"*2. I don't understand why the following snippet gets stuck forever:
>>> y = MyClass(5)
>>> print([[i for i in y] for i in y])
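To look at what happens without hanging the interpreter, here is a bounded sketch of the same nested loop. It assumes a lazy generator expression capped with itertools.islice in place of the outer list comprehension (both are my substitutions, not part of the original snippet). The class is repeated so the snippet runs on its own.

```python
from itertools import islice


class MyClass:
    """Same class as above, including the counter reset in __next__."""

    def __init__(self, num):
        self.num = num
        self.count = 0

    def __len__(self):
        return self.num

    def __iter__(self):
        return self

    def __next__(self):
        if self.count < self.num:
            v = self.count
            self.count += 1
            return v
        else:
            self.count = 0
            raise StopIteration


y = MyClass(5)
outer = iter(y)
inner = iter(y)
same_object = outer is inner  # both loops would share a single counter

# Lazy version of the nested comprehension, cut off after 3 outer steps.
y = MyClass(5)
first_rows = list(islice(([i for i in y] for _ in y), 3))
print(same_object, first_rows)
```

Because the inner and outer loops advance the same counter, the inner loop consumes the remaining values and then resets the count to 0, so the outer loop always finds a fresh value to return and never reaches StopIteration.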
The fix, of course, is to not reset the count member:
def __next__(self):
    if self.count < self.num:
        v = self.count
        self.count += 1
        return v
    else:
        raise StopIteration
But now the list comprehension has to create new objects in the inner loop:
>>> y = MyClass(5)
>>> print([[i for i in MyClass(5)] for i in y])
[[0, 1, 2, 3, 4], [0, 1, 2, 3, 4], [0, 1, 2, 3, 4], [0, 1, 2, 3, 4], [0, 1, 2, 3, 4]]
Now, let's say that I want to be able to iterate over my object many times. I tried to implement a non-iterator iterable class with:
class MyIterator():
    def __init__(self, num):
        self.num = num
        self.count = 0

    def __len__(self):
        return self.num

    def __iter__(self):
        return self.my_iterator()

    def my_iterator(self):
        while self.count < self.num:
            yield self.count
            self.count += 1
        self.count = 0
This works perfectly:
>>> x = MyIterator(5)
>>> print(list(x))
[0, 1, 2, 3, 4]
>>> print(list(x))
[0, 1, 2, 3, 4]
But the nested comprehension gets stuck:
>>> x = MyIterator(5)
>>> print([[i for i in x] for i in x])
And again the fix is to remove the line that resets the internal counter:
def my_iterator(self):
    while self.count < self.num:
        yield self.count
        self.count += 1
And change the comprehension to create new objects in the inner loop:
>>> print([[i for i in MyIterator(5)] for i in x])
[[0, 1, 2, 3, 4], [0, 1, 2, 3, 4], [0, 1, 2, 3, 4], [0, 1, 2, 3, 4], [0, 1, 2, 3, 4]]
But the "fixed" class can't be iterated over more than once:
>>> x = MyIterator(5)
>>> print(list(x))
[0, 1, 2, 3, 4]
>>> print(list(x))
[]
What's the correct way to implement a non-iterator iterable (note that I think I followed the last comment in this answer to the letter)? Or is this use case explicitly not supported by Python?
Edit:
Classic case of rubber duck debugging: I changed the last class to:
class MyIteratorFixed():
    def __init__(self, num):
        self.num = num

    def __len__(self):
        return self.num

    def __iter__(self):
        return self.my_iterator_fixed()

    def my_iterator_fixed(self):
        count = 0
        while count < self.num:
            yield count
            count += 1
What I had wrong is that I didn't need a count member, because Python already holds the state of the generator method (in this particular case, the value of the local variable count).
>>> x = MyIteratorFixed(5)
>>> print(list(x))
[0, 1, 2, 3, 4]
>>> print(list(x))
[0, 1, 2, 3, 4]
>>> print([[i for i in x] for i in x])
[[0, 1, 2, 3, 4], [0, 1, 2, 3, 4], [0, 1, 2, 3, 4], [0, 1, 2, 3, 4], [0, 1, 2, 3, 4]]
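As far as I can tell, the reason this works is that every call to iter(x) creates a new generator with its own local count. Here is a minimal sketch of that, stepping two iterators by hand (the variable names a and b are only for illustration). The class is repeated so the snippet runs on its own.

```python
class MyIteratorFixed:
    """Same class as above: the counter is a local variable of the generator."""

    def __init__(self, num):
        self.num = num

    def __len__(self):
        return self.num

    def __iter__(self):
        return self.my_iterator_fixed()

    def my_iterator_fixed(self):
        count = 0
        while count < self.num:
            yield count
            count += 1


x = MyIteratorFixed(5)
a = iter(x)
b = iter(x)
independent = a is not b             # two separate generator objects
first = [next(a), next(a), next(a)]  # advance a by three steps
second = next(b)                     # b is unaffected and starts at 0
rest_of_a = list(a)                  # a continues where it left off
print(independent, first, second, rest_of_a)
```

Advancing a does not move b, which is what lets the nested comprehension finish.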
My questions are now:
- Is this the correct way to implement a non-iterator iterable?
- When should I use an iterator and when should I use a non-iterator iterable? Is the only distinction that one of them can be iterated just once?
- What are the drawbacks of a non-iterator iterable compared to an iterator?
Thanks!!