In Python, iterators are intended for one-time use. Once an iterator has raised StopIteration, it shouldn't return any more values. Yet if I define a custom iterator, it seems that I can still sum the values after they're exhausted!

Example code (Python 3.6.5, or replace __next__(self) with next(self) to see the same behaviour in Python 2.7.15):

class CustomIterator:
  def __iter__(self):
    self.n=0
    return self

  def __next__(self):
    self.n += 1
    if self.n > 3:
      raise StopIteration
    return self.n

i1 = iter([1,2,3])
i2 = iter(CustomIterator())

print('Sum of i1 is {}'.format(sum(i1))) # returns 6 as expected
print('Sum of i1 is {}'.format(sum(i1))) # returns 0 because i1 is now exhausted
try:
  print(next(i1))
except StopIteration:
  print("i1 has raised StopIteration") # this exception happens
print('Sum of i1 is {}'.format(sum(i1))) # 0 again

print('Sum of i2 is {}'.format(sum(i2))) # returns 6 as expected
print('Sum of i2 is {}'.format(sum(i2))) # returns 6 again!
try:
  print(next(i2))
except StopIteration:
  print("i2 has raised StopIteration") # still get an exception
print('Sum of i2 is {}'.format(sum(i2))) # and yet we get 6 again

Why do i1 and i2 behave differently? Is it some trick in how sum is implemented? I've checked https://docs.python.org/3/library/functions.html#sum and it doesn't give me a lot to go on.

Related questions:

These describe the expected behaviour for built-in iterators, but don't explain why my custom iterator behaves differently.

Alexander Hanysz
  • Figured out the answer half way through typing the question! But I'm happy for other people to add their own, more informative, answers :-) – Alexander Hanysz Sep 04 '21 at 02:07
  • Because you are *manually resetting*: `self.n=0` in `__iter__`. – juanpa.arrivillaga Sep 04 '21 at 02:13
  • Very well asked question, and good job sharing the results of your research and analysis. I wouldn't be surprised to dupe-hammer other questions with this in the future. That said, nowadays I would rather take the approach of implementing `__iter__` with a generator, rather than writing a separate `__next__`. – Karl Knechtel Sep 04 '21 at 02:22
  • @KarlKnechtel, thanks for teaching me the word "dupe-hammer"! – Alexander Hanysz Sep 04 '21 at 02:25
  • See also e.g. https://treyhunner.com/2018/06/how-to-make-an-iterator-in-python/ for several alternative approaches. – Karl Knechtel Sep 04 '21 at 02:26

1 Answer

The problem is that the custom iterator initialises its state inside the __iter__ method. Even though i2 = iter(CustomIterator()) includes an explicit call to iter, the sum function (and also min, max, for loops, etc.) will still call i2.__iter__() again, and that call resets i2.
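
To make the repeated calls visible, here is a minimal sketch (Python 3 only) that adds a call counter to the iterator from the question. The class name TracingIterator and the iter_calls attribute are made up for this illustration; everything else mirrors the original CustomIterator.

```python
class TracingIterator:
    """Same logic as the question's CustomIterator, plus a call counter."""

    def __init__(self):
        self.iter_calls = 0  # hypothetical attribute: counts __iter__ calls

    def __iter__(self):
        self.iter_calls += 1
        self.n = 0  # the reset that makes the object look reusable
        return self

    def __next__(self):
        self.n += 1
        if self.n > 3:
            raise StopIteration
        return self.n


it = iter(TracingIterator())  # explicit iter(): first __iter__ call
first = sum(it)               # sum() calls iter(it) itself: second call
second = sum(it)              # and once more: third call
print(first, second, it.iter_calls)  # prints: 6 6 3
```

The counter ends at 3 rather than 1, so every sum starts again from n = 0.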

There are plenty of tutorials out there on "how to make Python iterators", and about half of them say something like "to make an iterator, you just have to define __iter__ and __next__ methods". While this is technically correct as per the documentation, it will get you into trouble sometimes. In practice you'll usually also want a separate __init__ method (or some other way) to initialise the iterator.

So to fix this problem, redefine CustomIterator as:

class CustomIterator:
  def __init__(self):
    self.n=0

  def __iter__(self):
    return self

  def __next__(self):
    self.n += 1
    if self.n > 3:
      raise StopIteration
    return self.n

i1 = iter([1,2,3])
i2 = CustomIterator() ### iter(...) is not needed here (but won't do any harm either)

Then __init__ is called exactly once, when a new iterator is created, and repeated calls to __iter__ won't reset the iterator.
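
As a quick check, the sketch below (Python 3 only) repeats the class with state set up in __init__ and sums it twice. The variable names first, second and fresh are only for this illustration.

```python
class CustomIterator:
    def __init__(self):
        self.n = 0  # state is set up once, when the object is created

    def __iter__(self):
        return self  # no reset here

    def __next__(self):
        self.n += 1
        if self.n > 3:
            raise StopIteration
        return self.n


i2 = CustomIterator()
first = sum(i2)   # consumes 1, 2, 3
second = sum(i2)  # i2 is already exhausted, so nothing is left to add
fresh = sum(CustomIterator())  # a new object starts from scratch
print(first, second, fresh)  # prints: 6 0 6
```

So i2 now behaves like the built-in list iterator i1: one pass only, and a second pass needs a new object.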

Alexander Hanysz
  • *"This is not strictly correct"* - Actually it really is. – no comment Sep 04 '21 at 15:35
  • @don't talk just code, can you explain a little more? Can you suggest a better way to phrase that sentence? Thanks. – Alexander Hanysz Sep 05 '21 at 00:35
  • Well, those two methods are exactly what's required for iterators, see the [documentation](https://docs.python.org/3/library/stdtypes.html#iterator-types). Maybe something like "Practically, you'll also want an `__init__` method or other way to initialise the iterator". – no comment Sep 05 '21 at 01:09