
When wrapping an (internal) iterator one often has to reroute the __iter__ method to the underlying iterable. Consider the following example:

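import collections.abc

class FancyNewClass(collections.abc.Iterable):
    def __init__(self):
        self._internal_iterable = [1, 2, 3, 4, 5]

    # ...

    # variant A: hand back the internal iterable's own iterator
    def __iter__(self):
        return iter(self._internal_iterable)

    # variant B: wrap the internal iterable in a generator
    def __iter__(self):
        yield from self._internal_iterable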

Is there any significant difference between variant A and B? Variant A returns the iterator object obtained by calling iter() on the internal iterable. Variant B returns a generator object that yields the values of the internal iterable. Is one or the other preferable for some reason? In collections.abc the yield from version is used. The return iter() variant is the pattern that I have used until now.
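To make the two return types concrete, here is a minimal sketch. The class names WrapperA and WrapperB are hypothetical stand-ins for the two variants of FancyNewClass, and the printed type name assumes CPython.

```python
import types

class WrapperA:
    """Hypothetical stand-in for variant A."""
    def __init__(self):
        self._internal_iterable = [1, 2, 3, 4, 5]

    def __iter__(self):
        # Returns the list's own iterator; no extra object in between.
        return iter(self._internal_iterable)

class WrapperB:
    """Hypothetical stand-in for variant B."""
    def __init__(self):
        self._internal_iterable = [1, 2, 3, 4, 5]

    def __iter__(self):
        # Makes __iter__ a generator function; calling it returns a generator.
        yield from self._internal_iterable

it_a = iter(WrapperA())
it_b = iter(WrapperB())

print(type(it_a).__name__)                    # list_iterator (on CPython)
print(isinstance(it_b, types.GeneratorType))  # True
print(list(it_a) == list(it_b))               # True: same values either way
```

Both variants produce the same sequence of values; what differs is the kind of object that sits between the caller and the internal iterable.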

PeterE

1 Answer


The only significant difference is what happens when an exception is raised from within the iterable. Using return iter() your FancyNewClass will not appear on the exception traceback, whereas with yield from it will. It is generally a good thing to have as much information on the traceback as possible, although there could be situations where you want to hide your wrapper.
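The traceback difference can be checked with a small sketch. All names here (failing_source, ViaIter, ViaYieldFrom, frames_in_traceback) are hypothetical and chosen only for illustration.

```python
import traceback

def failing_source():
    """A generator that fails after yielding one item."""
    yield 1
    raise RuntimeError("boom")

class ViaIter:
    def __iter__(self):
        # The wrapper's frame is gone by the time the error is raised.
        return iter(failing_source())

class ViaYieldFrom:
    def __iter__(self):
        # The wrapper's generator frame is still active while delegating.
        yield from failing_source()

def frames_in_traceback(wrapper):
    """Iterate over wrapper and return the function names on the traceback."""
    try:
        for _ in wrapper:
            pass
    except RuntimeError as exc:
        return [frame.name for frame in traceback.extract_tb(exc.__traceback__)]
    return []

print(frames_in_traceback(ViaIter()))
# ['frames_in_traceback', 'failing_source']
print(frames_in_traceback(ViaYieldFrom()))
# ['frames_in_traceback', '__iter__', 'failing_source']
```

With return iter() the wrapper's __iter__ has already returned when the exception occurs, so it leaves no frame on the traceback. With yield from the wrapper's generator frame is suspended in the middle of the delegation and therefore shows up.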

Other differences:

  • return iter has to load the name iter from globals - this is potentially slow (although unlikely to significantly affect performance) and could be messed with (although anyone who overwrites globals like that deserves what they get).

  • With yield from you can insert other yield expressions before and after (although you could equally use itertools.chain).

  • As presented, the yield from form discards any generator return value (i.e. the value carried by StopIteration(value)). You can fix this by writing return (yield from iterator) instead.
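The last point is easiest to see with a generator that returns a value. The following is a minimal sketch; inner, discards, forwards and drain are hypothetical names used only for this illustration.

```python
def inner():
    yield 1
    yield 2
    return "done"  # becomes StopIteration("done") when the generator finishes

def discards():
    # The value of the yield from expression is thrown away.
    yield from inner()

def forwards():
    # The value of the yield from expression becomes this generator's return value.
    return (yield from inner())

def drain(gen):
    """Consume gen by hand and return (items, value attached to StopIteration)."""
    items = []
    while True:
        try:
            items.append(next(gen))
        except StopIteration as stop:
            return items, stop.value

print(drain(discards()))  # ([1, 2], None)
print(drain(forwards()))  # ([1, 2], 'done')
```

A plain for loop never looks at this value, so the difference only matters when the wrapper is itself consumed via yield from or by calling next() manually.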

Here's a test comparing the disassembly of the two approaches and also showing exception tracebacks: http://ideone.com/1YVcSe

Using return iter():

  3           0 LOAD_GLOBAL              0 (iter)
              3 LOAD_FAST                0 (it)
              6 CALL_FUNCTION            1 (1 positional, 0 keyword pair)
              9 RETURN_VALUE
Traceback (most recent call last):
  File "./prog.py", line 12, in test
  File "./prog.py", line 10, in i
RuntimeError

Using return (yield from):

  5           0 LOAD_FAST                0 (it)
              3 GET_ITER
              4 LOAD_CONST               0 (None)
              7 YIELD_FROM
              8 RETURN_VALUE
Traceback (most recent call last):
  File "./prog.py", line 12, in test
  File "./prog.py", line 5, in bar
  File "./prog.py", line 10, in i
RuntimeError
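For readers who want to reproduce a comparison like this locally, here is a rough sketch. It is not the script behind the link: foo and loads_global_iter are hypothetical names, and the exact opcodes printed by dis vary between Python versions (newer interpreters no longer emit YIELD_FROM, for example).

```python
import dis

def foo(it):
    # return iter() form: looks up the name iter when called
    return iter(it)

def bar(it):
    # return (yield from) form: no name lookup for iter
    return (yield from it)

def loads_global_iter(func):
    """Report whether func's bytecode looks up the name iter via LOAD_GLOBAL."""
    return any(
        ins.opname == "LOAD_GLOBAL" and ins.argval == "iter"
        for ins in dis.get_instructions(func)
    )

dis.dis(foo)  # output depends on the interpreter version
dis.dis(bar)
print(loads_global_iter(foo), loads_global_iter(bar))  # True False
```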
ecatmur
  • Assuming the goal is to simply transfer/expose the underlying iterable, i.e. no modification/injection of other items, I think I will stick with `iter()` because: 1) if I do not add any additional behavior, I do not need it to appear on the exception traceback (I think?). 2) to avoid using the global `iter()` I could call `return internal_iterable.__iter__()`. Regarding the third point you raised, I'll have to think about what that means. – PeterE May 12 '15 at 11:08