
I am trying to understand how the for x in y statement works in python. I found the documentation here: https://docs.python.org/3/reference/compound_stmts.html#for. It says that the expression y is evaluated once and must yield an iterable object.

The following code prints the numbers 1,2,3,4,5 even though my class does not implement __iter__ (which is my understanding of being an iterable).

class myclass:
    def __init__(self):
        self.x = [1,2,3,4,5]
    def __getitem__(self,index):
        return self.x[index]
m = myclass()
for i in m:
    print(i)

I know that there is a built-in method iter() that returns an iterator for a sequence object using its .__getitem__() function and a counter that starts at 0.

My guess is that python is calling the iter() function on the expression y in the for x in y statement. So it is converting my object that implements .__getitem__ into an iterator, and when my object raises an IndexError exception during the .__getitem__ call, the iterator turns this into a StopIteration exception, and the for loop ends.

Is this correct? Right or wrong, is this explained in the documentation somewhere, or do I need to go look inside the source code of the implementation?

user3281410
  • https://docs.python.org/3/c-api/iterator.html – Barmar Aug 03 '20 at 00:22
  • `__getitem__()` with sequential indexes, terminated by `IndexError`, was the original version of Python's iterator protocol, dating back to the Python 1.x days. Apparently, it's still supported, for backward compatibility. But yes, the `iter()` function is exactly equivalent to what a `for` loop does with its sequence object. – jasonharper Aug 03 '20 at 00:22
  • https://www.python.org/dev/peps/pep-0234/ – Barmar Aug 03 '20 at 00:23
  • @RichieV: The fact that lists have an `__iter__` method is actually completely irrelevant. This code would behave the same if lists did not have an `__iter__`. – user2357112 Aug 03 '20 at 00:24
  • Right -- this would be a way of making lists iterable if they weren't already. – Barmar Aug 03 '20 at 00:24
  • @jasonharper - interesting. I was surprised this works. I think it would be worth making that the answer. – tdelaney Aug 03 '20 at 00:29
  • @Barmar It would appear from reading the link you posted that the "PyObject_GetIter()" function is responsible for both the built-in `iter` function and the behavior of the for loop. That seems like a pretty conclusive answer to my question. – user3281410 Aug 03 '20 at 00:30
  • @RichieV: `iter` falls back to indexing with sequential integer indices if an object has no `__iter__`. – user2357112 Aug 03 '20 at 00:32
  • @RichieV - the iterator doesn't see any of the internal attributes such as the list in `self.x` (suppose the object had 2 lists). It just blindly throws 0, 1, 2, etc... at `__getitem__` until it fails. – tdelaney Aug 03 '20 at 00:33
  • Got it, I see "An object can be iterated over with for if it implements __iter__() or __getitem__()." From Barmar's reference. – RichieV Aug 03 '20 at 00:37
  • Does this answer your question? [Why does defining \_\_getitem\_\_ on a class make it iterable in python?](https://stackoverflow.com/questions/926574/why-does-defining-getitem-on-a-class-make-it-iterable-in-python) – mkrieger1 Aug 03 '20 at 22:45
  • @mkrieger1 That question explains why `__getitem__` makes a class iterable. My question was whether the way `y` in `for x in y` is evaluated into an iterator was the same as how the built-in function `iter(y)` constructs an iterator from `y`. A comment above provides a reference where it is explained that the implementation of both of these things is the same. The code I posted was just an example that helped motivate the question. – user3281410 Aug 04 '20 at 02:34

2 Answers


According to PEP 234, which was helpfully linked in the comments above,

iter(obj) calls PyObject_GetIter(obj).

It has the following to say about for loops:

The Python bytecode generated for for loops is changed to use new opcodes, GET_ITER and FOR_ITER, that use the iterator protocol rather than the sequence protocol to get the next value for the loop variable. This makes it possible to use a for loop to loop over non-sequence objects that support the tp_iter slot. Other places where the interpreter loops over the values of a sequence should also be changed to use iterators.

Finally, https://docs.python.org/3/library/dis.html#opcode-GET_ITER explains that GET_ITER is equivalent to calling iter.
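One way to check this on your own interpreter is to disassemble a small loop and look for the two opcodes. This is only a sketch: the function name loop is made up for illustration, and opcode names are a CPython implementation detail that can change between versions.

```python
import dis


def loop(y):
    # Hypothetical helper: a bare for loop, so the bytecode stays small.
    for x in y:
        pass


# Collect the opcode names the compiler emitted for the loop above.
opnames = [ins.opname for ins in dis.get_instructions(loop)]

# On CPython, both of these are expected to be present.
print("GET_ITER" in opnames, "FOR_ITER" in opnames)
```

Running dis.dis(loop) instead prints the full listing if you want to see where the opcodes sit relative to the loop body.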

Putting this together, it seems that the for loop behaves the same as the built-in iter function.
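As a rough illustration, the loop from the question can be written out by hand with iter() and next(). The helper manual_for is a made-up name, and this is a sketch of the equivalence, not how the interpreter is actually implemented.

```python
class MyClass:
    """Same shape as the class in the question: __getitem__ only, no __iter__."""

    def __init__(self):
        self.x = [1, 2, 3, 4, 5]

    def __getitem__(self, index):
        return self.x[index]


def manual_for(obj):
    """Collect items the way a for loop would, with iter() and next() spelled out."""
    result = []
    it = iter(obj)  # works even though MyClass has no __iter__
    while True:
        try:
            item = next(it)
        except StopIteration:
            # The IndexError raised inside __getitem__ reaches us as StopIteration.
            break
        result.append(item)
    return result


m = MyClass()
print(manual_for(m))        # [1, 2, 3, 4, 5]
print([i for i in m])       # [1, 2, 3, 4, 5]
```

Both lines print the same list, which is consistent with the for loop and iter() going through the same machinery.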

user3281410

Happy Pythoning.

Before Python 2.2, __getitem__ was the only method a for loop could use to iterate over an object. Python 2.2 introduced the __iter__ method. With __getitem__, the loop passes 0 as the first index and increases it by 1 on every iteration, and it exits once IndexError is raised. For backward compatibility, the __getitem__ fallback has not been deprecated (as of Python 3.8.5). So the loop now looks for __iter__ on the object first, and only if that method is missing does it fall back to __getitem__. Since Python 2.2 came out a long time ago, developers today prefer the __iter__ and __next__ methods, which make the distinction between an iterable and an iterator explicit.
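The lookup order described above can be sketched in pure Python. Everything here is hypothetical naming for illustration (legacy_iter, OnlyGetitem, Both); the real fallback lives inside the interpreter, and this generator only mimics its observable behaviour.

```python
def legacy_iter(obj):
    """Hypothetical stand-in for the __getitem__ fallback: index 0, 1, 2, ..."""
    index = 0
    while True:
        try:
            value = obj[index]
        except IndexError:
            return  # ending the generator surfaces as StopIteration
        yield value
        index += 1


class OnlyGetitem:
    """Has __getitem__ but no __iter__."""

    def __getitem__(self, index):
        if index >= 3:
            raise IndexError(index)
        return index * 10


class Both(OnlyGetitem):
    """Has both methods, so __iter__ should win."""

    def __iter__(self):
        return iter(["from __iter__"])


print(list(legacy_iter(OnlyGetitem())))  # [0, 10, 20]
print(list(OnlyGetitem()))               # [0, 10, 20]
print(list(Both()))                      # ['from __iter__']
```

The last line shows that __getitem__ is never consulted once __iter__ is defined.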

Here is one more __getitem__ example, to go alongside yours, that helps show the difference between an iterator and an iterable.

class myclass:
    def __init__(self):
        pass
    def __getitem__(self,index):
        return index
m = myclass()
for i in m:
    print(i)

The code above loops forever: the index is 0, then 1, then 2, and so on, and IndexError is never raised.

Output of above code:

0
1
2
3
4
5
6
7
8
9
10
11
.
.
.
.
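If you want to experiment with a class like this without hanging your terminal, one option is to cap the loop with itertools.islice. The class name Endless is made up here, and this is just a sketch of one way to bound the iteration.

```python
from itertools import islice


class Endless:
    """Same idea as the class above: __getitem__ never raises IndexError."""

    def __getitem__(self, index):
        return index


# islice stops asking for items after the first five, so the loop terminates.
first_five = list(islice(Endless(), 5))
print(first_five)  # [0, 1, 2, 3, 4]
```

This works because islice obtains an iterator from its argument just as a for loop does, so the __getitem__ fallback applies here too.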

Thanks for sharing the question.