For loops are mentioned in two places in the Python docs (that I have found). I did try to find the source code for for loops in CPython, but to no avail.
Here's what I'm trying to understand: I had assumed for loops were a sort of while i <= len(iterable) then loop or if i <= len(iterable) then loop. I'm not sure that's the case, and here's why:
y = [1, 2, 3, 4]
for x in y:
    print(y)
    print(y.pop(0))
Output:
[1, 2, 3, 4]
1
[2, 3, 4]
2
I know you shouldn't modify an iterable while you're looping through it. I know that. But still, this isn't a random result: it happens every time this code is run, and there are always 2 loops. You also get 2 loops if you run pop() instead.
Maybe even curiouser, it seems like you reliably get (len(y) + 1) // 2 loops (at least using .pop(); I haven't done much other testing):
- if y = [1, 2] there is one loop
- if y = [1, 2, 3] there are two loops
- if y = [1, 2, 3, 4] there are still two loops
- if y = [1, 2, 3, 4, 5] there are three loops
- if y = [1, 2, 3, 4, 5, 6] there are still three loops
- if y = [1, 2, 3, 4, 5, 6, 7] there are four loops
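To check that pattern on more than a handful of hand-run cases, here is a small sketch that counts the loops for lists of length 1 to 7. The helper name count_iterations and its pop_index parameter are made up for this illustration; it only tests the observation and does not explain it.

```python
def count_iterations(n, pop_index=0):
    """Count how often the loop body runs when every pass pops one item."""
    y = list(range(1, n + 1))
    loops = 0
    for _ in y:
        y.pop(pop_index)  # shrink the list by one element on every pass
        loops += 1
    return loops


for n in range(1, 8):
    # Columns: list length, loops with pop(0), loops with pop(), (n + 1) // 2
    print(n, count_iterations(n, 0), count_iterations(n, -1), (n + 1) // 2)
```

On the lengths tried here the three counts agree, whether the item is popped from the front or from the back.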
According to the Python docs:
Note
There is a subtlety when the sequence is being modified by the loop (this can only occur for mutable sequences, e.g. lists). An internal counter is used to keep track of which item is used next, and this is incremented on each iteration. When this counter has reached the length of the sequence the loop terminates. This means that if the suite deletes the current (or a previous) item from the sequence, the next item will be skipped (since it gets the index of the current item which has already been treated). Likewise, if the suite inserts an item in the sequence before the current item, the current item will be treated again the next time through the loop. This can lead to nasty bugs that can be avoided by making a temporary copy using a slice of the whole sequence, e.g.,
for x in a[:]:
    if x < 0: a.remove(x)
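If I read that note literally, the "internal counter" could be modelled in pure Python roughly as below. This is only my sketch of the described behavior, not CPython's actual code, and the class name ListIteratorSketch is invented for the example. The key assumption is that the length is re-checked on every step rather than captured once at the start.

```python
class ListIteratorSketch:
    """Hypothetical stand-in for the list iterator described in the note."""

    def __init__(self, seq):
        self.seq = seq
        self.index = 0  # the "internal counter"

    def __iter__(self):
        return self

    def __next__(self):
        # The length is looked up again on each call, so shrinking the
        # list during the loop makes the loop end earlier.
        if self.index >= len(self.seq):
            raise StopIteration
        item = self.seq[self.index]
        self.index += 1
        return item


y = [1, 2, 3, 4]
seen = []
for x in ListIteratorSketch(y):
    seen.append(x)
    y.pop(0)

print(seen)  # [1, 3]
```

With y = [1, 2, 3, 4] this model also stops after two loops, and it visits 1 and 3, which is consistent with the counter advancing by one while the list shrinks by one.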
Can anyone explain the logic Python uses when it is looping through an iterable that is modified during the loop? How do iter and StopIteration, and __getitem__(i) and IndexError, factor in? What about iterables that aren't lists? And most importantly, is this in the docs, and if so, where?
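For reference, this is my current understanding of how those pieces fit together, written as a rough sketch: the for statement calls iter() on the object and then next() until StopIteration is raised, and iter() falls back to calling __getitem__ with 0, 1, 2, ... until IndexError when the object has no __iter__. The names run_for_loop and Countdown are hypothetical, and the sketch ignores break, continue, and the else clause.

```python
def run_for_loop(iterable, body):
    """Approximately what `for x in iterable: body(x)` does."""
    iterator = iter(iterable)  # uses __iter__, or falls back to __getitem__
    while True:
        try:
            x = next(iterator)
        except StopIteration:
            break  # the iterator is exhausted, so the loop ends
        body(x)


class Countdown:
    """Hypothetical class that defines only __getitem__, not __iter__."""

    def __getitem__(self, i):
        if i >= 3:
            raise IndexError(i)  # signals the end of the sequence
        return 3 - i


collected = []
run_for_loop(Countdown(), collected.append)
print(collected)  # [3, 2, 1]
```

If this is right, then how a loop reacts to modification depends on the iterator each type returns, which is why I'm asking about non-list iterables as well.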
As @Yang K suggested:
y = [1, 2, 3, 4, 5, 6, 7]
for x in y:
    print("y: {}, y.pop(0): {}".format(y, y.pop(0)))
    print("x: {}".format(x))
# Output
y: [2, 3, 4, 5, 6, 7], y.pop(0): 1
x: 1
y: [3, 4, 5, 6, 7], y.pop(0): 2
x: 3
y: [4, 5, 6, 7], y.pop(0): 3
x: 5
y: [5, 6, 7], y.pop(0): 4
x: 7