Edit: I know to iterate over a copy of my list when I want to modify the original. However, the only explanation I've ever received on what's wrong with modifying a list while iterating over it is that "it can lead to unexpected results."

Consider the following:

lst = ['a', 'b', 'c', 'd', 'e']
for x in lst:
    lst.remove(x)
print(lst)

Here is my attempt at explaining what actually happens when one modifies a list while iterating over it. Note that line 2 is equivalent to for i in range(len(lst)):, and that len(lst) decreases by 1 with every iteration.

len(lst) begins as 5.

When i = 0, we have lst[i] = 'a' being removed, so lst = ['b', 'c', 'd', 'e']. len(lst) decreases to 4.

When i = 1, we have lst[i] = 'c' being removed, so lst = ['b', 'd', 'e']. len(lst) decreases to 3.

When i = 2, we have lst[i] = 'e' being removed, so lst = ['b', 'd']. len(lst) decreases to 2.

This is where I thought an IndexError would be raised, since i = 2 is not in range(2). However, the program simply outputs ['b', 'd']. Is it because i has "caught up" with len(lst)? Also, is my reasoning sound so far?

jessica
  • Possible duplicate of [How to modify list entries during for loop?](https://stackoverflow.com/questions/4081217/how-to-modify-list-entries-during-for-loop) – John Lyon May 02 '18 at 05:27
  • Copy your list, and use your indexing on that copy. – BcK May 02 '18 at 05:28
  • @jozzas she is asking how the iteration works. I didn't see that answered in your referenced question. – tdelaney May 02 '18 at 05:30
  • @BcK my intention is not to clear the list; I just want to understand what happens in the background. – jessica May 02 '18 at 05:31
  • I haven't dug through it, but the implementation is at https://github.com/python/cpython/blob/master/Objects/listobject.c look for `list_iterator`. – tdelaney May 02 '18 at 05:37
  • _'Note that line2 is equivalent to for i in range(len(lst))'_ is totally wrong, it's `iterable` –  May 02 '18 at 05:38
  • @tdelaney thanks for the link. Unfortunately I don't understand what it says. – jessica May 02 '18 at 05:44
  • Possible duplicate of [Removing from a list while iterating over it](https://stackoverflow.com/questions/6500888/removing-from-a-list-while-iterating-over-it) This duplicate's top answer discusses (with a good visualization) what is happening here. It is slightly different since not all elements are removed, but I think close enough to be a dupe. – user3483203 May 02 '18 at 05:50
  • @chrisz thanks for the link. While the top answer is enlightening, my question is why the caret shifts the way it does, which isn't explained in that link (or at least, I don't see it). – jessica May 02 '18 at 06:00

3 Answers

The C implementation is in the listiter_next function in listobject.c and the pertinent lines are

if (it->it_index < PyList_GET_SIZE(seq)) {
    item = PyList_GET_ITEM(seq, it->it_index);
    ++it->it_index;
    Py_INCREF(item);
    return item;
}

it->it_seq = NULL;
Py_DECREF(seq);
return NULL;

The iterator returns an object if it's still in range (it->it_index < PyList_GET_SIZE(seq)) and returns NULL otherwise, which signals that the iteration is finished. It doesn't matter if you are off by 1 or a million; it's not an error.

The general reason for doing things this way is that iterators and iterables can be consumed in multiple places (consider a file object that is read inside a for loop). An outer loop shouldn't crash with an IndexError just because it has run out of things to do. It's not illegal or inherently "stupid" to change an object you are iterating over; you just need to know the consequences of your actions.
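To make the C snippet above more approachable, here is a rough Python model of the same logic. The class name ListIterSketch is made up for illustration; it is a sketch of the behaviour, not how CPython actually exposes its list iterator.

```python
class ListIterSketch:
    """Illustrative Python model of the C list iterator logic (hypothetical class)."""

    def __init__(self, seq):
        self.seq = seq    # reference to the live list, not a copy
        self.index = 0    # plays the role of it->it_index

    def __iter__(self):
        return self

    def __next__(self):
        # The bounds check is repeated against the current length on every call.
        if self.seq is not None and self.index < len(self.seq):
            item = self.seq[self.index]
            self.index += 1
            return item
        self.seq = None   # mirrors it->it_seq = NULL
        raise StopIteration


lst = ['a', 'b', 'c', 'd', 'e']
for x in ListIterSketch(lst):
    lst.remove(x)
print(lst)  # ['b', 'd']
```

After 'e' is removed the index is 3 and the length is 2, so the check simply fails and the loop ends; nothing ever tries to read lst[3].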

tdelaney

"Note that line2 is equivalent to for i in range(len(lst))"

I don't think it is.
The for loop in Python iterates over a list using the built-in next function. So at the end you get a StopIteration exception, raised by next when the iterator you are iterating over is exhausted. But this exception is automatically caught by the for loop.
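As a sketch of what that means, the loop from the question can be written out by hand with iter and next. The variable names it and seen are only for illustration.

```python
lst = ['a', 'b', 'c', 'd', 'e']
seen = []

it = iter(lst)            # the for loop obtains an iterator once, up front
while True:
    try:
        x = next(it)      # ask the iterator for the next item
    except StopIteration:
        break             # the for loop swallows this and exits quietly
    seen.append(x)
    lst.remove(x)

print(seen)  # ['a', 'c', 'e']
print(lst)   # ['b', 'd']
```

No index is ever computed by the loop itself, so there is nothing that could raise an IndexError; the iterator just reports that it is done.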

Bernhard
  • Could you explain why the equivalence doesn't hold? This is what I gathered from your second paragraph: when `i = 2`, a StopIteration exception is raised because `i` cannot increase any further (because `i` cannot go beyond `range(len(lst))`). StopIteration is what causes the program to exit the for loop, and therefore to terminate. Is that right? – jessica May 02 '18 at 05:47

You should be able to tell if you print the x in the process,

lst = [1, 2, 3, 4, 5]
for x in lst:
    print(x)
    lst.remove(x)

# 1
# 3
# 5

What happens is that you first remove 1 from the list. The iterator keeps track of a position, not a value: after the first pass its position is 1, and because removing 1 shifted every remaining item one slot to the left, position 1 now holds 3 rather than 2. So you proceed to 3 and remove it. The same thing happens again: instead of proceeding to 4, you proceed to 5 and remove it. The position is then 3, which is past the end of the remaining list [2, 4], so the iteration is complete.
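To see the skipping step by step, one option is to record the position alongside each value. This sketch uses enumerate, whose counter happens to match the list iterator's internal position here because both start at 0 and advance by one per pass; the name trace is just for illustration.

```python
lst = [1, 2, 3, 4, 5]
trace = []

for i, x in enumerate(lst):
    lst.remove(x)
    # Record the position, the value seen there, and the list after removal.
    trace.append((i, x, list(lst)))

for step in trace:
    print(step)
# (0, 1, [2, 3, 4, 5])
# (1, 3, [2, 4, 5])
# (2, 5, [2, 4])
```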

By the way, for x in lst is not the same as for x in range(len(lst)), this might be the point where you are confused.

In the first case, Python creates an iterator from your list and calls next on it at every iteration; when the iterator's position reaches the end of the list, StopIteration is raised, which the for loop catches to end the loop. In the second case, range(len(lst)) is evaluated once, before the loop starts, so the loop runs over a fixed set of indices no matter how the list changes. Python does not track the list for you there; you have to keep track of where you are yourself.
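For contrast, here is a small sketch of the index-based version of the question's loop. This one does raise the IndexError the question expected, because the loop keeps using the indices it computed before the list shrank. The name failed_at is only for illustration.

```python
lst = ['a', 'b', 'c', 'd', 'e']
failed_at = None

try:
    for i in range(len(lst)):   # range(5) is computed once, here
        del lst[i]              # the list shrinks, the range does not
except IndexError:
    failed_at = i               # first index that no longer exists

print(failed_at)  # 3
print(lst)        # ['b', 'd']
```

The list ends up the same, but the loop stops with an error at i = 3 instead of ending quietly.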

I suggest reading this article to learn the difference between an iterable and an iterator and how they work:

Iterator vs Iterable

BcK
  • "Because you removed one, instead of proceeding to 2, you proceed to 3". I don't see why this happens, if my explanation weren't correct. Could you also explain why `for x in lst` and `for i in range(len(lst))` are not the same? – jessica May 02 '18 at 05:43
  • @jessica Clarified the answer a little bit more. – BcK May 02 '18 at 05:43