19

It is not safe to modify the sequence being iterated over in the loop (this can only happen for mutable sequence types, such as lists). If you need to modify the list you are iterating over (for example, to duplicate selected items) you must iterate over a copy. The slice notation makes this particularly convenient:

   >>> for x in a[:]: # make a slice copy of the entire list
   ...    if len(x) > 6: a.insert(0, x)
   ... 
   >>> a
   ['defenestrate', 'cat', 'window', 'defenestrate']

why is it not safe to just do for x in a ??

BartoszKP
  • 34,786
  • 15
  • 102
  • 130
Alex Gordon
  • 57,446
  • 287
  • 670
  • 1,062

3 Answers3

17

Without getting too technical:

If you're iterating through a mutable sequence in Python and the sequence is changed while it's being iterated through, it is not always entirely clear what will happen. If you insert an element in the sequence while iterating through it, what would now reasonably be considered the "next" element in the sequence? What if you delete the next object?

For this reason, iterating through a mutable sequence while you're changing it leads to unspecified behaviour. Anything might happen, depending on exactly how the list is implemented. :-)

pv2b
  • 3,359
  • 1
  • 15
  • 3
  • Are there any restrictions on "anything"? E.g. if you write a C extension and it crashes when the list it modified while being iterated, would that be considered horribly broken? – Justin Olbrantz Jul 25 '18 at 00:42
13

This is a common problem in many languages. If you have a linear data structure, and you are iterating over it, something must keep track of where you are in the structure. It might be a current index, or a pointer, but it's some kind of finger pointing to the "current place".

If you modify the list while the iteration is happening, that cursor will likely be incorrect.

A common problem is that you remove the item under the cursor, everything slides down one, the next iteration of the loop increments the cursor, and you've inadvertently skipped an item.

Some data structure implementations offer the ability to delete items while iterating, but most do not.

Ned Batchelder
  • 364,293
  • 75
  • 561
  • 662
  • 1
    In `for x in a:`, is `a` evaluated on each iteration? – haccks Mar 04 '15 at 16:11
  • 1
    No, it is evaluated once to get an iterator, then the iterator is used for the loop. – Ned Batchelder Mar 04 '15 at 17:15
  • OK. You said: *A common problem is that you remove the item under the cursor, everything slides down one, the next iteration of the loop increments the cursor, and you've inadvertently skipped an item.*: How that iterator would be informed to any modification to `a` if `a` is evaluated only once ? – haccks Mar 04 '15 at 17:24
  • 2
    This is the point: the iterator is not informed about the modifications, and therefore, it misses or skips items. – Ned Batchelder Mar 05 '15 at 16:43
  • It is said [here](https://docs.python.org/2/reference/compound_stmts.html#for) that: *The expression list is evaluated once; it should yield an iterable object.* Once the iterable object is yield then any modification to list is not informed to iterator. Then why item is being skipped or treated again? What exactly the statement "*The expression list is evaluated once*" means here? – haccks Mar 05 '15 at 17:13
  • 1
    Think of the iterator object as just a pointer into the sequence. For example, when iterating a list, the iterator has two values: a reference to the list, and the current index in the iteration. Every time around the loop, the index is incremented, and that element from the list is produced. If you change the list, the iterator is not informed. But it has a reference to the list, and the list has changed. It doesn't have a copy of the list. – Ned Batchelder Mar 06 '15 at 19:55
  • The Linux kernel provides `list_for_each_entry_safe` which explicitly avoids this problem, at the expense of another iterator variable. – Jonathon Reinhart Aug 26 '16 at 12:15
  • Does this mean doing a.append() is safe because it adds an element to the end of the list? or is that also implementation dependent? For instance will reallocation break this? Also, how and when is the loop termination condition checked (pointer >= len(a))? – Tayyar R Mar 30 '21 at 18:23
  • @TayyarR From the [documentation](https://docs.python.org/3/library/stdtypes.html#common-sequence-operations): "Forward and reversed iterators over mutable sequences access values using an index. That index will continue to march forward (or backward) even if the underlying sequence is mutated. The iterator terminates only when an IndexError or a StopIteration is encountered (or when the index drops below zero)." The implications are straightforward: `a.append` "is safe", but the appended elements will be iterated over as well. – Karl Knechtel Jul 07 '22 at 07:53
  • (Types implementing [the `Sequence` abstract base class](https://docs.python.org/3/library/collections.abc.html#collections-abstract-base-classes) also have to implement `len`, but that doesn't limit forward iteration - it just determines a starting point for reverse iteration, so that the mixin can implement `__reversed__`.) – Karl Knechtel Jul 07 '22 at 07:54
3

When you modify a collection that you are iterating over the iterator can behave unexpectedly, for example missing out items or returning the same item twice.

This code loops indefinitely when I run it:

>>> a = [ 'foo', 'bar', 'baz' ]
>>> for x in a:
...    if x == 'bar': a.insert(0, 'oops')

This is because the iterator uses a numeric index to keep track of where it is in the list. Adding an item at the beginning of the list results in the item 'bar' being returned again instead of the iterator advancing to the next item, because it has been shifted forward by the insertion.

Karl Knechtel
  • 62,466
  • 11
  • 102
  • 153
Mark Byers
  • 811,555
  • 193
  • 1,581
  • 1,452