3

When answering this question, I came across something I never thought about in Python (pointed by a user).

Basically, I already know (here's an interesting thread about it) that I have to make a copy when iterating while mutating a list in Python in order to avoid strange behaviors.

Now, my question is, is using enumerate overcoming that problem ?

test_list = [1,2,3,4]
for index,item in enumerate(test_list):
    if item == 1:
        test_list.pop(index)

Would this code be considered safe or I should use,

for index,item in enumerate(test_list[:]):
scharette
  • 9,437
  • 8
  • 33
  • 67
  • But what would be safer? You are modifying original list anyway, right? As long as you don't delete items from list during loop it's fine to not create copy of the list. – Shan Jul 30 '18 at 19:53
  • @Shan I guess I've always been wrong. I thought modifying a list while iterating over it was forbidden in python and could leads to strange outputs. For the record, I thought that what would be safer is my second example. – scharette Jul 30 '18 at 19:55
  • 3
    Mutating the list's size or item ordering can cause unexpected behavior. Changing a single item's value is likely to be harmless. – David Reed Jul 30 '18 at 20:06
  • @scharette Copying is always safe—unless allocating a new list and copying all the elements causes a `MemoryError` or takes so long that your algorithm is no longer useful, of course. The question is whether not copying is "safe enough", and that's a trickier question than maybe it should be. – abarnert Jul 30 '18 at 20:27
  • Yes, I understand. My question was really about mutating a list. I create an oversimplified example for the sake of the question. Should I change the statement for `test_list.pop(index)` for example. That would be more appropriate. – scharette Jul 30 '18 at 20:31

2 Answers2

5

First, let’s answer your direct question:

enumerate doesn’t help anything here. It works as if it held an iterator to the underlying iterable (and, at least in CPython, that’s exactly what it does), so anything that wouldn’t be legal or safe to do with a list iterator isn’t legal or safe to do with an enumerate object wrapped around that list iterator.


Your original use case—setting test_list[index] = new_value—is safe in practice—but I’m not sure whether it’s guaranteed to be safe or not.

Your new use case—calling test_list.pop(index)—is probably not safe.


The most obvious implementation of a list iterator is basically just a reference to the list and an index into that list. So, if you insert or delete at the current position, or to the left of that position, you’re definitely breaking the iterator. For example, if you delete lst[i], that shifts everything from i + 1 to the end up one position, so when you move on to i + 1, you’re skipping over the original i + 1th value, because it’s now the ith. But if you insert or delete to the right of the current position, that’s not a problem.

Since test_list.pop(index) deletes at or left of the current position, it's not safe even with this implementation. (Of course if you've carefully written your algorithm so that skipping over the value after a hit never matters, maybe even that's fine. But more algorithms won't handle that.)

It’s conceivable that a Python implementation could instead store a raw pointer to the current position in the array used for the list storage. Which would mean that inserting anywhere could break the iterator, because an insert can cause the whole list to get reallocated to new memory. And so could deleting anywhere, if the implementation sometimes reallocates lists on shrinking. I don't think the Python disallows implementations that do all of this, so if you want to be paranoid, it may be safer to just never insert or delete while iterating.

If you’re just replacing an existing value, it’s hard to imagine how that could break the iterator under any reasonable implementation. But, as far as I'm aware, the language reference and list library reference1 don't actually make any promises about the implementation of list iterators.2

So, it's up to you whether you care about "safe in my implementation", "safe in every implementation every written to date", "safe in every conceivable (to me) implementation", or "guaranteed safe by the reference".

I think most people happily replace list items during iteration, but avoid shrinking or growing the list. However, there's definitely production code out there that at least deletes to the right of the iterator.


1. I believe the tutorial just says somewhere to never modify any data structure while iterating over it—but that’s the tutorial. It’s certainly safe to always follow that rule, but it may also be safe to follow a less strict rule.

2. Except that if the key function or anything else tries to access the list in any way in the middle of a sort, the result is undefined.

abarnert
  • 354,177
  • 51
  • 601
  • 671
  • Already up-voted. Just so you know though, I edited in order to now mutate the list which is much more logical in the context if this question. – scharette Jul 30 '18 at 20:47
  • @scharette OK, edited. What you're doing is probably not safe even in practice. – abarnert Jul 30 '18 at 20:55
1

Since it was my comment which lead to this, I'll add my follow up:

enumerate can be thought of as a generator so it will just take a sequence (any sequence or iterator actually) and just "generate" an incrementing index to be yielded with each item from the passed sequence (so it's not making a copy or mutating the list anyway just using an "enumerate object").

In the case of the code in that question, you were never changing the length of the list you were iterating over and once the if statement was run the value of the element did not matter. So the copy wasn't needed, it will be needed when an element is removed as the iterator index is shared with the list and does not account for removed elements.

The Python Ninja has a great example of when you should use a copy (or move to list comprehension)

LinkBerest
  • 1,281
  • 3
  • 22
  • 30
  • You're right it was a bad use-case. I actually edit my answer which now change the lenght. Logically, this is what I wanted to know. – scharette Jul 30 '18 at 20:46
  • Yep, that's why I said "you're not changing the length" :). If you mutate the length it is not good practice to use enumerate without a copy. Even then I may not actually do that in practice but move to list comprehension or a custom generator to avoid lost or incorrect data (or an error). I'll edit this in more directly when I have a minute. – LinkBerest Jul 30 '18 at 21:02
  • I understand, i gave an upvote since for my past example you were entirely right. – scharette Jul 30 '18 at 21:04