
Example code:

found_list = [1, 2, 3, 4, 5]
remove_list = [1, 2, 3, 4, 5]

for candidate in found_list:
    if candidate in remove_list:
        found_list.remove(candidate)

# Expecting this to be an EMPTY list,
# because everything in found_list is also in remove_list.
print("NEW FOUND LIST: %s" % found_list)

In my actual case, I am using "os.walk()" and need to prune/ignore some subdirectories. The list in the example stands in for a list of subdirectory names (i.e. names like ".svn" and ".git") that I want to ignore.

The examples I have seen loop over the directory or file list with for and call list.remove() to remove the specific item directly.

However, items are not being deleted as I expect. I think this is a limitation (a bug?) in the way the for loop iterates over a list, i.e. if you delete the current item, the next item is skipped and never considered.

Is this documented anywhere?

My workaround is to build a new list and then assign it to the list given by os.walk().

Thanks.

Ch3steR
user3696153
  • You could change your loop to loop through the `remove_list` and then remove the element in the `found_list`. That way you're never changing the list that you're iterating over. – Borisonekenobi Aug 16 '23 at 18:54
  • This is a well-known mistake: you are modifying a list while iterating over it, which invalidates the index the iterator uses to access the list. – user19077881 Aug 16 '23 at 18:54
  • "My workaround solution is to create a new list and then assign that to the list given by os.walk()" so, to be clear, you need something like `dirs[:] = [d for d in dirs if d not in (".svn", ".git")]`, where you actually have to mutate the list using slice-assignment if you are working with os.walk IIRC – juanpa.arrivillaga Aug 16 '23 at 18:59

1 Answer

I re-opened this because the question asks where this behavior is documented; the asker already understands the mechanism and has found a workaround.

And yes, this is clearly documented.

It is described in the documentation of the built-in sequence types:

Forward and reversed iterators over mutable sequences access values using an index. That index will continue to march forward (or backward) even if the underlying sequence is mutated. The iterator terminates only when an IndexError or a StopIteration is encountered (or when the index drops below zero).
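To make the quoted paragraph concrete, here is a rough sketch that emulates the index-based iteration by hand on the question's data. The helper name `trace_removal` is made up for illustration, and this is only a model of the behavior described above, not how the list iterator is actually implemented.

```python
def trace_removal(found, remove):
    """Walk `found` by index, as the docs describe, removing matches in place.

    Returns the items that were actually visited.
    (Hypothetical helper, for illustration only.)
    """
    visited = []
    index = 0
    while index < len(found):      # stop once the index runs off the end
        item = found[index]
        visited.append(item)
        if item in remove:
            found.remove(item)     # later items shift left by one slot
        index += 1                 # the index marches forward regardless
    return visited


found_list = [1, 2, 3, 4, 5]
visited = trace_removal(found_list, [1, 2, 3, 4, 5])
print(visited)     # [1, 3, 5]
print(found_list)  # [2, 4]
```

Each removal shifts the remaining items left while the index still advances, so 2 and 4 are never visited and stay in the list.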

Note that all mutable built-in sequence types display this behavior:

>>> arr = bytearray(b'abcd')
>>> for i in arr:
...     arr.remove(i)
...
>>> arr
bytearray(b'bd')
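For the os.walk() case from the question, a minimal sketch of the slice-assignment approach mentioned in the comments might look like the following. The names `IGNORED` and `walk_pruned` and the temporary directory layout are assumptions made up for the demo; it relies on os.walk() running in its default top-down mode, where modifying the directory list in place controls which subdirectories are visited.

```python
import os
import tempfile

# Assumed set of directory names to skip (taken from the question's examples).
IGNORED = {".svn", ".git"}


def walk_pruned(root, ignored=IGNORED):
    """Yield os.walk() entries, skipping any directory whose name is in `ignored`."""
    for dirpath, dirnames, filenames in os.walk(root):
        # Slice assignment replaces the contents of the *same* list object
        # that os.walk() holds, so the pruned directories are not descended into.
        dirnames[:] = [d for d in dirnames if d not in ignored]
        yield dirpath, dirnames, filenames


# Throwaway directory tree, purely for demonstration.
with tempfile.TemporaryDirectory() as root:
    for sub in (os.path.join(".git", "objects"), ".svn", os.path.join("src", "pkg")):
        os.makedirs(os.path.join(root, sub))
    visited = sorted(os.path.relpath(p, root) for p, _, _ in walk_pruned(root))
    print(visited)
```

Because the comprehension builds a fresh list before anything is assigned, nothing is removed from a list while it is being iterated; a plain `dirnames = [...]` would only rebind the local name and os.walk() would not see the change.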
juanpa.arrivillaga