I ran into a puzzling situation while trying to remove all the empty strings from a list. My first attempt was the code below.

lst = ['###', '', '@@@', '', '$$$', '', '', '%%%', '', '&&&']
print("len:", len(lst))

iteration = 1
for item in lst:
    print(iteration, ":", lst, ":", len(lst), ":", "'%s'" % item)
    if item != '':
        pass
    else:
        lst.remove(item)  # mutates the list while it is being iterated
    iteration += 1

It produces the following output:

len: 10
1 : ['###', '', '@@@', '', '$$$', '', '', '%%%', '', '&&&'] : 10 : '###'
2 : ['###', '', '@@@', '', '$$$', '', '', '%%%', '', '&&&'] : 10 : ''
3 : ['###', '@@@', '', '$$$', '', '', '%%%', '', '&&&'] : 9 : ''
4 : ['###', '@@@', '$$$', '', '', '%%%', '', '&&&'] : 8 : ''
5 : ['###', '@@@', '$$$', '', '%%%', '', '&&&'] : 7 : '%%%'
6 : ['###', '@@@', '$$$', '', '%%%', '', '&&&'] : 7 : ''

NOTE: The code doesn't work as intended: some empty strings remain in the list. I later found better approaches, such as a list comprehension, [x for x in lst if x != ''], or building a new list and copying the non-empty strings into it. Both are more efficient than the code above because they avoid shifting the remaining elements every time an item is removed from the list.

I however have some questions regarding the output of the code above.

First, why doesn't the loop run ten times (the iteration number is on the far left), given that the original length of the list is ten? Second, if you look at the rightmost column, you'll see that it never prints the @@@ string. It skips it entirely! My theory is that the in keyword of the for loop is (most likely) sugar for an index, so that even if the length of the list changes, the index keeps increasing by one. This would explain why on the third iteration the value of item is the empty string and not @@@, since lst[2] is '' at that point.

Is there something I need to know when using the in operator?

Plakhoy
  • use `lst = [t for t in lst if t]`. – Elazar Jun 27 '13 at 23:08
  • I'm not asking how to do it. I know that. I want an explanation for the behaviour. – Plakhoy Jun 27 '13 at 23:18
  • Ashwini Chaudhary points you to an explanation. – Elazar Jun 27 '13 at 23:20
  • Sorry about that. The explanation is quite good. I think I get it now. It doesn't however explain why the loop runs 6 times instead of ten. – Plakhoy Jun 27 '13 at 23:42
  • @Segfault It does: the number of items keeps shrinking because of the removals, so at `index = 6` a `StopIteration` exception is raised and the for-loop terminates. (The for-loop handles `StopIteration` silently.) – Ashwini Chaudhary Jun 28 '13 at 21:30

2 Answers

Any time you remove items from a list while you are iterating over it, you will get weird results like this. If you iterate over a full slice of the list, lst[:], instead, no element gets skipped:

for item in lst[:]:

This creates a shallow copy to iterate over, so you can modify the original list without affecting the iteration.

This post describes what happens when you modify a list as you iterate over it.
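As a minimal sketch of the slice approach, applied to the list from the question (Python 3 syntax is assumed, and the `visited` list is only there for illustration):

```python
lst = ['###', '', '@@@', '', '$$$', '', '', '%%%', '', '&&&']

visited = []
for item in lst[:]:          # lst[:] is a shallow copy; the loop walks the copy
    visited.append(item)
    if item == '':
        lst.remove(item)     # mutates the original, not the copy being iterated

print(len(visited))  # 10
print(lst)           # ['###', '@@@', '$$$', '%%%', '&&&']
```

Every element is visited exactly once because the copy never changes length, while each removal happens on the original list.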

Stephan
  • Why copy the entire list and then remove items, when you can copy only the items you want into a new list? – Elazar Jun 27 '13 at 23:19

Internally, iterating over a list uses an index. This index is incremented by 1 on each pass through the loop and is used to retrieve the next element. If you delete an element while iterating, a later element is "moved into" the slot you're looking at, and the next iteration of the loop then looks at the slot after it, so the element that got moved never gets looked at.
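The index behaviour can be imitated with an explicit counter. The sketch below is only a rough model of the list iterator, not the interpreter's actual implementation, and `trace_loop` is a hypothetical helper name; it assumes Python 3.

```python
def trace_loop(lst):
    """Mimic `for item in lst` with an explicit index; return the items visited."""
    visited = []
    index = 0
    while index < len(lst):      # the length is re-checked on every pass
        item = lst[index]
        visited.append(item)
        if item == '':
            lst.remove(item)     # removes the first '', shifting later items left
        index += 1               # the index advances regardless of the removal
    return visited

lst = ['###', '', '@@@', '', '$$$', '', '', '%%%', '', '&&&']
visited = trace_loop(lst)
print(visited)  # ['###', '', '', '', '%%%', '']
print(lst)      # ['###', '@@@', '$$$', '%%%', '', '&&&']
```

The model visits six items, matching the question's output: after four removals the list holds six elements, so index 6 is already past the end. It also shows '@@@' being skipped, because it slid into index 1 just after that slot had been visited.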

A few solutions for this include:

  • Create a new list (using a list comprehension, perhaps) rather than removing items from the old one. The new list can be assigned back into the original container, if desired, using a slice assignment: lst[:] = (item for item in lst if item != ""). This is fastest because it avoids having to move items multiple times when deleting.
  • Iterate over the list in reverse order using reversed() so you are only ever "moving" items you have already seen.
  • Iterate over a copy of the list but modify the original list.
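A short sketch of the first two options, assuming Python 3. The variable names are illustrative, and the reverse variant deletes by index with `del` rather than calling `remove()`, which is a small departure from the wording above, chosen so that the element being examined is the one deleted.

```python
original = ['###', '', '@@@', '', '$$$', '', '', '%%%', '', '&&&']

# Option 1: build the filtered list and write it back with a slice assignment.
in_place = list(original)
alias = in_place                      # a second reference to the same list object
in_place[:] = [item for item in in_place if item != '']
print(alias is in_place)              # True: the list object itself was updated
print(in_place)                       # ['###', '@@@', '$$$', '%%%', '&&&']

# Option 2: walk the indices backwards and delete by position.
backwards = list(original)
for i in reversed(range(len(backwards))):
    if backwards[i] == '':
        del backwards[i]              # only shifts items that were already visited
print(backwards)                      # ['###', '@@@', '$$$', '%%%', '&&&']
```

The slice assignment matters when other names refer to the same list: a plain `in_place = [...]` would rebind only that one name and leave `alias` pointing at the unfiltered list.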

In your case you don't actually need to iterate over the list. You just want to delete the empty strings from it. So a fourth option is to just delete empty strings until there aren't any more!

try:
     while True:
         lst.remove('')   # deletes first empty string
except ValueError:        # no more empty strings
     pass
kindall
  • Where does the new element come from? Shouldn't it be the element in front of the current one? – Plakhoy Jun 27 '13 at 23:23
  • If you are looking at element 0, and delete it, then element 1 becomes element 0. Then, the next iteration of the loop begins, and you move on to the new element 1, which was element 2. The old element 0, which was element 1, is never looked at. – kindall Jun 28 '13 at 00:08