4

I am not able to understand why the following code goes in indefinite loop(when i am not using the copy list)

list = ["Mohit","kumar","sffsfshfsd"]
for w in list:
    if(len(w)) > 5:
        list.insert(0,w)
    print("inside loop")

print(list)  

The Above code prints inside loop indefinitely.

Now if in place of the list, i use a copy list like below works fine.

list = ["mohit","kumar","sffffgssddf"]

for w in list[:]:
    if len(w) > 5:
        list.insert(0,w)
    print("inside loop")

print(list)  

Now i have read in the python documentation that this is the behavior i will get but i want to understand the reason behind it. Thanks in advance.

Ashwini Chaudhary
  • 244,495
  • 58
  • 464
  • 504
kumarmo2
  • 1,381
  • 1
  • 20
  • 35

6 Answers6

5

The first for loop for w in list will use an iterator (from iter(list)) to retrieve and loop through each item in the list. This iterator does not fetch the entire list immediately - it is lazy, meaning it only gets one item at a time from the list, when it's needed. You can learn about the iteration protocol here, or iteration/generators and laziness here.

Looping through indexes 0 and 1 do nothing, as their string lengths are less than 6. At index 2, however, you add "sffsfshfsd" to the beginning of list. Now list has grown and there's something at index 3: "sffsfshfsd". Iteration then continues, picking the value from the next index (3), which gets added at the beginning again, moving the same value which was at index 3 to index 4... The cycle never ends.

In your second loop w in list[:] you create a copy of the entire list (by using a slice operator) and iterate through that. You're adding items to the original list, not the copy, so the iterator won't touch the items that you've added.

PS: I tried to search the Python source code (which is C) to prove that list iterators in fact use an incrementing index (as described above). I'm not well versed in reading Python's source code, but here's what I found in cpython/listobject.c:

Iterator creation, sets starting index to 0

2797 static PyObject *
2798 list_iter(PyObject *seq)
2799 {
....
2806     it = PyObject_GC_New(listiterobject, &PyListIter_Type);
....
2809     it->it_index = 0;
....
2813     return (PyObject *)it;
2814 }

next uses it->it_index from above and then increments it

2831 static PyObject *
2832 listiter_next(listiterobject *it)
2833 {
....
2844         item = PyList_GET_ITEM(seq, it->it_index);
2845         ++it->it_index;
....
2847         return item;
....
2853 }

Seems legit to me?

Bilal Akil
  • 4,716
  • 5
  • 32
  • 52
4

To simulate how list iteration works internally, let's rewrite your program using integer indices and a while loop.

lst = ["Mohit", "kumar", "sffsfshfsd"]
pos = 0
while pos < len(lst):
  word = lst[pos]
  print('lst=%s pos=%d word=%s' % (lst, pos, word))
  if len(word) > 5:
    lst.insert(0, word)
  pos += 1

The following shows what happens when you run this:

lst=['Mohit', 'kumar', 'sffsfshfsd'] pos=0 word=Mohit
lst=['Mohit', 'kumar', 'sffsfshfsd'] pos=1 word=kumar
lst=['Mohit', 'kumar', 'sffsfshfsd'] pos=2 word=sffsfshfsd
lst=['sffsfshfsd', 'Mohit', 'kumar', 'sffsfshfsd'] pos=3 word=sffsfshfsd
lst=['sffsfshfsd', 'sffsfshfsd', 'Mohit', 'kumar', 'sffsfshfsd'] pos=4 word=sffsfshfsd
lst=['sffsfshfsd', 'sffsfshfsd', 'sffsfshfsd', 'Mohit', 'kumar', 'sffsfshfsd'] pos=5 word=sffsfshfsd
lst=['sffsfshfsd', 'sffsfshfsd', 'sffsfshfsd', 'sffsfshfsd', 'Mohit', 'kumar', 'sffsfshfsd'] pos=6 word=sffsfshfsd
lst=['sffsfshfsd', 'sffsfshfsd', 'sffsfshfsd', 'sffsfshfsd', 'sffsfshfsd', 'Mohit', 'kumar', 'sffsfshfsd'] pos=7 word=sffsfshfsd
lst=['sffsfshfsd', 'sffsfshfsd', 'sffsfshfsd', 'sffsfshfsd', 'sffsfshfsd', 'sffsfshfsd', 'Mohit', 'kumar', 'sffsfshfsd'] pos=8 word=sffsfshfsd
lst=['sffsfshfsd', 'sffsfshfsd', 'sffsfshfsd', 'sffsfshfsd', 'sffsfshfsd', 'sffsfshfsd', 'sffsfshfsd', 'Mohit', 'kumar', 'sffsfshfsd'] pos=9 word=sffsfshfsd
...

(This goes on until you run out of either RAM or patience.)

As you can see, you keep shifting the final 'sffsfshfsd' to the right, so your code keeps looking at it and never stops.

This doesn't happen if you work on a copy since you're no longer modifying the list you're iterating over.

It also wouldn't happen if you were to either adjust the loop index after the insertion:

  if len(word) > 5:
    lst.insert(0, word)
    pos += 1  # account for the extra word
  pos += 1

or move the word instead of copying it:

  if len(word) > 5:
    lst.insert(0, lst.pop(pos))  # don't change len(lst)
NPE
  • 486,780
  • 108
  • 951
  • 1,012
2

That happens because you're appending "sffsfshfsd" to list on each iteration starting from third, so list never ends.

bakatrouble
  • 1,746
  • 13
  • 19
2

In the first code, you are inserting elements on the very same list that you're looping. That's why it keeps going at the inner loop, because list is growing indefinitely. In the second code you're making a copy, separating your for loop and your original list, so it will eventually stop.

user2341726
  • 294
  • 1
  • 9
2

Quoting from the docs:

Note: There is a subtlety when the sequence is being modified by the loop (this can only occur for mutable sequences, i.e. lists). An internal counter is used to keep track of which item is used next, and this is incremented on each iteration. When this counter has reached the length of the sequence the loop terminates. This means that if the suite deletes the current (or a previous) item from the sequence, the next item will be skipped (since it gets the index of the current item which has already been treated). Likewise, if the suite inserts an item in the sequence before the current item, the current item will be treated again the next time through the loop. This can lead to nasty bugs that can be avoided by making a temporary copy using a slice of the whole sequence, e.g.,

for x in a[:]:
    if x < 0: a.remove(x)

A for-loop over a list in Python maintains a counter internally and that is used to get the next item.

In your first code when it reaches sffsfshfsd(i.e index 2) you insert it to the start of list again, hence all items shift one place and now sffsfshfsd will be shifted to index 3 and will be picked up in the next iteration. And this goes on...

In your second code you're iterating over a copy of list and a copy of the list is not modified when you modify the original list.

lst = ["Mohit","kumar","sffsfshfsd"]
for i, w in enumerate(lst):
    print("Index: {i} | List: {list}".format(i=i, list=lst))
    if(len(w)) > 5:
        lst.insert(0, w)

Outputs:

Index: 0 | List: ['Mohit', 'kumar', 'sffsfshfsd']
Index: 1 | List: ['Mohit', 'kumar', 'sffsfshfsd']
Index: 2 | List: ['Mohit', 'kumar', 'sffsfshfsd']
Index: 3 | List: ['sffsfshfsd', 'Mohit', 'kumar', 'sffsfshfsd']
Index: 4 | List: ['sffsfshfsd', 'sffsfshfsd', 'Mohit', 'kumar', 'sffsfshfsd']
Index: 5 | List: ['sffsfshfsd', 'sffsfshfsd', 'sffsfshfsd', 'Mohit', 'kumar', 'sffsfshfsd']
Index: 6 | List: ['sffsfshfsd', 'sffsfshfsd', 'sffsfshfsd', 'sffsfshfsd', 'Mohit', 'kumar', 'sffsfshfsd']
Ashwini Chaudhary
  • 244,495
  • 58
  • 464
  • 504
1

I think this a very interesting question. I believe the answer should present itself in python source code implementation (sorry I could not find it and Hope someone expert could direct us to Python implementation)

for loop will not create a copy of your original data. Thus every time a new data added, the loop will continue. (I am not sure how for loop is achieved in implementation level, I do believe it might use iterator)

on the other hand [:], this operator will create a new copy of original data set. Thus no matter how you change original data set, the for loop is looping on a copy (which does not change).

Proof as following:

list = ["mohit","kumar","sffffgssddf"]
test = list
list.append("test")
print test 
#['mohit', 'kumar', 'sffffgssddf', 'test']

#clear data, let's try [:]
list = ["mohit","kumar","sffffgssddf"]
test = list[:]
list.append("test")
print test 
#['mohit', 'kumar', 'sffffgssddf']

Thus it is clear in your second example, your for loop is looping on a copy of original data. Thus original data set change will not affect the copy data. Thus your second example is working and first example will loop indefinitely.

Hope it helps.

White
  • 627
  • 4
  • 10
  • Thanks a lot @White. While others mostly addressed the first part of my question. u cleared the second part as well. thanks a lot . :) – kumarmo2 Jul 13 '17 at 09:28
  • @Manya you're welcome. Waiting for expert to show us Python core, that will be the definite answer. – White Jul 13 '17 at 09:37