81

I know that it is not allowed to remove elements while iterating a list, but is it allowed to add elements to a python list while iterating. Here is an example:

    for a in myarr:
      if somecond(a):
          myarr.append(newObj())

I have tried this in my code and it seems to work fine, however I don't know if it's because I am just lucky and that it will break at some point in the future?

EDIT: I prefer not to copy the list since "myarr" is huge, and therefore it would be too slow. Also I need to check the appended objects with "somecond()".

EDIT: At some point "somecond(a)" will be false, so there can not be an infinite loop.

EDIT: Someone asked about the "somecond()" function. Each object in myarr has a size, and each time "somecond(a)" is true and a new object is appended to the list, the new object will have a size smaller than a. "somecond()" has an epsilon for how small objects can be and if they are too small it will return "false"

WesDec
  • 2,098
  • 4
  • 19
  • 23
  • Copying a list doesn't take much time. It's a shallow copy, not a deep copy. – S.Lott Sep 20 '10 at 15:13
  • @S.Lott: The list can easily be over 100 million elements, and the above loop is repeated many many times. Even a shallow copy would be slow. – WesDec Sep 20 '10 at 15:18
  • 3
    Since you say you have done this, does your loop iterate over the appended items as well as those originally in the list? – Mike DeSimone Sep 20 '10 at 15:19
  • @WesDec: You appear to be talking about breadth-first search. A simple list is inappropriate for what you're doing. This sounds like some kind of tree. Not a simple list. – S.Lott Sep 20 '10 at 15:39
  • 3
    @WesDec: Also, don't add comments apologizing. Just focus on getting the question right. – S.Lott Sep 20 '10 at 15:46
  • Approaches to this problem will depend on whether or not it is desired for the iteration to iterate over the new element. – Karl Knechtel Jul 04 '22 at 01:49

12 Answers12

62

Why don't you just do it the idiomatic C way? This ought to be bullet-proof, but it won't be fast. I'm pretty sure indexing into a list in Python walks the linked list, so this is a "Shlemiel the Painter" algorithm. But I tend not to worry about optimization until it becomes clear that a particular section of code is really a problem. First make it work; then worry about making it fast, if necessary.

If you want to iterate over all the elements:

i = 0  
while i < len(some_list):  
  more_elements = do_something_with(some_list[i])  
  some_list.extend(more_elements)  
  i += 1  

If you only want to iterate over the elements that were originally in the list:

i = 0  
original_len = len(some_list)  
while i < original_len:  
  more_elements = do_something_with(some_list[i])  
  some_list.extend(more_elements)  
  i += 1
J0e3gan
  • 8,740
  • 10
  • 53
  • 80
Johnny
  • 645
  • 1
  • 5
  • 3
  • 20
    Python's lists are like C arrays or C++ vectors; indexing them is constant-time. This is actually a pretty good solution, in that it does what the OP's algorithm does, without relying on undefined behavior. – Petr Viktorin Sep 25 '11 at 10:52
  • 1
    But that is really c way, why not use at least: for i in range(len(some_list))? – borgr Jan 18 '22 at 07:09
28

well, according to http://docs.python.org/tutorial/controlflow.html

It is not safe to modify the sequence being iterated over in the loop (this can only happen for mutable sequence types, such as lists). If you need to modify the list you are iterating over (for example, to duplicate selected items) you must iterate over a copy.

Rohan Monga
  • 1,759
  • 1
  • 19
  • 31
  • I am aware of that text, but often iterators in other languages support appending new elements to the end of the list, while iterating. I was hoping that python also supports that since it would make things more simple and more readable. – WesDec Sep 21 '10 at 06:53
  • i agree but the only solution is to make a copy, which is also explained in the tutorial. – Rohan Monga Sep 21 '10 at 08:04
  • 2
    What about iterating over the list using an index rather than `for a in myarr`? I.e. `i = 0; while i < len(myarr): a = myarr[i]; i = i + 1; if somecond(a): myarr.append(newObj())` – HelloGoodbye Jan 23 '14 at 09:19
  • 5
    The documentation seems to have changed and now it doesn't say it isn't safe, just that it _can be tricky_ : **Code that modifies a collection while iterating over that same collection can be tricky to get right. Instead, it is usually more straight-forward to loop over a copy of the collection or to create a new collection:** – Motti May 06 '20 at 09:17
12

You could use the islice from itertools to create an iterator over a smaller portion of the list. Then you can append entries to the list without impacting the items you're iterating over:

islice(myarr, 0, len(myarr)-1)

Even better, you don't even have to iterate over all the elements. You can increment a step size.

Boris Dalstein
  • 7,015
  • 4
  • 30
  • 59
wheaties
  • 35,646
  • 15
  • 94
  • 131
  • Please do NOT do this. As far as I understand, this is undefined behavior and can lead to a crash. A Python's list is like a dynamic C-Array (or C++ std::vector) under the hood: adding an element might cause a re-allocation of the whole array to fit the new element. In case such re-allocation occurs, then I believe the islice() would point to the old, now-dangling memory. The most-upvoted answer definitely seems like the only fast and safe way to add elements while iterating a list. It says "it won't be fast" but this is incorrect, it's pretty much as fast as normal iteration. – Boris Dalstein Aug 13 '20 at 12:57
  • 2
    Sorry for the downvote I was wrong, you now get an upvote instead :) (note: I had to edit your answer to be able to change my vote) A good test was indeed as suggest in another answer: `a = [0]\nfor i in a:\n print(a)\n if i < 100:\n a.append(i+1)`. This correctly prints all integers from 0 to 100. This proves that python list iterators actually don't point directly to memory: they store a pointer to the list object and an index. – Boris Dalstein Aug 13 '20 at 13:23
10

In short: If you'are absolutely sure all new objects fail somecond() check, then your code works fine, it just wastes some time iterating the newly added objects.

Before giving a proper answer, you have to understand why it considers a bad idea to change list/dict while iterating. When using for statement, Python tries to be clever, and returns a dynamically calculated item each time. Take list as example, python remembers a index, and each time it returns l[index] to you. If you are changing l, the result l[index] can be messy.

NOTE: Here is a stackoverflow question to demonstrate this.

The worst case for adding element while iterating is infinite loop, try(or not if you can read a bug) the following in a python REPL:

import random

l = [0]
for item in l:
    l.append(random.randint(1, 1000))
    print item

It will print numbers non-stop until memory is used up, or killed by system/user.

Understand the internal reason, let's discuss the solutions. Here are a few:

1. make a copy of origin list

Iterating the origin list, and modify the copied one.

result = l[:]
for item in l:
    if somecond(item):
        result.append(Obj())

2. control when the loop ends

Instead of handling control to python, you decides how to iterate the list:

length = len(l)
for index in range(length):
    if somecond(l[index]):
        l.append(Obj())

Before iterating, calculate the list length, and only loop length times.

3. store added objects in a new list

Instead of modifying the origin list, store new object in a new list and concatenate them afterward.

added = [Obj() for item in l if somecond(item)]
l.extend(added)
Community
  • 1
  • 1
cizixs
  • 12,931
  • 6
  • 48
  • 60
6

You can do this.

bonus_rows = []
for a in myarr:
  if somecond(a):
      bonus_rows.append(newObj())
myarr.extend( bonus_rows )
S.Lott
  • 384,516
  • 81
  • 508
  • 779
  • This is no good since i also need to check the objects in the bonus_rows, and if somecond() is true for some of those i need to create new objects for those also. – WesDec Sep 20 '10 at 15:34
  • @WesDec: That's why you have "nested" loops. Surround all of this in a larger loop. – S.Lott Sep 20 '10 at 15:38
  • 2
    @WesDec: or stop using a simple list and use a tree. This sounds like breadth-first search, for which a list is the wrong structure. – S.Lott Sep 20 '10 at 15:45
  • @S.Lott Class based trees in pure python can be inefficient. – tejasvi88 Dec 03 '21 at 11:28
4

Access your list elements directly by i. Then you can append to your list:

for i in xrange(len(myarr)):
    if somecond(a[i]):
        myarr.append(newObj())
Mike DeSimone
  • 41,631
  • 10
  • 72
  • 96
eumiro
  • 207,213
  • 34
  • 299
  • 261
  • @Justin Peel - you were faster with this solution. – eumiro Sep 20 '10 at 15:12
  • This is the one, if you want to use the same loop code on the appended items while still keeping it DRY. Doesn't seem like any of the other solutions anticpate that use case. – NeilG Oct 27 '20 at 08:25
3

I had a similar problem today. I had a list of items that needed checking; if the objects passed the check, they were added to a result list. If they didn't pass, I changed them a bit and if they might still work (size > 0 after the change), I'd add them on to the back of the list for rechecking.

I went for a solution like

items = [...what I want to check...]
result = []
while items:
    recheck_items = []
    for item in items:
        if check(item):
            result.append(item)
        else:
            item = change(item)  # Note that this always lowers the integer size(),
                                 # so no danger of an infinite loop
            if item.size() > 0:
                recheck_items.append(item)
    items = recheck_items  # Let the loop restart with these, if any

My list is effectively a queue, should probably have used some sort of queue. But my lists are small (like 10 items) and this works too.

RemcoGerlich
  • 30,470
  • 6
  • 61
  • 79
3

You can use an index and a while loop instead of a for loop if you want the loop to also loop over the elements that is added to the list during the loop:

i = 0
while i < len(myarr):
    a = myarr[i];
    i = i + 1;
    if somecond(a):
        myarr.append(newObj())
HelloGoodbye
  • 3,624
  • 8
  • 42
  • 57
3

make copy of your original list, iterate over it, see the modified code below

for a in myarr[:]:
      if somecond(a):
          myarr.append(newObj())
shahjapan
  • 13,637
  • 22
  • 74
  • 104
  • The problem is that the list is huge so it would be very slow to copy it everytime. – WesDec Sep 20 '10 at 15:02
  • @WesDec: Please **measure** before declaring it "very slow". It's a shallow copy. It's pretty quick. – S.Lott Sep 20 '10 at 15:14
  • 1
    @S.Lott: The list can easily be over 100 million elements, and the above loop is repeated many many times. Even a shallow copy would be slow. – WesDec Sep 20 '10 at 15:18
2

Expanding S.Lott's answer so that new items are processed as well:

todo = myarr
done = []
while todo:
    added = []
    for a in todo:
        if somecond(a):
            added.append(newObj())
    done.extend(todo)
    todo = added

The final list is in done.

Mike DeSimone
  • 41,631
  • 10
  • 72
  • 96
1

Alternate solution :

reduce(lambda x,newObj : x +[newObj] if somecond else x,myarr,myarr)
Akshay Hazari
  • 3,186
  • 4
  • 48
  • 84
1

Assuming you are adding at the last of this list arr, You can try this method I often use,

arr = [...The list I want to work with]
current_length = len(arr)
i = 0
while i < current_length:
    current_element = arr[i]
    do_something(arr[i])
    # Time to insert
    insert_count = 1 # How many Items you are adding add the last
    arr.append(item_to_be inserted)
    # IMPORTANT!!!!  increase the current limit and indexer
    i += 1
    current_length += insert_count

This is just boilerplate and if you run this, your program will freeze because of infinite loop. DO NOT FORGET TO TERMINATE THE LOOP unless you need so.