Python: Remove items from list while iterating, which have non-trivial removal conditions

Question

This is not a dupe of other questions I have found such as:

Remove items from a list while iterating

Python: Removing list element while iterating over list

The problem is this: given a list of classes, such as abstracted sockets, what is the most Pythonic way to remove them, if there is a non-trivial way of determining if they should be removed?

sockets = [ socket1, socket2, socket3 ] # whatever
for sock in sockets:
    try:
        sock.close()
    except:
        pass
    else:
        remove the socket from the list here!

Cannot use the solution from either link. The "best" solution I can think of with my limited Python knowledge is to create a new list with only the ones that encountered exceptions appended to it.

sockets = [ socket1, socket2, socket3 ] # whatever
newsockets = []
for sock in sockets:
    try:
        sock.close()
    except:
        newsockets.append(sock)
sockets = newsockets

This still feels wrong, however. Is there a better way?

EDIT for moderator who ignored my explicit statement that the question this was marked as a dupe of is not a dupe.

To the first link I posted, you cannot use try/except in a list comprehension. To the second link (the one it was marked as a dupe of), as the comments say, that is a bad solution. remove(non-hashable item or item that doesn't have __eq__) does not work.

Please give us your definition of when something is *most pythonic*. If you don't have a clear definition for that, and a way to make decision about it, you'll always wonder if there is some greener grass out there. — Anthon, Dec 26 '16 at 18:34
@nexus66 `pop`-ping elements from a list over which you are walking with a `for` loop is guaranteed to give problems. — Anthon, Dec 26 '16 at 18:35
@Anthon Thanks for your comment. Can you explain which kind of problems ? Thanks. — Chiheb Nexus, Dec 26 '16 at 18:36
Creating a new list, is the best way. Why does that feel wrong? — Daniel, Dec 26 '16 at 18:36
@nexus66 when you iterate and remove at the same time some elements will be skipped in your loop. — Keiwan, Dec 26 '16 at 18:37
For every none-trivial way to filter you can write a function. So duplicate of your second link. — Daniel, Dec 26 '16 at 18:39
Your code is perfectly readable and I don't know why it feels wrong. You can't have `try..except` in a list comprehension or anything similar if that's what you would prefer. The only thing I might change is to assign the `newsockets` list to a slice of `sockets[:]` — Keiwan, Dec 26 '16 at 18:39
@Keiwan can you elaborate please? Also, anyone know why this was marked as duplicate when I specified that the other question I supposedly dupe'd does not answer my question and gave why it doesn't? — std''OrgnlDave, Dec 27 '16 at 00:49

score 3 · Answer 1 · answered Dec 26 '16 at 18:38

Usually the proper way algorithmically is to build a new list, then either replace the original list with the new list, or slice it in there, if you expect to remove a substantial number of the sockets:

sockets = [socket1, socket2, socket3] # whatever
new_sockets = []
for sock in sockets:
    try:
        sock.close()
    except:
        new_sockets.append(sock)

sockets[:] = new_sockets

The reason for this is that removal of an item that is at an arbitrary location (e.g. by using sockets.remove) within the list of n items will have time complexity of O(n) on average, and if you end up removing k items, then the total complexity will be of O(kn), whereas constructing a new list and replacing the original with new will have time complexity of the scale of O(n), i.e. it doesn't depend on the number of sockets removed.

Or, as sockets are hashable, perhaps you should use a set to store them instead. Then, you need to either construct a new collection from the set to make it possible to iterate and remove at the same time (here I am using a list):

sockets = {socket1, socket2, socket3}
for sock in list(sockets):
    try:
        sock.close()
    except: 
        pass
    else:
        sockets.discard(sock)

Creating a new list has O(n) time complexity, but set.discard has only O(1) complexity.

Another way which gets rid of copying the data structure is to use another set for items that are to be removed:

sockets = {socket1, socket2, socket3}
to_remove = set()   # create an initially empty set
for sock in sockets:
    try:
        sock.close()
    except: 
        pass
    else:
        to_remove.add(sock)

# remove all sockets that are in to_remove in one 
# operation from sockets.
sockets.difference_update(to_remove)

This has a favourable running time over the other set example in case there are very few if any items to be removed.

Excellent ideas and thank you for giving an informed and educational answer! — std''OrgnlDave, Dec 27 '16 at 00:56

score 0 · Answer 2 · answered Dec 27 '16 at 09:35

This answer only adds a little explanation as to why you might want to assign the result to a slice of your original list. @Antti Happala has already given some nice alternatives for potentially improving the runtime.

When you write

sockets = newsockets

you are assigning the new list to the variable named sockets, whereas if you say

sockets[:] = newsockets

you are essentially replacing the entries of the same list with the new list.

The difference becomes more clear if you previously had another reference to your sockets list like so:

s = sockets

then, after assigning without slicing, s would still point to the old list with none of the sockets removed and sockets would refer to the new list.

Using the slicing version both s and sockets would refer to the same, updated list after removing some elements.

Usually this in-place replacement is what you want, however, there are also times when you might specifically want s and sockets to refer to different versions of the list, in which case you should not assign to a list slice. It all depends.

Python: Remove items from list while iterating, which have non-trivial removal conditions

2 Answers2