for loop, and url.remove() only affecting every other entry

Question

I'm getting some confusing behaviour when I remove entries from a list inside a for loop (I'm cleaning out invalid URLs):

urls = ['http://a.com/?mail=a@b.com', 'mailto:a@a.com', 'mailto:a@b.com', 'mailto:a@c.com', 'mailto:a@d.com']

for s in urls:
    if '@' in s and '?' not in s:
        urls.remove(s)

print(urls)

The output is:

['http://a.com/?mail=a@b.com', 'mailto:a@b.com', 'mailto:a@d.com']

It consistently removes only every other entry, so I assume my understanding of Python is not correct.

I looked into list comprehension with Python and ended up with:

urls = [s for s in urls if not ('?' not in s and '@' in s)]

This does what I want it to.

Is that the best way? And can someone explain the behaviour of the loop, because I don't get it.

Thanks

Sorry, the expected output is ['http://a.com/?mail=a@b.com']. Formatting is strange in comments. To be clear, the first entry in the list should remain as is. – Stuart Grierson Nov 04 '18 at 16:20

score 2 · Answer 1 · answered Nov 04 '18 at 16:23

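The problem with your first solution is that you iterate over a list while deleting entries from it. Each call to remove() shifts the remaining items one position to the left, but the loop's internal index still advances by one, so the item that slides into the freed slot is never visited. That is why every other entry survives. The topic is discussed here, for example: How to remove items from a list while iterating?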
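
To make the skipping visible, here is a minimal sketch that reruns the loop from the question and records which items it actually visits. The visited list is a hypothetical helper added only for illustration; it is not part of the original code.

```python
urls = ['http://a.com/?mail=a@b.com', 'mailto:a@a.com', 'mailto:a@b.com',
        'mailto:a@c.com', 'mailto:a@d.com']

visited = []  # hypothetical helper: records every item the loop sees

for s in urls:
    visited.append(s)
    if '@' in s and '?' not in s:
        # Removing s shifts all later items one slot to the left,
        # so the next loop step jumps over the item that moved into s's slot.
        urls.remove(s)

print(visited)  # ['http://a.com/?mail=a@b.com', 'mailto:a@a.com', 'mailto:a@c.com']
print(urls)     # ['http://a.com/?mail=a@b.com', 'mailto:a@b.com', 'mailto:a@d.com']
```

The loop never looks at 'mailto:a@b.com' or 'mailto:a@d.com', so they are never tested and never removed.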

Gregor

Thanks, that is now clear - I just couldn't find it. – Stuart Grierson Nov 04 '18 at 16:25

Austin · Answer 2 · 2018-11-04T16:23:53.293

0

If you want to remove items from a list while iterating over it, iterate over a copy instead. urls[:] creates a shallow copy of urls, and the loop walks that copy while remove() modifies the original. This avoids the skipped entries you get when iterating over the list you are modifying:

urls = ['http://a.com/?mail=a@b.com', 'mailto:a@a.com', 'mailto:a@b.com', 'mailto:a@c.com', 'mailto:a@d.com']

for s in urls[:]:
    if '@' in s and '?' not in s:
        urls.remove(s)

print(urls)

That said, I would prefer your list-comprehension version, which is more concise and Pythonic.
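
If the filter needs to do more than a one-line character check, one possible variation is to move the condition into a named function and assign the comprehension back through a slice, which updates the existing list object instead of rebinding the name. This is only a sketch: is_bare_mail_link and same_list are hypothetical names introduced here, and the predicate simply repeats the condition from the question.

```python
def is_bare_mail_link(s):
    """Hypothetical predicate: the same condition used in the question."""
    return '@' in s and '?' not in s


urls = ['http://a.com/?mail=a@b.com', 'mailto:a@a.com', 'mailto:a@b.com',
        'mailto:a@c.com', 'mailto:a@d.com']
same_list = urls  # a second reference to the same list object

# The comprehension builds a new list first; the slice assignment then
# replaces the contents of the existing list in one step.
urls[:] = [s for s in urls if not is_bare_mail_link(s)]

print(urls)               # ['http://a.com/?mail=a@b.com']
print(same_list is urls)  # True
```

Plain urls = [...] works just as well when nothing else holds a reference to the list; the slice form only matters if other code should see the filtered result too.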

edited Nov 04 '18 at 16:23

I believe this is a good solution but you should explain it. – Cole Nov 04 '18 at 16:23
@Cole, how about now? I was in the process of writing the explanation. :) – Austin Nov 04 '18 at 16:28
That's a cool tip, and useful when I need to do a bit more than just remove based on characters - and still having code I can actually read (regardless of how pythonic it might be) :) – Stuart Grierson Nov 04 '18 at 16:52
