
I thought this task would be easy, but I've been sitting here for hours and can't get it to work.

I have two lists:

whitelist = ['https://www.test','google.com','?image=']

all_urls = ['https://www.example.com','https://www.google.com','https://www.example.com','https://www.example.com','https://www.example.com','https://www.test.com','https://www.example.com','https://www.example.com?image=123456','https://www.test.com','https://www.example.com?image=lorem','https://www.example.com']

All I want to do is remove every element from all_urls that contains one of the whitelist elements as a substring, but every time I loop through it, it either skips some iterations or behaves strangely in some other way.

I've mostly tried something like

for url in all_urls:
    for whitelist_element in whitelist:
        if whitelist_element in url:
            all_urls.remove(url)

But as I mentioned above, it doesn't work and, for example, seems to check and remove only every second element of the all_urls list. Thanks for your help :-)

SERPY
  • 2
    `all_urls = [url for url in all_urls if not any(w in url for w in whitelist)]`… – deceze Sep 04 '20 at 15:05
  • 1
    Removing elements from a list whilst iterating over it will usually cause problems, because the elements will move down and cause the upstream indexes to change. You can fix that issue by working from back to front instead (i.e. do `for url in reversed(all_urls):`). – ekhumoro Sep 04 '20 at 15:11
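
For reference, here is a minimal sketch of the two suggestions from the comments, using the lists from the question. The helper name `matches_any` and the variables `filtered` and `in_place` are made up for illustration. The second variant deletes by index while walking backwards rather than calling `remove(url)`, which is an assumption on my part about the safest way to apply the back-to-front idea: `remove` deletes the first equal element, which is not necessarily the one currently being visited when the list contains duplicates.

```python
whitelist = ['https://www.test', 'google.com', '?image=']

all_urls = [
    'https://www.example.com', 'https://www.google.com',
    'https://www.example.com', 'https://www.example.com',
    'https://www.example.com', 'https://www.test.com',
    'https://www.example.com', 'https://www.example.com?image=123456',
    'https://www.test.com', 'https://www.example.com?image=lorem',
    'https://www.example.com',
]


def matches_any(url, patterns):
    """Return True if any pattern occurs as a substring of url."""
    return any(p in url for p in patterns)


# Variant 1: build a new list and keep only the non-matching URLs.
filtered = [url for url in all_urls if not matches_any(url, whitelist)]

# Variant 2: modify a list in place, walking from the last index to the
# first so that deleting an element never shifts the ones still to visit.
in_place = list(all_urls)  # copy, so the original stays intact for comparison
for i in reversed(range(len(in_place))):
    if matches_any(in_place[i], whitelist):
        del in_place[i]

print(filtered)
print(in_place == filtered)
```

With the data above, both variants leave only the six plain `https://www.example.com` entries. The list comprehension is usually the simpler choice unless other code holds a reference to the same list object and needs to see the change.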
