This problem feels basic and I must be overlooking something obvious. There are many related posts on Stack Overflow, but nothing I have found quite covers this use case.

I have two lists, one with substrings (URL scheme prefixes) and one with URLs:

list1 = ['https://', 'http://', 'woof://', 'meow://']
list2 = ['https://google.com', 'stackoverflow.com', 'meow://test.net', 'yahoo.com']

I want to create a third list containing every string from list2, with any substring from list1 removed.

For example - list3 = ['google.com', 'stackoverflow.com', 'test.net', 'yahoo.com']

I have tried:

list3 = []
for x in list1:
    for y in list2:
        if x in y:
            list3.append(y.replace(x, ''))
        else:
            list3.append(y)

This creates a list with a lot of duplicates. I could probably add logic to clean list3 up but I feel as though there must be a much more pythonic way to do this.

I feel like this post is close to what I am looking for but not quite there.

Joe
  • You can use regex. `[re.sub("|".join(list1), "", x).strip(":/") for x in list2]` should work for your case. I'm sure there's a dupe... – pault May 20 '19 at 15:30
  • Note: you are removing more than just the substrings in `list1`. None of them have `"//"` but you are removing that anyway. Do you want to parse the strings in `list2` as URLs and remove the `://` prefix if it exists? – Code-Apprentice May 20 '19 at 15:34
  • @pault Not sure how that code works, it's over my head, but it's exactly what I wanted, thanks!! If you post it as an answer I will accept. – Joe May 20 '19 at 15:38
  • simplest way to not have duplicates is to use a set. – Andrew Allen May 20 '19 at 15:39
  • @AndrewAllen - I understand that but the issue is that doesn't help with dups like `google.com` and `https://google.com` – Joe May 20 '19 at 15:42
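
For readers following the regex suggestion in the comments, here is a minimal sketch of that idea. It is a variation, not the commenter's exact code: it assumes the substrings should only be removed when they appear at the start of a string, so the pattern is anchored with `^`, and each prefix is passed through `re.escape` in case it contains regex metacharacters. The names `pattern` and `url` are introduced here for illustration.

```python
import re

list1 = ['https://', 'http://', 'woof://', 'meow://']
list2 = ['https://google.com', 'stackoverflow.com', 'meow://test.net', 'yahoo.com']

# Build one alternation of all prefixes. The leading ^ restricts the match
# to the start of the string (an assumption about the intended behaviour),
# and re.escape protects prefixes that contain regex metacharacters.
pattern = re.compile('^(?:' + '|'.join(map(re.escape, list1)) + ')')

# Each string is processed exactly once, so no duplicates are produced.
list3 = [pattern.sub('', url) for url in list2]
print(list3)  # ['google.com', 'stackoverflow.com', 'test.net', 'yahoo.com']
```

Because the comprehension iterates over list2 only, list3 always has the same length and order as list2.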

1 Answer

You can use a comprehension with functools.reduce:

from functools import reduce
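# str.strip(p) treats p as a set of characters and trims both ends,
# so the prefix is removed explicitly with startswith and slicing instead.
[reduce(lambda s, p: s[len(p):] if s.startswith(p) else s, list1, e) for e in list2]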
['google.com', 'stackoverflow.com', 'test.net', 'yahoo.com']
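
If the one-liner is hard to read, the same result can be sketched with a small helper. This is only an illustration under two assumptions: the substrings are meant to be removed only when they appear as a leading prefix, and at most one prefix should be removed per string. The helper name `strip_first_prefix` is hypothetical and not part of any library.

```python
def strip_first_prefix(text, prefixes):
    """Return text without the first matching prefix, or unchanged if none match."""
    for prefix in prefixes:
        if text.startswith(prefix):
            # Slice off exactly the matched prefix and stop looking.
            return text[len(prefix):]
    return text


list1 = ['https://', 'http://', 'woof://', 'meow://']
list2 = ['https://google.com', 'stackoverflow.com', 'meow://test.net', 'yahoo.com']

list3 = [strip_first_prefix(url, list1) for url in list2]
print(list3)  # ['google.com', 'stackoverflow.com', 'test.net', 'yahoo.com']
```

On Python 3.9 or later, `str.removeprefix` performs the same slicing for a single prefix.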
Netwave