
I have a text fragment:

.....https://www.one.com/privacy/\............http://two.com/terms/'.............https://three.com/pricing/\..........https://four.com/widget/wg74ythx;.........http://five.com/pricing .........

My code for extracting web links:

    link = re.compile(r'https?://(\w.*?)(\\|;|\'|\s)')

But I need to exclude from my results all links that contain the words "privacy" or "widget". I'm stuck here and would appreciate the community's help.

eyllanesc
Manul

1 Answer


If you don't need a compiled pattern object, you could filter the matches in a list comprehension, something like:

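    import re

    s = mystring  # the text fragment to search
    # url[0] is the first capture group: the link without its scheme
    urls = [url[0] for url in re.findall(r'https?://(\w.*?)(\\|;|\'|\s)', s)
            if not re.search('privacy|widget', url[0])]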
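If you do want compiled patterns, or want the full link including the scheme, the same idea could be sketched as follows. This is only one possible variant, not the original code: the names `LINK`, `EXCLUDED`, `extract_links`, and `text` are made up for illustration, and the pattern is changed to a negated character class, so (unlike the original) it also matches a link that ends the string without a trailing delimiter.

```python
import re

# Sample text modelled on the fragment in the question (dots stand for filler).
text = (
    r".....https://www.one.com/privacy/\............http://two.com/terms/'"
    r".............https://three.com/pricing/\..........https://four.com/widget/wg74ythx;"
    r".........http://five.com/pricing ........."
)

# Assumption: a link runs until the first backslash, semicolon,
# single quote, or whitespace character (the delimiters from the question).
LINK = re.compile(r"https?://\w[^\\;'\s]*")

# Words that disqualify a link.
EXCLUDED = re.compile(r"privacy|widget")


def extract_links(s):
    """Return every link in s that contains none of the excluded words."""
    return [url for url in LINK.findall(s) if not EXCLUDED.search(url)]


print(extract_links(text))
# ['http://two.com/terms/', 'https://three.com/pricing/', 'http://five.com/pricing']
```

Filtering after matching keeps the link pattern itself simple; folding the exclusion into the regex with a negative lookahead is also possible but tends to be harder to read.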

Mose Wintner