I have the following strings:

http://google.com/q=search<p>dfgdfg</p> 
https://www.google.com
http://www.google.com
www.google.com

My regex looks like this:

/(((https?:\/\/)|(www\.))()[^\s]+)/g

How can I exclude the <p>dfgdfg</p> from the match so that only the actual URLs are selected?

I know how to match the tags on their own with the pattern below, but I want to combine it with my regex, so I need it negated:

<\/?(p)\b[^<>]*>

Here is a playground: https://regex101.com/r/4OlCyb/1

user3369579
  • [Possible duplicate](https://stackoverflow.com/questions/406230/regular-expression-to-match-a-line-that-doesnt-contain-a-word) – Ders Feb 17 '21 at 06:49

1 Answer

You were close. Add every character that must not appear in a link to the negated character class [^\s]+. Here < is added, so the match stops before the first tag:

'http://google.com/?q=search<p>dfgdfg</p>'.match(/(?:https?:\/\/|www\.)[^\s<]+/)

Matches:

http://google.com/?q=search

Add further characters to the class [^\s<]+ as needed.
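To apply the same idea to all four strings from the question at once, here is a minimal sketch. The helper name extractUrls is hypothetical, and excluding > and " in addition to < is an assumption about which characters should end a link, not part of the original pattern.

```javascript
// Hypothetical helper: collect URL-like substrings from a block of text.
function extractUrls(text) {
  // The negated class stops the match at whitespace, angle brackets,
  // or a double quote (the last two are assumptions; adjust as needed).
  const urlPattern = /(?:https?:\/\/|www\.)[^\s<>"]+/g;
  // With the g flag, match() returns every match, or null if there are none.
  return text.match(urlPattern) || [];
}

// The four sample strings from the question, one per line.
const sample = [
  'http://google.com/q=search<p>dfgdfg</p>',
  'https://www.google.com',
  'http://www.google.com',
  'www.google.com',
].join('\n');

const urls = extractUrls(sample);
console.log(urls);
```

The non-capturing group (?:...) keeps the result free of the extra capture groups in the original regex, and the g flag makes match() return all links rather than only the first.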

Peter Thoeny