
I need a regex pattern to find a URL in the HTML of web pages.

For example, I would like to search for this domain: domainurl.com

These are the links, with their tags, that contain it:

<a href="https://www.domainurl.com/refer/google-adsense/">fsdf</a>
<a title="Google Adsense" href="https://www.domainurl.com/refer/google-adsense/" target="_blank" rel="nofollow noopener">fgddf</a>
<a href="https://www.domainurl.com/page/pago">domain </a>

I am using this regex:

<a.*?[^>]* href="((https?:\/\/)?([\w\-])+\.{1}domainurl\.([a-z]{2,6})([\/\w\.-]*)*\/?)"

How can I match only this tag, the one that has target="_blank" rel="nofollow noopener"?

<a title="Google Adsense" href="https://www.domainurl.com/refer/google-adsense/" target="_blank" rel="nofollow noopener">fgddf</a>

Is there a regex that also matches target="_blank" and rel="nofollow noopener"?

This is what I have so far: https://regexr.com/49hne

juan

1 Answer

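For the complete URL, using a positive lookbehind (the lookbehind is variable-length, so it needs an engine that allows that, such as JavaScript or .NET; PCRE and Python's re do not):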

(?<=\<a.*?href=\")(.*?\..*?\.[a-z]+)

DEMO

For only domainurl.com, using a positive lookbehind:

(?<=\<a.*?www\.)([a-z]+\.[a-z]+)

DEMO2
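Where variable-length lookbehind is not available, the same two results can be had with plain capture groups. The following is a minimal sketch in Python, not the answer's own code: the names `HREF_RE` and `find_links` are made up for illustration, and it assumes double-quoted href values as in the question's sample.

```python
import re

# Sample anchors copied from the question.
HTML = '''<a href="https://www.domainurl.com/refer/google-adsense/">fsdf</a>
<a title="Google Adsense" href="https://www.domainurl.com/refer/google-adsense/" target="_blank" rel="nofollow noopener">fgddf</a>
<a href="https://www.domainurl.com/page/pago">domain </a>'''

# Group 1: the whole href value.
# Group 2: the host name, with an optional leading "www." left out.
HREF_RE = re.compile(
    r'<a\b[^>]*?\bhref="((?:https?://)?(?:www\.)?([\w.-]+)[^"]*)"',
    re.IGNORECASE,
)

def find_links(html):
    """Return (url, domain) pairs for every <a ... href="..."> found in html."""
    return [(m.group(1), m.group(2)) for m in HREF_RE.finditer(html)]

links = find_links(HTML)
for url, domain in links:
    print(domain, url)
```

Capture groups keep the pattern portable across engines, at the cost of reading the result from a group instead of from the whole match.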

For target="_blank" and rel="nofollow noopener":

target.*?\".*\"

DEMO3
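The pattern above matches everything from target up to the last double quote on the line, so both attributes come back as one string. If the two values are wanted separately, something along these lines may help. It is only a sketch: `ATTR_RE` and `target_and_rel` are hypothetical names, and it assumes the input is a single tag with double-quoted attribute values.

```python
import re

# The tag the question wants to single out.
TAG = ('<a title="Google Adsense" '
       'href="https://www.domainurl.com/refer/google-adsense/" '
       'target="_blank" rel="nofollow noopener">fgddf</a>')

# Match either attribute name, then capture its double-quoted value.
ATTR_RE = re.compile(r'\b(target|rel)\s*=\s*"([^"]*)"')

def target_and_rel(tag):
    """Return a dict holding the target and rel values present in one tag."""
    return dict(ATTR_RE.findall(tag))

attrs = target_and_rel(TAG)
print(attrs)
```

Because this looks at raw text, it would also pick up `target="..."` or `rel="..."` appearing inside another attribute's value; for arbitrary pages an HTML parser is the safer tool.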

For domainurl.com together with target="_blank" and rel="nofollow noopener":

DEMO4
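The pattern behind that demo is not reproduced here. One possible way to combine the three conditions, as a sketch and not necessarily what the demo uses, is to put a lookahead per attribute right after `<a` so that attribute order does not matter. The names `LINK_RE` and `find_blank_nofollow_links` are made up, and the sketch assumes rel is exactly "nofollow noopener" and all values are double-quoted.

```python
import re

# Sample anchors copied from the question.
HTML = '''<a href="https://www.domainurl.com/refer/google-adsense/">fsdf</a>
<a title="Google Adsense" href="https://www.domainurl.com/refer/google-adsense/" target="_blank" rel="nofollow noopener">fgddf</a>
<a href="https://www.domainurl.com/page/pago">domain </a>'''

LINK_RE = re.compile(
    r'<a\b'
    r'(?=[^>]*\btarget="_blank")'           # tag must contain target="_blank"
    r'(?=[^>]*\brel="nofollow noopener")'   # tag must contain this exact rel
    # Group 1: href pointing at domainurl.<tld>, with optional scheme and subdomains.
    r'[^>]*?\bhref="((?:https?://)?(?:[\w-]+\.)*domainurl\.[a-z]{2,6}(?:/[^"]*)?)"'
    r'[^>]*>',
    re.IGNORECASE,
)

def find_blank_nofollow_links(html):
    """Return hrefs on domainurl.* whose tag has target="_blank" and rel="nofollow noopener"."""
    return [m.group(1) for m in LINK_RE.finditer(html)]

matches = find_blank_nofollow_links(HTML)
print(matches)
```

Each lookahead is limited to `[^>]*`, so it cannot look past the end of the current tag and accidentally accept an attribute from a later link.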

Mohammed Elhag