-3

I would very much appreciate it if anyone could explain what is wrong with my regex. I tried it on a regex generator, where it works perfectly, but when I run it on my laptop it prints None. I am given an HTML link and I would like to extract its href (reference). Here is the regex:

r"(?<=\=\").{1,}(?=\W+?\s[t])"

example:

<li id="n-mainpage-description"><a href="/wiki/Main_Page" title="Visit the main page [z]" accesskey="z">Main page</a></li>

error:

MonkeyZeus

1 Answer

1

You can make use of a positive lookbehind to get the contents of an href:

(?<=href=\")[^\"]+
  • (?<=href=\") - make sure an href=" precedes my current position
  • [^\"]+ - match one or more characters that are not a double quote, i.e. everything up to the closing quote

https://regex101.com/r/NDVDNB/1
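Since the question mentions a result of None, here is a minimal Python sketch of the pattern above applied to the example line. It assumes the code is Python using the standard `re` module; the names `HREF_PATTERN` and `extract_href` are made up for illustration. The last line shows one possible cause of a None result (an assumption, since the original code was not posted): `re.match` only tries to match at the start of the string, while `re.search` scans the whole string.

```python
import re

# The example line from the question.
html = ('<li id="n-mainpage-description"><a href="/wiki/Main_Page" '
        'title="Visit the main page [z]" accesskey="z">Main page</a></li>')

# Positive lookbehind for href=" followed by one or more non-quote characters.
# Inside a single-quoted raw string the double quote needs no escaping.
HREF_PATTERN = re.compile(r'(?<=href=")[^"]+')


def extract_href(text):
    """Return the first href value found in text, or None if there is none."""
    found = HREF_PATTERN.search(text)  # search scans the whole string
    return found.group(0) if found else None


print(extract_href(html))        # /wiki/Main_Page
print(HREF_PATTERN.match(html))  # None: match only tries position 0
```

For anything beyond a single known line of markup, an HTML parser is usually more robust than a regex, but for this one-off extraction the pattern is enough.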

MonkeyZeus