I am trying to match <a> tags that have some text preceding them. A few samples:

<p> some text here <a href="#">and here</a></p> <!--want match--><br/>
<p> some text here and number 55 <a href="#">and here</a>  </p> <!--want match--><br/>
<p>  <a href="#">and here</a></p> <!--do not want match--><br/>

Now when I use this regex

>[\w,.-_|]+<a (.*?)</a>\s*<

I do not get a match on any of these. However, this regex

>[\s\w,.-_|]+<a (.*?)<\/a>\s*<

matches all 3, whereas I want only the first 2 to match.

The problem here is the "\s" whitespace. I don't mind whitespace between text, but if there is only whitespace and no text, there should be no match.
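To make the behaviour above reproducible, here is a minimal Python sketch that runs both patterns against the three samples. It assumes Python's `re` module and drops the trailing HTML comments from the samples. The names `strict` and `loose` are only illustrative.

```python
import re

samples = [
    '<p> some text here <a href="#">and here</a></p>',
    '<p> some text here and number 55 <a href="#">and here</a>  </p>',
    '<p>  <a href="#">and here</a></p>',
]

# First attempt: the character class contains no whitespace, so the
# space directly before "<a " prevents a match in every sample.
strict = re.compile(r'>[\w,.-_|]+<a (.*?)</a>\s*<')

# Second attempt: \s is in the class, so a run made only of whitespace
# also satisfies the "+" quantifier, and the third sample matches too.
loose = re.compile(r'>[\s\w,.-_|]+<a (.*?)<\/a>\s*<')

strict_hits = [bool(strict.search(s)) for s in samples]
loose_hits = [bool(loose.search(s)) for s in samples]

print(strict_hits)  # [False, False, False]
print(loose_hits)   # [True, True, True]
```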

How can I do that?

Rahul Tripathi
maX
  • **Don't**: http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags – Jan Feb 12 '16 at 07:42

1 Answer

I'm not entirely sure what you're after but this will match the first 2 only:

>[^>]*\w[^>]*(<a [^<]*<\/a>)

This matches any anchor tag that is preceded, after the closing > of the previous tag, by text containing at least one word character (\w). Whitespace alone between the two tags is not enough. Please clarify the question with some more examples if this isn't the intended result.
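As a quick check of the pattern above, the following sketch runs it against the three samples from the question. It assumes Python's `re` module. The variable names are illustrative, and the trailing HTML comments are left out of the samples.

```python
import re

samples = [
    '<p> some text here <a href="#">and here</a></p>',
    '<p> some text here and number 55 <a href="#">and here</a>  </p>',
    '<p>  <a href="#">and here</a></p>',
]

# ">" closes the previous tag; [^>]*\w[^>]* requires at least one word
# character before the anchor without crossing another ">".
pattern = re.compile(r'>[^>]*\w[^>]*(<a [^<]*<\/a>)')

results = []
for sample in samples:
    match = pattern.search(sample)
    # Group 1 holds the anchor tag itself when the sample matches.
    results.append(match.group(1) if match else None)

print(results)
# ['<a href="#">and here</a>', '<a href="#">and here</a>', None]
```

Note that `[^<]*` inside the group means an anchor containing nested tags (for example `<a href="#"><b>x</b></a>`) would not be captured by this pattern.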

Edit: removed the redundant \s before the group.
Edit: changed .* to [^>]* so the match does not skip over tags.

Ed'