I need a regex to match "Google search" from <a title="Google search" href="http://google.com">Google</a>.

Here is the link to regexr.com.

I need it to match only inside <a> tags. I don't excel at regex, but I do know that look-behinds are impossible in JavaScript. I need it to somehow look behind and check that title=".+" comes after <a>.

Here are a few regular expressions that I put together:

This expression kinda works, but it picks up title="" in <img>. Also, it picks up title= in <a>, when I only want "Google search" and "Microsoft home".

/((title=".+")(?=\s*href))|(title=".+")/igm;

These expressions remove the title= like I want, but they also include trailing whitespace (\s) at the end of the match.

/(?!title=)".+"\s+/igm; AND /(?!title)".+"\s+\b/igm;

In conclusion, given the above HTML, I want it to ONLY match "Google search" and "Microsoft home" (I don't want it to include the title= nor match title="..." in <img/>)


EDIT:

This regular expression I was working on ONLY matches the first <a> title:

/(?!<a\s+title\=)("[^"]+")(?=\s*href)/igm;

Matthew
  • [A must read](http://blog.stevenlevithan.com/archives/mimic-lookbehind-javascript). But why don't you simply use the DOM? – HamZa Jul 01 '14 at 21:44
  • So you want to select the whole link? Would you please post your HTML as well? I prefer https://www.debuggex.com/ for testing; you could try your tests there. – Dwza Jul 01 '14 at 21:46

1 Answer

This regex:

/<a[^>]+title=(["'])(Google search|Microsoft home)\1/ig

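Captures ONLY Google search or Microsoft home, and only in <a> tags. The match itself includes the start of the tag, but don't fret! The title text is captured in the second capture group (the first group holds the quote character). You can refer to it in JavaScript as \2 inside the pattern, as $2 in a replacement string, or as index 2 of the match array.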
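As a rough sketch of how the second capture group could be read out in practice, the snippet below loops over the matches with `exec`. Note the assumptions: the sample markup is made up (the question's HTML was only shown in a screenshot), `anchorTitles` is a hypothetical helper name, and the pattern is a generalized variant of the one above that captures any title value instead of the two hard-coded ones. It would still be fooled by attributes such as `data-title`, so for real pages the DOM approach suggested in the comments is probably safer.

```javascript
// Hypothetical sample markup, standing in for the HTML from the question.
const html = [
  '<a title="Google search" href="http://google.com">Google</a>',
  '<img title="Logo" src="logo.png"/>',
  '<a title="Microsoft home" href="http://microsoft.com">Microsoft</a>',
].join("\n");

// Generalized variant of the pattern above (an assumption, not the answer's exact regex):
// group 1 is the quote character, group 2 is the title text.
const titleInAnchor = /<a\s[^>]*?title=(["'])(.*?)\1/gi;

// Hypothetical helper: collect the title text of every <a> tag in `source`.
function anchorTitles(source) {
  const titles = [];
  let match;
  titleInAnchor.lastIndex = 0; // reset, because a global regex keeps state between calls
  while ((match = titleInAnchor.exec(source)) !== null) {
    titles.push(match[2]); // group 2 holds the title without the quotes
  }
  return titles;
}

console.log(anchorTitles(html)); // [ 'Google search', 'Microsoft home' ]
```

Requiring whitespace right after `<a` keeps the pattern from matching other tags that merely start with the letter a, and the lazy quantifiers stop at the first closing quote of the title.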

mareoraft