I need a regex to match "Google search" from <a title="Google search" href="http://google.com">Google</a>.
Here is the link to regexr.com.
I need it to match only inside <a> tags. I'm not great at regex, but as far as I know, JavaScript doesn't support look-behinds. I need it to somehow look behind and check that title=".+" comes after <a>.
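For reference, engines that implement ES2018 do accept look-behind assertions, so the idea above can be sketched directly. The sample markup is an assumption: only the first <a> is from the question, while the <img> and the Microsoft URL are made up for illustration. The pattern will not compile on older engines.

```javascript
// Hypothetical sample input; only the first <a> appears in the question.
const html = [
  '<a title="Google search" href="http://google.com">Google</a>',
  '<img title="Logo" src="logo.png"/>',
  '<a title="Microsoft home" href="http://microsoft.com">Microsoft</a>',
].join("\n");

// (?<= ... )  look-behind: the text before the match must end with
//   <a\b      an opening <a tag (\b rules out <abbr, <area, ...)
//   [^>]*     any other attributes, without leaving the tag
//   \stitle=  the title attribute itself
// "[^"]*"     the quoted value, which is all that gets matched
const titleAfterAnchor = /(?<=<a\b[^>]*\stitle=)"[^"]*"/gi;

const titles = html.match(titleAfterAnchor);
console.log(titles);
```

Because the look-behind consumes nothing, title= is checked but never included in the match, and the <img> title is skipped since it is not preceded by <a.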
Here are a few regular expressions that I put together.
This expression kind of works, but it also picks up title="" in <img>. It also includes the title= part in <a>, when I only want "Google search" and "Microsoft home".
/((title=".+")(?=\s*href))|(title=".+")/igm;
These expressions remove the title= like I want, but they also include the trailing whitespace (\s) at the end of the match.
/(?!title=)".+"\s+/igm; AND /(?!title)".+"\s+\b/igm;
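A small sketch of why the whitespace ends up in the match: \s+ sits in the consuming part of the pattern, so whatever it matches is returned as part of the result. Moving it into a lookahead checks for the whitespace without including it. The variable names are hypothetical, and this variant still does not restrict the match to <a> tags.

```javascript
// The single tag from the question, used as test input.
const tag = '<a title="Google search" href="http://google.com">Google</a>';

// First pattern from above (without g/m, to inspect one match).
// The (?!title=) lookahead is checked where the opening quote starts,
// so it never rejects anything; \s+ then consumes the space after the value.
const withWhitespace = /(?!title=)".+"\s+/i;
const consumed = tag.match(withWhitespace)[0];

// Same idea with the whitespace test moved into a lookahead,
// and [^"]+ instead of .+ so the value stops at its closing quote.
const withoutWhitespace = /"[^"]+"(?=\s)/i;
const trimmed = tag.match(withoutWhitespace)[0];

console.log(JSON.stringify(consumed));
console.log(JSON.stringify(trimmed));
```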
In conclusion, given the above HTML, I want it to match ONLY "Google search" and "Microsoft home". I don't want it to include the title= part, nor to match title="..." in <img/>.
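One way to get this result without any look-behind is to match the whole prefix and read the value from a capture group. This is only a sketch under assumptions: the sample markup is hypothetical apart from the first <a>, anchorTitles is a made-up helper name, and a regex like this is not a general HTML parser.

```javascript
// Hypothetical sample input; only the first <a> appears in the question.
const html = [
  '<a title="Google search" href="http://google.com">Google</a>',
  '<img title="Logo" src="logo.png"/>',
  '<a title="Microsoft home" href="http://microsoft.com">Microsoft</a>',
].join("\n");

// <a\b       opening <a tag (\b rules out <abbr, <area, ...)
// [^>]*?     any attributes that come before title, inside the same tag
// \stitle=   the title attribute, preceded by whitespace
// ("[^"]*")  capture group 1: the quoted value only
const titleInAnchor = /<a\b[^>]*?\stitle=("[^"]*")/gi;

// Hypothetical helper: returns group 1 of every match.
function anchorTitles(source) {
  return Array.from(source.matchAll(titleInAnchor), (m) => m[1]);
}

console.log(anchorTitles(html));
```

The full match still contains <a ... title=, but only group 1 is returned, so the result holds just the quoted values and the <img> title is never considered.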
EDIT:
This regular expression I was working on matches ONLY the first <a> title:
/(?!<a\s+title\=)("[^"]+")(?=\s*href)/igm;