We have a regex that wraps keywords in a <strong> tag, unless they already sit inside one of a fixed set of tags, which it detects by looking ahead for the closing tag. This has always worked nicely:
foreach ($keywords as $keyword) {
    $str = preg_replace("/(?!(?:[^<]+>|[^>]+(<\/strong>|<\/a>|<\/b>|<\/i>|<\/u>|<\/em>)))\b(" . preg_quote($keyword, "/") . ")\b/is", "<strong>\\2</strong>", $str, 1);
}
So if the keyword was "test", this would change:
A test line
to:
A <strong>test</strong> line
but this would not change:
<a href="">A test line</a>
As you can see, the list of closing tags we want it to ignore is hard-coded in the regex.
We have encountered a problem with a string that looks like:
<a href="">A test <em>line</em></a>
It's not recognising the closing </a> (or the </em>, for that matter), so it's coming out as:
<a href="">A <strong>test</strong> <em>line</em></a>
We don't want it to do that. Can anyone see a fix for this? (And yes, I am aware of the "don't parse HTML with regex" rule, so posting links to that infamous post is not an answer ;-))
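
In case it helps, here is a minimal self-contained script that should reproduce all three cases. The function name highlight_keywords and the $skip variable are only there for illustration and are not in our real code; the pattern itself is the same as the one above.

```php
<?php
// Hypothetical wrapper around the loop above; the regex is unchanged.
function highlight_keywords(string $str, array $keywords): string
{
    // Closing tags that should stop a keyword from being wrapped.
    $skip = '<\/strong>|<\/a>|<\/b>|<\/i>|<\/u>|<\/em>';

    foreach ($keywords as $keyword) {
        $pattern = '/(?!(?:[^<]+>|[^>]+(' . $skip . ')))\b('
                 . preg_quote($keyword, '/') . ')\b/is';
        // Group 1 is the closing tag inside the lookahead, group 2 is the keyword.
        $str = preg_replace($pattern, '<strong>\\2</strong>', $str, 1);
    }

    return $str;
}

$keywords = ['test'];

// Plain text: the keyword is wrapped, as intended.
echo highlight_keywords('A test line', $keywords), PHP_EOL;
// A <strong>test</strong> line

// Keyword directly inside <a>: left alone, as intended.
echo highlight_keywords('<a href="">A test line</a>', $keywords), PHP_EOL;
// <a href="">A test line</a>

// Nested <em> after the keyword: wrapped, which is the problem.
echo highlight_keywords('<a href="">A test <em>line</em></a>', $keywords), PHP_EOL;
// <a href="">A <strong>test</strong> <em>line</em></a>
```

The only difference between the second and third inputs is the opening <em> tag that appears between the keyword and the closing </a>.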