This is a hacky solution, but judging by how you are approaching this without worrying about character encoding, you probably just want the damn thing to work.
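First, we convert the tags we want to keep (here <a> and <span>) into hacky BBCode. Then we strip everything outside printable ASCII (the commented-out line runs htmlentities() on it instead). Lastly, we replace the hacky BBCode with good old HTML. Have a look at this: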
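$foo = 'Opening quietly in Chicagos West Loop, the Inspire Business Center is looking to take a more active role in Chicagos startup scene … Continue reading <span class="meta-nav">→</span>';
echo smartencode($foo);

function smartencode($str) {
    // Tags to keep, written as a regex alternation
    $tags = 'a|span';
    // Convert the kept tags to hacky BBCode: <a href="..."> becomes [a href="..."]
    // (\b stops "a" from also matching tags such as <abbr>)
    $ret = preg_replace('/<(\/?(?:' . $tags . ')\b[^>]*)>/i', '[$1]', $str);
    // Remove the so-called garbage: everything outside printable ASCII, line breaks included
    $ret = preg_replace('/[^\x20-\x7E]+/', '', $ret);
    // Alternative: encode instead of stripping (ENT_QUOTES also encodes the quotes inside kept tags)
    // $ret = htmlentities($ret, ENT_QUOTES | ENT_IGNORE, 'UTF-8');
    // Reinstate the kept tags as HTML
    $ret = preg_replace('/\[(\/?(?:' . $tags . ')\b[^\]]*)\]/i', '<$1>', $ret);
    return $ret;
}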
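Again, it's not elegant. In fact, if you look closely you can find some pitfalls in it: for example, literal text such as [span] in the input comes out as a real tag, and line breaks are stripped along with the garbage. Still, I think it could just work for your use case.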
Tested on http://writecodeonline.com/php/ and worked as expected.
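If you would rather encode the non-ASCII characters than drop them, here is a rough sketch of that variant. The function name smartencode_entities and the $tags parameter are made up for illustration, and I have not run this against your real feed, so treat it as a starting point. The only real difference is that the restore step decodes the entities htmlentities() put inside the kept tags, so the attribute quotes survive.

```php
<?php
// Hypothetical variant of smartencode(): encodes non-ASCII text instead of stripping it.
// Assumes the input is UTF-8 and that this file is saved as UTF-8.
function smartencode_entities(string $str, string $tags = 'a|span'): string {
    // 1. Swap the kept tags for bracket placeholders so htmlentities() leaves them alone
    $ret = preg_replace('/<(\/?(?:' . $tags . ')\b[^>]*)>/i', '[$1]', $str);

    // 2. Encode everything else; ENT_SUBSTITUTE replaces invalid byte sequences
    //    instead of making htmlentities() return an empty string
    $ret = htmlentities($ret, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');

    // 3. Turn the placeholders back into tags, undoing the encoding inside them
    //    (otherwise class="x" would come back as class=&quot;x&quot;)
    return preg_replace_callback(
        '/\[(\/?(?:' . $tags . ')\b[^\]]*)\]/i',
        function ($m) {
            return '<' . html_entity_decode($m[1], ENT_QUOTES, 'UTF-8') . '>';
        },
        $ret
    );
}

echo smartencode_entities('Continue reading <span class="meta-nav">→</span>');
```

The bracket pitfall is the same as in the function above: literal text like [a] in the input would still be turned into a tag.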