We use the following regular expression to convert URLs in text into links, shortening them with an ellipsis in the middle if they are too long:
/**
 * Replace all links with <a> tags (shortening them if needed)
 */
$match_arr[] = '/((http|ftp)+(s)?:\/\/[^<>\s,!\)]+)/ie';
$replace_arr[] = "'<a href=\"\\0\" title=\"\\0\" target=\"_blank\">' . " .
"( mb_strlen( '$0' ) > {$maxlength} ? mb_substr( '$0', 0, " . ( $maxlength / 2 ) . " ) . '…' . " .
"mb_substr( '$0', -" . ( $maxlength / 2 ) . " ) : '$0' ) . " .
"'</a>'";
This works. However, I found that if the text already contains a link, like:
$text = '... <a href="http://www.google.com">http://www.google.com</a> ...';
it will match both URLs, so it will try to create two more <a> tags, totally messing up the DOM of course.
How can I prevent the regex from matching if the link is already inside an <a> tag? The URL will also appear in the title attribute, so basically I just want to skip every <a> tag completely.
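One possible direction, sketched here under assumptions rather than offered as a definitive fix: let the pattern consume a complete existing <a>...</a> element as its first alternative and return it unchanged, so that only bare URLs reach the second alternative. The sketch uses preg_replace_callback instead of the /e modifier, which was deprecated in PHP 5.5 and removed in PHP 7.0. The function name linkify_skipping_anchors is hypothetical, $maxlength mirrors the variable in the snippet above, and the text is assumed to be HTML already, so nothing is escaped (as in the original). It also assumes that existing anchors have no ">" inside attribute values; a DOM parser would be more robust for arbitrary markup.

```php
<?php
// Hypothetical helper (not from the original code): wraps bare URLs in <a> tags
// and leaves existing <a>...</a> elements untouched. Requires the mbstring extension.
function linkify_skipping_anchors(string $text, int $maxlength = 40): string
{
    // Alternative 1: a whole existing anchor element, matched so it can be skipped.
    // Alternative 2 (group 1): a bare URL, using the original character class.
    $pattern = '~<a\b[^>]*>.*?</a>|((?:https?|ftp)://[^<>\s,!)]+)~is';

    $result = preg_replace_callback(
        $pattern,
        function (array $m) use ($maxlength): string {
            if (!isset($m[1])) {
                // Group 1 did not take part: this is an existing anchor, keep it as-is.
                return $m[0];
            }
            $url  = $m[1];
            $half = intdiv($maxlength, 2);
            // Shorten long URLs in the middle, as the original replacement does.
            $label = mb_strlen($url) > $maxlength
                ? mb_substr($url, 0, $half) . '…' . mb_substr($url, -$half)
                : $url;
            return '<a href="' . $url . '" title="' . $url . '" target="_blank">'
                . $label . '</a>';
        },
        $text
    );

    // preg_replace_callback returns null on a PCRE error; fall back to the input.
    return $result ?? $text;
}
```

With the example above, linkify_skipping_anchors('... <a href="http://www.google.com">http://www.google.com</a> ...') should return the string unchanged, while a bare URL elsewhere in the text is still wrapped and shortened.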