Possible Duplicate:
convert url to links from string except if they are in a attribute of a html tag
I'm looking for help with a PHP regular expression.
I'm creating an input page for an invitation-only blog, so I don't need to worry about spammers. I want to make it simple for people to add a URL, but I also want to make it possible for them to use HTML mark-up if that's what makes them happy.
In the example below, the $text variable contains three links. I want to create <a> ... </a> tags around the first two, but the third already has these tags, so I want to leave it alone. My regular expression works well for the last two cases, but not the first one.
My regex starts with [^<a href *?= *?\'\"], which I want to mean "Don't create a match if the string starts with <a href='>" (or similar), but that's not how it works in practice. Here, the ^ seems to behave as a "start of line" character rather than as a negator.
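For what it's worth, a quick check suggests that inside square brackets a leading ^ negates the character class, and the class then has to match (and consume) exactly one character in front of the URL. The sketch below runs the pattern from my code against three small inputs; the variable names ($afterSpace and so on) are made up for this illustration.

```php
<?php
// Pattern copied verbatim from the code in this question.
$pattern = '/[^<a href *?= *?\'\"](ftp|https?):\/\/[\da-z\.-]+\.[a-z\.]{2,6}[\/\.\?\w\d&%=+-]*\/?/i';

// URL preceded by a space: the space is one of the characters listed
// inside the brackets, so the negated class refuses it and nothing matches.
$afterSpace = preg_match($pattern, 'Visit http://www.example.com/ for more info.');

// URL preceded by a newline: the newline is not listed, so the class
// accepts it, but the newline becomes part of the match ($m[0]).
$afterNewline = preg_match($pattern, "\nhttp://www.example.com/", $m);

// URL at the very start of the subject: there is no character for the
// class to consume, so nothing matches.
$atStart = preg_match($pattern, 'http://www.example.com/');

var_dump($afterSpace, $afterNewline, $atStart, $m[0]);
```

If that reading is right, it would explain why the first link (preceded by a space) is skipped while the second one (preceded by a line break) is picked up.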
I would like the output to appear something like this:
Visit <a ...>http://www.example.com/</a> for more info.
<a ...>http://www.example.com/index.php?q=regex</a>
Here is a <i><a ...>link</a> to visit</i>.
Thanks in advance for any help with rewriting the regex.
James
<?php
$text = "Visit http://www.example.com/ for more info.
http://www.example.com/index.php?q=regex
Here is a <i><a href='http://www.google.ca/search?q=%22php+regex%22&hl=en'>link</a> to visit</i>.";
// Ignore fully qualified links but detect bare URLs...
$pattern = '/[^<a href *?= *?\'\"](ftp|https?):\/\/[\da-z\.-]+\.[a-z\.]{2,6}[\/\.\?\w\d&%=+-]*\/?/i';
// ... and replace them with links to themselves
$replacement = "<a href='$0'>$0</a>";
$output = preg_replace($pattern, $replacement, $text);
// Change line breaks to <p>...</p> (relies on "\r" being present, i.e. CRLF or CR line endings)...
$output = str_replace("\n", "", $output);
$output = "<p>".str_replace("\r", "</p><p>", $output)."</p>";
// Allow blank lines
$output = str_replace("<p></p>", "<p> </p>", $output);
// Split the paragraphs logically in the HTML
$output = str_replace("</p><p>", "</p>\r<p>", $output);
echo $output;
?>
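One direction that might work is a negative lookbehind in place of the leading bracket expression. This is only a sketch under the assumption that a URL already inside markup is always immediately preceded by a quote, an equals sign, or a closing angle bracket; it is not a full HTML-aware solution. The $linked variable and the ~ delimiter are my own choices for the illustration.

```php
<?php
// Sample text from the question.
$text = "Visit http://www.example.com/ for more info.
http://www.example.com/index.php?q=regex
Here is a <i><a href='http://www.google.ca/search?q=%22php+regex%22&hl=en'>link</a> to visit</i>.";

// (?<![\'"=>]) is a zero-width negative lookbehind: it only checks that the
// character before the URL is not ', ", = or >, without consuming it.
// \b keeps the match from starting in the middle of a word.
$pattern = '~(?<![\'"=>])\b(?:ftp|https?)://[\da-z.-]+\.[a-z.]{2,6}[/.?\w&%=+-]*~i';

// Wrap each bare URL in a link to itself.
$linked = preg_replace($pattern, '<a href="$0">$0</a>', $text);

echo $linked;
```

With the sample text, this wraps the first two URLs and leaves the existing link alone. Known limitation of the sketch: a full stop directly after a bare URL would be swallowed into the link, because "." is allowed in the trailing character class.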