Possible Duplicate:
convert url to links from string except if they are in a attribute of a html tag

I'm looking for help with a PHP regular expression.

I'm creating an input page for an invitation-only blog, so I don't need to worry about spammers. I want to make it simple for people to add a URL, but I also want to make it possible for them to use HTML mark-up if that's what makes them happy.

In the example below, the $text variable contains three links. I want to create <a > ... </a> tags around the first 2, but the third already has these tags, so I want to leave it alone. My regular expression works well for the last 2 cases, but not the first one.

My regex starts with [^<a href *?= *?\'\"], which I want to mean "Don't create a match if the string starts with <a href='> (or similar)", but that's not how it works in practice. Here, the ^ does not negate the whole <a href= sequence the way I expected.
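
For reference, a bracket expression that begins with ^ is a negated character class in PCRE: it matches exactly one character that is not in the listed set, and it does not reject the sequence <a href= as a whole. A minimal check, assuming only the bracketed prefix of the pattern is tested in isolation (the variable names are made up for illustration):

```php
<?php
// The bracketed prefix from the pattern, tested on its own.
// The set contains: < a space h r e f * ? = ' "
$class = '/[^<a href *?= *?\'\"]/';

// Characters listed inside the brackets are rejected...
$spaceMatches = preg_match($class, ' ');    // a space is in the set
$quoteMatches = preg_match($class, "'");    // a single quote is in the set

// ...while any other single character is accepted and consumed.
$newlineMatches = preg_match($class, "\n"); // a newline is not in the set

var_dump($spaceMatches, $quoteMatches, $newlineMatches); // int(0) int(0) int(1)
```

This would explain the results described: the first URL follows a space (in the set, so no match), the second follows a line break (not in the set, so it matches, and the line break becomes part of $0), and the third follows a single quote (in the set, so it is skipped).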

I would like the output to appear something like this:

Visit <a ...>http://www.example.com/</a> for more info.

<a ...>http://www.example.com/index.php?q=regex</a>

Here is a <i><a ...>link</a> to visit</i>.

Thanks in advance for any help with rewriting the regex.

James

<?php
$text = "Visit http://www.example.com/ for more info.

http://www.example.com/index.php?q=regex

Here is a <i><a href='http://www.google.ca/search?q=%22php+regex%22&hl=en'>link</a> to visit</i>.";

// Ignore fully qualified links but detect bare URLs...
$pattern = '/[^<a href *?= *?\'\"](ftp|https?):\/\/[\da-z\.-]+\.[a-z\.]{2,6}[\/\.\?\w\d&%=+-]*\/?/i';

// ... and replace them with links to themselves
$replacement = "<a href='$0'>$0</a>";

$output = preg_replace($pattern, $replacement, $text);

// Change line breaks to <p>...</p>...
$output = str_replace("\n", "", $output);
$output = "<p>".str_replace("\r", "</p><p>", $output)."</p>";

// Allow blank lines
$output = str_replace("<p></p>", "<p>&nbsp;</p>", $output);

// Split the paragraphs logically in the HTML
$output = str_replace("</p><p>", "</p>\r<p>", $output);

echo $output;
?>
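
One possible direction, offered as a sketch and not a definitive fix: instead of trying to negate <a href= in front of the URL, let the pattern also match whole <a>...</a> elements and any other tags, and only rewrite the matches that turn out to be bare URLs. The function name linkify_bare_urls and the simplified URL sub-pattern are assumptions for illustration; they are not part of the original script.

```php
<?php
// Hypothetical helper: wraps bare URLs in <a> tags and leaves existing
// anchors and tag attributes untouched.
function linkify_bare_urls(string $text): string
{
    $pattern = '~
        <a\b[^>]*>.*?</a>                 # an existing anchor element, kept as is
      | <[^>]+>                           # any other tag, including its attributes
      | ((?:ftp|https?)://[^\s<>"\']+)    # a bare URL (capture group 1)
    ~isx';

    return preg_replace_callback($pattern, function (array $m): string {
        // Group 1 is only present when the third alternative matched.
        if (!isset($m[1]) || $m[1] === '') {
            return $m[0]; // anchor or other tag: return it unchanged
        }
        return "<a href='" . $m[1] . "'>" . $m[1] . "</a>";
    }, $text);
}

// Small demonstration with one bare URL and one existing link.
$sample = "Visit http://www.example.com/ for more info. "
        . "Here is a <i><a href='http://www.google.ca/'>link</a> to visit</i>.";
echo linkify_bare_urls($sample), "\n";
```

Because nothing is consumed in front of the URL, the replacement no longer swallows a preceding character. Known limits of this sketch: a URL immediately followed by sentence punctuation (such as a trailing period) would include that character, and malformed or deeply nested HTML is better handled with a DOM parser.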
James Newton
  • 1
    In your regex, `[^ – Blender Dec 25 '12 at 04:34
  • 1
    And possible duplicate of [Need a good regex to convert URLs to links but leave existing links alone](http://stackoverflow.com/questions/287144/need-a-good-regex-to-convert-urls-to-links-but-leave-existing-links-alone) * But see also: [regex keyword outside HTML tag – mario Dec 25 '12 at 04:36
