
I'm trying to convert plain links to HTML links using preg_replace. However, it also replaces links that have already been converted.

To avoid this, I'd like it to skip the replacement if the link is preceded by a quote.

I think a positive lookahead may be needed, but nothing I've tried has worked.

$string = '<a href="http://www.example.com">test</a> http://www.example.com';

$string = preg_replace("/((https?:\/\/[\w]+[^ \,\"\n\r\t<]*))/is", "<a href=\"$1\">$1</a>", $string);

var_dump($string);

The above outputs:

<a href="<a href="http://www.example.com">http://www.example.com</a>">test</a> <a href="http://www.example.com">http://www.example.com</a>

When it should output:

<a href="http://www.example.com">test</a> <a href="http://www.example.com">http://www.example.com</a>

Steven

3 Answers


The Idea

You can split your string at the already existing anchors, and only parse the pieces in between.

The Code

$input = '<a href="http://www.example.com">test</a> http://www.example.com';

// Split the string at existing anchors.
// The third argument is the limit (-1 means "no limit"); the fourth is the flags.
// PREG_SPLIT_DELIM_CAPTURE includes the delimiters (the anchors) in the result set.
$parts = preg_split('/(<a.*?>.*?<\/a>)/is', $input, -1, PREG_SPLIT_DELIM_CAPTURE);

// Use array_map to parse each piece, and then join all pieces together
$output = implode('', array_map(function ($key, $part) {

    // Because the delimiters are returned in the result set,
    // every $part with an odd key is an existing anchor and is left untouched.
    // Only the pieces with an even key (the text in between) are parsed.
    return $key % 2
        ? $part
        : preg_replace("/((https?:\/\/[\w]+[^ \,\"\n\r\t<]*))/is", "<a href=\"$1\">$1</a>", $part);
}, array_keys($parts), $parts));
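
To make the odd/even key logic easier to follow, here is a small sketch that only performs the split on the sample input and prints the pieces. It passes -1 as the limit so that PREG_SPLIT_DELIM_CAPTURE lands in the flags parameter; nothing else is assumed beyond the pattern already used in this answer.

```php
<?php
$input = '<a href="http://www.example.com">test</a> http://www.example.com';

// Third argument: limit (-1 = no limit). Fourth argument: flags.
$parts = preg_split('/(<a.*?>.*?<\/a>)/is', $input, -1, PREG_SPLIT_DELIM_CAPTURE);

print_r($parts);
// Key 0: '' (the empty text before the first anchor)
// Key 1: '<a href="http://www.example.com">test</a>' (the captured anchor)
// Key 2: ' http://www.example.com' (the text after the anchor)
```

Text between anchors always ends up at even keys and the captured anchors at odd keys, which is why only the even-keyed pieces should be passed to preg_replace.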
Kuba Birecki
  • This kind of works but the issue is that the text "test" is replaced by the link's URL. – Steven May 08 '16 at 16:06
  • I updated my answer. By splitting your input string up in pieces, you can leave the existing anchors as they are, and only parse what's in between. – Kuba Birecki May 08 '16 at 16:33

Lookarounds might do the job. They are zero-width assertions that check what does (or does not) appear immediately before or after the current position, without consuming any characters.
That being said, a negative lookbehind might be what you need in your situation:

(?<![">])\bhttps?://\S+\b

In PHP this would be:

<?php
$string = 'I want to be transformed to a proper link: http://www.google.com ';
$string .=  'But please leave me alone ';
$string .= '(<a href="https://www.google.com">https://www.google.com</a>).';

$regex = '~                # delimiter
              (?<![">])    # a negative lookbehind: not preceded by " or >
              https?://\S+ # http:// or https:// followed by one or more non-whitespace characters
              \b           # a word boundary
          ~x';             # verbose to enable this explanation.
$string = preg_replace($regex, "<a href='$0'>$0</a>", $string);
echo $string;
?>

See a demo on ideone.com. However, maybe a parser is more appropriate.
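
As a rough illustration of the parser route, the sketch below uses PHP's DOM extension to visit only text nodes that are not inside an existing anchor. The function name linkify_with_dom and the simplified URL pattern are assumptions made for this example, not part of the answer above; it also assumes ext-dom is enabled and that the input is plain ASCII/Latin-1 (other encodings would need an explicit hint to loadHTML).

```php
<?php
// Hypothetical helper: wraps bare URLs in <a> tags, skipping text inside existing anchors.
function linkify_with_dom(string $html): string
{
    $doc = new DOMDocument();
    libxml_use_internal_errors(true);            // do not emit warnings for sloppy markup
    $doc->loadHTML('<div>' . $html . '</div>');  // the wrapper gives the fragment a single root
    libxml_clear_errors();

    $xpath = new DOMXPath($doc);
    // Snapshot the node list first, because the tree is modified inside the loop.
    $textNodes = iterator_to_array($xpath->query('//text()[not(ancestor::a)]'));

    foreach ($textNodes as $textNode) {
        // Simplified URL pattern (an assumption): stop at whitespace, quotes and angle brackets.
        $pieces = preg_split('~(https?://[^\s<>"]+)~i', $textNode->nodeValue, -1, PREG_SPLIT_DELIM_CAPTURE);
        if (count($pieces) === 1) {
            continue; // no URL in this text node
        }

        $fragment = $doc->createDocumentFragment();
        foreach ($pieces as $i => $piece) {
            if ($piece === '') {
                continue;
            }
            if ($i % 2) {
                // Odd index: a captured URL, so build a new anchor element.
                $a = $doc->createElement('a');
                $a->setAttribute('href', $piece);
                $a->appendChild($doc->createTextNode($piece));
                $fragment->appendChild($a);
            } else {
                // Even index: ordinary text, copied as-is.
                $fragment->appendChild($doc->createTextNode($piece));
            }
        }
        $textNode->parentNode->replaceChild($fragment, $textNode);
    }

    // Serialize only the children of the wrapper <div>.
    $root = $doc->getElementsByTagName('div')->item(0);
    $out  = '';
    foreach ($root->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}

echo linkify_with_dom('<a href="http://www.example.com">test</a> http://www.example.com');
```

Unlike the lookbehind, this does not depend on which character happens to precede the URL, so a URL inside anchor text such as `<a href="...">see http://...</a>` is also left alone. The trade-off is that the parser may normalize the surrounding markup when it serializes it again.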

Jan

Since preg_replace accepts arrays of patterns and replacements, this approach might be convenient, depending on what you want to achieve:

        <?php

        $string = '<a href="http://www.example.com">test</a>    http://www.example.com';
        $rx     = array("&(<a.+https?:\/\/[\w]+[^ \,\"\n\r\t<]*>)(.*)(<\/a\>)&si", "&(\s){1,}(https?:\/\/[\w]+[^ \,\"\n\r\t<]*)&");
        $rp     = array("$1$2$3", "<a href=\"$2\">$2</a>");
        $string = preg_replace($rx,$rp, $string);

        var_dump($string);

        // DUMPS:
        // '<a href="http://www.example.com">test</a><a href="http://www.example.com">http://www.example.com</a>'
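
Note that in the dump above the whitespace between the two anchors is gone, because `(\s){1,}` is part of the match and is not written back. If that matters, one possible variant (a sketch, not the original author's code) captures a single leading whitespace character, or the start of the string, and re-emits it in the replacement:

```php
<?php
$string = '<a href="http://www.example.com">test</a>    http://www.example.com';

// (^|\s) captures the start of the string or ONE whitespace character before the URL;
// $1 puts it back, so the original spacing is preserved.
$string = preg_replace(
    '&(^|\s)(https?://\w+[^ ,"\n\r\t<]*)&i',
    '$1<a href="$2">$2</a>',
    $string
);

var_dump($string);
// The four spaces between the existing anchor and the new one are kept.
```

As in the answer, a URL that directly follows a quote or `>` is not touched, since it is not preceded by whitespace.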
Poiz