Best PHP Script for Clickable Links

Question

I've found many PHP scripts that convert URLs in text to clickable links, but most of them don't work, and some have serious bugs. Some convert links that are already clickable; others turn parts of the ordinary text into links. I need a script that detects only URLs, not the surrounding text, and that leaves already clickable links alone, because converting them again produces very ugly output.

I found this code, which seems to be the best of those I've tested, but it has a bug: it converts links that are already clickable. For example:

Original:

<a href="http://www.netload.in/dateiySgPP2b14W/1409423417ExpFut.pdf.htm" target="_blank">http://www.netload.in/dateiySgPP2b14W/1409...7ExpFut.pdf.htm</a>

Converted:

http://www.netload.in/dateiySgPP2b14W/1409423417ExpFut.pdf.htm" target="_blank">http://www.netload.in/dateiySgPP2b14W/1409...7ExpFut.pdf.htm

Here is the code:

function parse_urls($text, $maxurl_len = 35, $target = '_self') // Make URLs Clickable
{
    if (preg_match_all('/((ht|f)tps?:\/\/([\w\.]+\.)?[\w-]+(\.[a-zA-Z]{2,4})?[^\s\r\n\(\)"\'<>\,\!]+)/si', $text, $urls))
    {
        $offset1 = ceil(0.65 * $maxurl_len) - 2;

        $offset2 = ceil(0.30 * $maxurl_len) - 1;

        foreach (array_unique($urls[1]) AS $url)
        {
            if ($maxurl_len AND strlen($url) > $maxurl_len)
            {
                $urltext = substr($url, 0, $offset1) . '...' . substr($url, -$offset2);
            }
            else
            {
                $urltext = $url;
            }

            $text = str_replace($url, '<a href="'. $url .'" target="'. $target .'" title="'. $url .'">'. $urltext .'</a>', $text);
        }
    }

    return $text;
}

It might help if you told us what's going wrong and what the desired output should be. The converted link you posted doesn't look like what you want, but your question doesn't give much information about what SHOULD be happening. – Wouter, Jul 26 '12 at 16:08
@Wouter His question gives plenty of information about what SHOULD be happening: he doesn't want the regex to catch links inside `<a>` tags. What I don't understand is his intention: does he want us to help him fix this code, or is he asking us to Google another parser for him? – Palladium, Jul 26 '12 at 16:15
You can see that the converted links are not correct. I need a script that detects only plain-text links, not other parts of the text, and that does not convert links that are already clickable, because the result is very ugly. – Tencho Tenchev, Jul 26 '12 at 16:17
@Palladium I'm showing this code so that someone can try to fix it. But if someone already knows good, working code, it would be fine to replace it. – Tencho Tenchev, Jul 26 '12 at 16:20

Kyle · Accepted Answer · 2012-07-26T17:03:46.510

I just threw this together.

<?php
function replaceUrlsWithLinks($text){
    $dom = new DOMDocument;
    $dom->loadXML($text);
    $xpath = new DOMXpath($dom);
    $query = $xpath->query('//text()[not(ancestor-or-self::a)]');
    foreach($query as $item){
        $content = $item->textContent;
        if(preg_match_all('/((ht|f)tps?:\/\/([\w\.]+\.)?[\w-]+(\.[a-zA-Z]{2,4})?[^\s\r\n\(\)"\'<>\,\!]+)/si',$content,$matches,PREG_SET_ORDER | PREG_OFFSET_CAPTURE)){
            foreach($matches as $match){
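                $newA = $dom->createElement('a');
                // createTextNode() escapes "&", which createElement()'s value argument does not
                $newA->appendChild($dom->createTextNode($match[0][0]));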
                $newA->setAttribute('href',$match[0][0]);
                $newA->setAttribute('target','_blank');
                $a = $item->splitText($match[0][1]);
                $b = $a->splitText(strlen($match[0][0]));
                $a->parentNode->replaceChild($newA,$a);
            }
        }
    }
    return $dom->saveHtml();
}
// The HTML to process ...
$html = <<<HTML
<block>
<a href="http://google.com">http://google.com</a>
<b>Stuff http://google.com</b>
asdf http://google.com ffaa 
</block>
HTML;
// Process the HTML and echo it out.
echo replaceUrlsWithLinks($html);
?>

The output would be:

<block>
<a href="http://google.com">http://google.com</a>
<b>Stuff <a href="http://google.com" target="_blank">http://google.com</a></b>
asdf <a href="http://google.com" target="_blank">http://google.com</a> ffaa 
</block>

You shouldn't use regular expressions to manipulate HTML.

Hope this helps.

Kyle

-- Edit --

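The previous code is more efficient, but if two URLs share the same parent node, it breaks: the first replacement changes the DOM tree, so the offsets recorded for the remaining matches no longer line up. To fix this, you can use this slower version, which re-runs the query after every replacement: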

<?php
function replaceUrlsWithLinks($text){
    $dom = new DOMDocument;
    $dom->loadXML($text);
    $xpath = new DOMXpath($dom);
    while(true){
        $shouldBreak = false;
        $query = $xpath->query('//text()[not(ancestor-or-self::a)]');
        foreach($query as $item){
            $shouldBreak = false;
            $content = $item->textContent;
            if(preg_match_all('/((ht|f)tps?:\/\/([\w\.]+\.)?[\w-]+(\.[a-zA-Z]{2,4})?[^\s\r\n\(\)"\'<>\,\!]+)/si',$content,$matches,PREG_SET_ORDER | PREG_OFFSET_CAPTURE)){
                foreach($matches as $match){
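                    $newA = $dom->createElement('a');
                    // createTextNode() escapes "&", which createElement()'s value argument does not
                    $newA->appendChild($dom->createTextNode($match[0][0]));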
                    $newA->setAttribute('href',$match[0][0]);
                    $newA->setAttribute('target','_blank');
                    $a = $item->splitText($match[0][1]);
                    $b = $a->splitText(strlen($match[0][0]));
                    $a->parentNode->replaceChild($newA,$a);
                    $shouldBreak = true;
                    break;
                }
            }
            if($shouldBreak == true)break;
        }
        if($shouldBreak == true){
            continue;
        }
        else {
            break;
        }
    }
    return $dom->saveHtml();
}

$html = <<<HTML
<block>
<a href="http://google.com">http://google.com</a>
<b>Stuff http://google.com</b>
asdf http://google.com ffaa  http://google.com
</block>
HTML;

echo replaceUrlsWithLinks($html);
?>
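
A possible single-pass alternative to re-running the query, offered as a sketch rather than as part of the answer: handle the matches of each text node from last to first, so that every split leaves the earlier offsets valid. The function name `linkifyTextNodes`, the simplified URL pattern, and the use of `saveXML()` are assumptions made for this illustration. It also assumes well-formed XML input, as the code above does, and that the mbstring extension is available.

```php
<?php
// Sketch only: linkifyTextNodes() is a hypothetical name, not from the answer.
function linkifyTextNodes(string $xml, string $target = '_blank'): string
{
    $dom = new DOMDocument;
    $dom->loadXML($xml); // assumes well-formed XML, like the answer's code
    $xpath = new DOMXPath($dom);

    // Simplified pattern (an assumption): scheme, then any run of "URL-ish" characters.
    $pattern = '~(?:https?|ftps?)://[^\s()"\'<>,!]+~i';

    // Copy the node list into an array so later DOM edits cannot disturb the loop.
    $nodes = iterator_to_array($xpath->query('//text()[not(ancestor::a)]'), false);

    foreach ($nodes as $node) {
        if (!preg_match_all($pattern, $node->nodeValue, $matches, PREG_OFFSET_CAPTURE)) {
            continue;
        }
        // Last match first: splitting at a later offset truncates $node
        // but leaves the text before it, and so the earlier offsets, intact.
        foreach (array_reverse($matches[0]) as [$url, $byteOffset]) {
            // preg offsets are in bytes; splitText() counts characters.
            $charOffset = mb_strlen(substr($node->nodeValue, 0, $byteOffset), 'UTF-8');
            $urlNode = $node->splitText($charOffset);
            $urlNode->splitText(mb_strlen($url, 'UTF-8'));

            $a = $dom->createElement('a');
            $a->appendChild($dom->createTextNode($url)); // escapes "&" in query strings
            $a->setAttribute('href', $url);
            $a->setAttribute('target', $target);
            $urlNode->parentNode->replaceChild($a, $urlNode);
        }
    }

    // Serialize only the root element, without an XML declaration.
    return $dom->saveXML($dom->documentElement);
}
```

Calling `linkifyTextNodes($html)` with the `$html` string from the example above should wrap both bare URLs on the "asdf" line while leaving the existing anchor untouched. The trade-off is that the XPath query runs once instead of once per replacement.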

score 0 · Answer 2 · answered Jul 26 '12 at 16:35

This function wraps text like http://www.domain.com in an anchor tag. What I see here is that you are trying to convert an anchor tag into an anchor tag, which of course won't work. So don't write the anchors in your text, and let the script create them for you.

Dirk McQuickly

The script is automated, and some of the texts already contain tags. It would be great if they didn't, but unfortunately they do. The other way is to remove the anchor tags. – Tencho Tenchev Jul 26 '12 at 16:39
OK, I did not know that. If you go the way of removing the tags, beware of constructions like `<a href="...">click here</a>`: you will have to fetch both the URL and the link text. – Dirk McQuickly Jul 26 '12 at 16:51
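
As a rough illustration of fetching both the URL and the link text before removing anchors, the sketch below collects the two values for every `<a>` element. The function name `extractAnchors` is hypothetical, and `DOMDocument::loadHTML()` is only one way to do this; it is not something the commenters proposed.

```php
<?php
// Sketch only: extractAnchors() is a hypothetical helper for illustration.
function extractAnchors(string $html): array
{
    if (trim($html) === '') {
        return []; // loadHTML() rejects an empty string
    }

    $dom = new DOMDocument;
    // Collect parser warnings about sloppy markup instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $links = [];
    foreach ($dom->getElementsByTagName('a') as $a) {
        $links[] = [
            'href' => $a->getAttribute('href'), // the URL
            'text' => trim($a->textContent),    // the visible link text
        ];
    }
    return $links;
}
```

With both values in hand, a script could decide whether to keep the link text, the URL, or both when it strips the tag.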

score 0 · Answer 3 · edited May 23 '17 at 12:29

You're running into the usual problems that happen when you try to parse HTML with regexes. You need a proper HTML parser. Have a look at this thread.

answered Jul 26 '12 at 16:45

dnagirl
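
To make the "proper HTML parser" suggestion concrete, here is a minimal sketch that uses PHP's bundled `DOMDocument::loadHTML()`, which tolerates markup that is not well-formed XML, to list the text that sits outside any existing link. The function name `textOutsideAnchors` is an assumption for illustration, and this is not necessarily the approach described in the linked thread.

```php
<?php
// Sketch only: textOutsideAnchors() is a hypothetical helper for illustration.
function textOutsideAnchors(string $html): array
{
    if (trim($html) === '') {
        return []; // loadHTML() rejects an empty string
    }

    $dom = new DOMDocument;
    // Collect parser warnings about sloppy markup instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($dom);
    $texts = [];
    // Select every text node that has no <a> element among its ancestors.
    foreach ($xpath->query('//text()[not(ancestor::a)]') as $node) {
        $value = trim($node->nodeValue);
        if ($value !== '') {
            $texts[] = $value;
        }
    }
    return $texts;
}
```

Only the strings returned here are candidates for URL detection, so text that is already inside an anchor is never matched a second time.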
