
I want to find all instances of a link within a string using regex. I only need to return a count of the links, not the values themselves.

Welcome to WordPress. This is your first post. Edit or delete it, then start blogging!
<a href="www.google.co.uk">www.google.co.uk</a>
<a href="www.google.co.uk">www.google.co.uk</a>
<a href="www.google.co.uk">www.google.co.uk</a>

Thanks

Kieran Headley
  • What's your expected output for the above example? – Avinash Raj Jul 17 '14 at 08:31
  • Have you tried to do anything? – hindmost Jul 17 '14 at 08:31
  • You need to define "link" more clearly. Is it an anchor tag on the page? Is it a link to another document? What if it is simply text that is a URL but not a link? – Fluffeh Jul 17 '14 at 08:32
  • What is the number you'd expect from your example snippet? 1, 3 or 6? In case of 3 you should probably use `DOMDocument` with `DOMXPath` to query `//a[@href]` instead of regex. – Kris Jul 17 '14 at 08:32

3 Answers

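    // Match anchor tags with a double-quoted href; i ignores case, U makes quantifiers lazy
    $re = '/<a href="([^"]*)">(.*)<\/a>/iU';
    $str = "Welcome to WordPress. This is your first post. Edit or delete it, then start blogging!\n<a href=\"www.google.co.uk\">www.google.co.uk</a>\n<a href=\"www.google.co.uk\">www.google.co.uk</a>\n<a href=\"www.google.co.uk\">www.google.co.uk</a>";

    // preg_match_all() returns the number of full pattern matches
    $count = preg_match_all($re, $str, $matches);

    echo $count; // 3

This captures all the links in the example and prints 3. Use the return value of preg_match_all() (or count($matches[0])) for the count: count($matches) counts the capture-group arrays (the full match plus the two groups), so it equals 3 here only by coincidence.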

t3chguy

You can also count or modify the links on the front end with JavaScript. The following function replaces URLs in a string with anchor tags; it comes from the Stack Overflow question "Detect URLs in Text".

    function urlify(text) {
        var urlRegex = /(https?:\/\/[^\s]+)/g;
        return text.replace(urlRegex, function(url) {
            return '<a href="' + url + '">' + url + '</a>';
        });
        // or alternatively
        // return text.replace(urlRegex, '<a href="$1">$1</a>')
    }
    
    var text = "Find me at http://www.example.com and also at http://stackoverflow.com";
    var html = urlify(text);
    
    // html now looks like:
    // "Find me at <a href="http://www.example.com">http://www.example.com</a> and also at <a href="http://stackoverflow.com">http://stackoverflow.com</a>"
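
Since the question asks for a count rather than a replacement, here is a minimal sketch of how the same pattern could be used for counting. The helper names `countUrls` and `countAnchors` are hypothetical (not part of the original answer), and the anchor pattern is an assumption that only looks for `<a ... href=` and does not validate the markup.

```javascript
// Hypothetical helper: counts http(s) URLs in plain text without modifying it.
function countUrls(text) {
    // Same pattern as urlify; the g flag makes match() return every hit.
    var urlRegex = /(https?:\/\/[^\s]+)/g;
    var matches = text.match(urlRegex);
    // match() returns null when nothing matches, so fall back to 0.
    return matches ? matches.length : 0;
}

// Hypothetical helper: counts opening <a> tags that carry an href attribute.
function countAnchors(html) {
    var anchorRegex = /<a\s[^>]*\bhref\s*=/gi;
    var matches = html.match(anchorRegex);
    return matches ? matches.length : 0;
}

var sample = "Find me at http://www.example.com and also at http://stackoverflow.com";
console.log(countUrls(sample)); // 2

var post = 'Welcome to WordPress.\n' +
    '<a href="www.google.co.uk">www.google.co.uk</a>\n' +
    '<a href="www.google.co.uk">www.google.co.uk</a>\n' +
    '<a href="www.google.co.uk">www.google.co.uk</a>';
console.log(countAnchors(post)); // 3
console.log(countUrls(post));    // 0
```

Note that `countUrls` returns 0 for the snippet in the question, because those links have no `http://` or `https://` scheme; counting the anchor tags is probably closer to what was asked.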

Here is the PHP variant, which replaces the links and counts them at the same time:

    // The regular expression filter
    $reg_exUrl = "/(http|https|ftp|ftps)\:\/\/[a-zA-Z0-9\-\.]+\.[a-zA-Z]{2,3}(\/\S*)?/";

    // The text you want to filter for URLs
    $text = "The text you want to filter goes here. http://google.com";

    // Turn every URL into a hyperlink. $0 is the full match, and the fifth
    // argument receives the number of replacements that were made.
    // If the text contains no URLs, it is returned unchanged.
    echo preg_replace($reg_exUrl, '<a href="$0">$0</a>', $text, -1, $matchCounter);

    // $matchCounter now holds the number of URLs found in the string

Kind regards
Jan Biasi


I'd probably use something similar to this instead of regex:

    // $input is the HTML string to inspect
    $dom = new DOMDocument();
    $dom->loadHTML($input);
    $xpath = new DOMXPath($dom);

    // Select every <a> element that has an href attribute (note the @)
    $links = $xpath->query('//a[@href]');

    printf("found %d links\n", $links->length);

    foreach ($links as $link) {
        printf("link to: %s\n", $link->getAttribute('href'));
    }

Note that I typed this in the SO textbox, so it may very well contain bugs.

Kris