I have a form that I am validating with JS on the front-end and PHP on the server side. What I need is a way to reliably count the number of links in an HTML string. The best way that I could think of was to count the closing tags. However, simply searching for the literal string </a> will not work, because the user could circumvent the validation by adding spaces like so: </a >.

I am fairly new to regex and this is the pattern that I have been able to come up with so far:

<[ \n\t]*\/[ \n\t]*a[ \n\t]*>

In Javascript:

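function link_count(s){
    // match() takes only the pattern and returns null when nothing matches, so fall back to an empty array
    return (s.match(/<[ \n\t]*\/[ \n\t]*a[ \n\t]*>/g) || []).length;
}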

In PHP:

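function count_links($str){
    // Explicit ~ delimiters: with '<...>' PHP would treat the angle brackets as the delimiters, not as part of the pattern
    return preg_match_all('~<[ \n\t]*/[ \n\t]*a[ \n\t]*>~', $str, $matches);
}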

Is this the best approach? Will it affect the performance of my form (the html string could be very long)? I am looking for the most efficient and reliable solution.

Thanks in advance.

Steve
Hasan Akhtar

1 Answer

As @sgroves said, not every </a> closes a link, so checking for href might be more useful.
Also, why not check the opening tag directly? I tried searching for <a .... href>

You might use the 's' modifier so that '.' also matches newlines:

/<\s*\ba\b.*?href/gs

http://regex101.com/r/bG8lN1/3
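
For what it's worth, here is a minimal sketch of how that pattern could be turned into a counter in JavaScript. The function name countLinks is made up for illustration, the 'i' flag is my own addition (an assumption, to catch uppercase tags), and the 's' flag needs an engine with ES2018 support. Treat it as a rough count rather than a parser: the lazy '.*?' can run past the end of a tag, so an <a> without href followed later by the text "href" would still be counted.

```javascript
// Hypothetical helper: counts opening <a ...> tags that are followed by "href".
function countLinks(html) {
    // Pattern from the answer above; `s` lets `.` match newlines,
    // `i` is an added assumption so that <A HREF=...> is counted too.
    const pattern = /<\s*\ba\b.*?href/gis;
    const matches = html.match(pattern);
    // match() returns null (not an empty array) when there is no match.
    return matches === null ? 0 : matches.length;
}
```

Called on a string with two anchors that both have an href, one of them split across lines, this should give 2; closing tags such as </a > are not counted because the pattern requires the 'a' right after the optional whitespace following '<'.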

lcoderre