1

Duplicate: PHP validation/regex for URL

My goal is to create a PHP regex for website names. The regex is for a lead-gathering form and should accept any legitimate website name syntax that someone might enter. After an exhaustive search, I'm surprised that I can't find one out there.

Here are the regex matches that I'm looking for:

AND, it should also match:

  • any of the above with a trailing slash, such as: somewebsite.com/
  • subdomains
Community
edt

5 Answers

9

No RegEx necessary.

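$subject = 'example.com';
// FILTER_VALIDATE_URL requires a scheme, so prepend one only when the input
// does not already start with http:// or https://.
$hasScheme = (stripos($subject, 'http://') === 0 || stripos($subject, 'https://') === 0);
$part = $hasScheme ? '' : 'http://';
var_dump(filter_var($part.$subject, FILTER_VALIDATE_URL));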
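To check this approach against the kinds of input the question lists (with or without a scheme, with `www`, with a trailing slash, with a subdomain), it can be wrapped in a small helper. This is only a sketch: the function name `isPlausibleWebsite` and the extra "host must contain a dot" rule are assumptions added for illustration, not part of the original answer.

```php
<?php
// Hypothetical helper (name and dot rule are assumptions, not from the answer).
// It normalizes user input and validates it with filter_var, without a regex.
function isPlausibleWebsite(string $input): bool
{
    $candidate = trim($input);
    if ($candidate === '') {
        return false;
    }

    // FILTER_VALIDATE_URL requires a scheme, so add one when it is missing.
    if (stripos($candidate, 'http://') !== 0 && stripos($candidate, 'https://') !== 0) {
        $candidate = 'http://' . $candidate;
    }

    if (filter_var($candidate, FILTER_VALIDATE_URL) === false) {
        return false;
    }

    // Assumed extra rule: require a dot in the host so that bare words
    // such as "localhost" are rejected on a public lead form.
    $host = parse_url($candidate, PHP_URL_HOST);
    return is_string($host) && strpos($host, '.') !== false;
}

// Usage with the forms mentioned in the question.
foreach (['http://www.somewebsite.com', 'somewebsite.com/', 'blog.somewebsite.com'] as $site) {
    echo $site . ' => ' . (isPlausibleWebsite($site) ? 'ok' : 'rejected') . "\n";
}
```

Keep in mind that filter_var only checks syntax; it does not confirm that the site actually exists.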
Milan Babuškov
mandaleeka
3

You might need to tweak it:

<?php

$pattern = '/^(([\w]+:)?\/\/)?(([\d\w]|%[a-fA-F\d]{2,2})+(:([\d\w]|%[a-fA-F\d]{2,2})+)?@)?([\d\w][-\d\w]{0,253}[\d\w]\.)+[\w]{2,4}(:[\d]+)?(\/([-+_~.\d\w]|%[a-fA-F\d]{2,2})*)*(\?(&?([-+_~.\d\w]|%[a-fA-F\d]{2,2})=?)*)?(#([-+_~.\d\w]|%[a-fA-F\d]{2,2})*)?$/';

$url1  = "http://www.somewebsite.com";
$url2  = "https://www.somewebsite.com";
$url3  = "https://somewebsite.com";
$url4  = "www.somewebsite.com";
$url5  = "somewebsite.com";

function valURL($pattern, $url) {

        $return = false;

        if(preg_match($pattern, $url)) {
                $return = true;
        }

        if($return == true) {
                echo "Match URL: <font color='green'>" . $url . "</font><br /><br />";
        } else {
                echo "Try Again: <font color='red'>URL: " . $url . "</font><br /><br />";
        }
}

valURL($pattern, $url1);
valURL($pattern, $url2);
valURL($pattern, $url3);
valURL($pattern, $url4);
valURL($pattern, $url5);

?>
Phill Pafford
  • @PhillPafford I linked your answer in this question: http://stackoverflow.com/a/23567981/976775 Thank you for this Regexp! – MrYoshiji May 09 '14 at 15:07
3

I decided to benchmark the answers here to show that regular expressions are not the answer for such simple tasks. Andy Leekman's code is a whole 30% to 60% quicker than the other answers. It did have a bug, but I fixed that with one line of code. You can view my results below.

Here's the code on which the tests ran.

http://pastie.org/476900

Benchmark results (image): http://img254.imageshack.us/img254/7821/capturevzh.png

PS: If anyone else uses a regular expression to validate a URL I might go mad ;)
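For anyone who wants to repeat this kind of comparison locally, a micro-benchmark along these lines might do. It is a rough sketch, not the code from the pastie link: the helper `timeIt`, the simplified stand-in pattern, and the iteration count are all assumptions, and the numbers will vary by machine and PHP version.

```php
<?php
// Hypothetical micro-benchmark; helper name, pattern and iteration count
// are assumptions chosen for illustration only.
function timeIt(callable $fn, int $iterations): float
{
    $start = microtime(true);
    for ($i = 0; $i < $iterations; $i++) {
        $fn();
    }
    // Elapsed wall-clock time in seconds.
    return microtime(true) - $start;
}

$url = 'http://www.somewebsite.com';
// Simplified stand-in pattern, not one of the patterns posted in this thread.
$pattern = '~^https?://([a-z0-9-]+\.)+[a-z]{2,}/?$~i';
$iterations = 100000;

$regexSeconds = timeIt(function () use ($pattern, $url) {
    preg_match($pattern, $url);
}, $iterations);

$filterSeconds = timeIt(function () use ($url) {
    filter_var($url, FILTER_VALIDATE_URL);
}, $iterations);

printf("preg_match: %.4fs\nfilter_var: %.4fs\n", $regexSeconds, $filterSeconds);
```

Run it several times and compare the trend rather than a single measurement.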

The Pixel Developer
-1
/^([a-z0-9]([-a-z0-9]*[a-z0-9])?\.)+((a[cdefgilmnoqrstuwxz]|aero|arpa)|(b[abdefghijmnorstvwyz]|biz)|(c[acdfghiklmnorsuvxyz]|cat|com|coop)|d[ejkmoz]|(e[ceghrstu]|edu)|f[ijkmor]|(g[abdefghilmnpqrstuwy]|gov)|h[kmnrtu]|(i[delmnoqrst]|info|int)|(j[emop]|jobs)|k[eghimnprwyz]|l[abcikrstuvy]|(m[acdghklmnopqrstuvwxyz]|mil|mobi|museum)|(n[acefgilopruz]|name|net)|(om|org)|(p[aefghklmnrstwy]|pro)|qa|r[eouw]|s[abcdeghijklmnortvyz]|(t[cdfghjklmnoprtvwz]|travel)|u[agkmsyz]|v[aceginu]|w[fs]|y[etu]|z[amw])$/i

http://www.shauninman.com/archive/2006/05/08/validating_domain_names

Courtesy of Google. It is VERY complex though, so someone else might have a simpler one.

EDIT: Try Andy's answer first. If you can find an alternative to a regex, 9 times out of 10 the alternative is much better.

Macha
-1
^(https?://)?(([0-9a-z_!'().&=$%-]: )?[0-9a-z_!'().&=$%-]@)?(([0-9]{1,3}\.){3}[0-9]{1,3}|([0-9a-z_!'()-]\.)([0-9a-z][0-9a-z-]{0,61})?[0-9a-z]\.[a-z]{2,6})(:[0-9]{1,4})?((/?)|(/[0-9a-z_!*'().;?:@&=$,%#-])/?)$
Phill Pafford
  • I can't get your code to work. Can you provide a simple usage example? – edt May 11 '09 at 16:42
  • $pattern = /^(https?://)?(([0-9a-z_!'().&=$%-]: )?[0-9a-z_!'().&=$%-]@)?(([0-9]{1,3}\.){3}[0-9]{1,3}|([0-9a-z_!'()-]\.)([0-9a-z][0-9a-z-]{0,61})?[0-9a-z]\.[a-z]{2,6})(:[0-9]{1,4})?((/?)|(/[0-9a-z_!*'().;?:@&=$,%#-])/?)$/ – Phill Pafford May 11 '09 at 16:50
  • Sorry, but still not working for me. This is what I am trying. Any suggestion? $some_url = 'http://some-url.com'; $pattern = "/^(https?://)?(([0-9a-z_!'().&=$%-]: )?[0-9a-z_!'().&=$%-]@)?(([0-9]{1,3}\.){3}[0-9]{1,3}|([0-9a-z_!'()-]\.)([0-9a-z][0-9a-z-]{0,61})?[0-9a-z]\.[a-z]{2,6})(:[0-9]{1,4})?((/?)|(/[0-9a-z_!*'().;?:@&=$,%#-])/?)$/"; if(preg_match($pattern, $some_url)) { echo "valid"; } else { echo "invalid"; } – edt May 11 '09 at 17:38
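One likely reason the snippet in the last comment fails: the pattern uses `/` as its delimiter but also contains unescaped `/` characters (in `https?://`), so PCRE treats the pattern as ending early and reads the rest as modifiers. The sketch below shows the effect with a deliberately shortened, hypothetical pattern rather than the full one from this answer. The pattern as posted may also have lost some quantifiers, so changing the delimiter alone might not be enough to make it match.

```php
<?php
// Minimal sketch with a shortened, hypothetical pattern (not the full one above).
$someUrl = 'http://some-url.com';

// "/" is the delimiter here, so the pattern ends at the first unescaped "/"
// inside "https?://" and the remainder is parsed as modifiers.
// preg_match() emits a warning (suppressed with @) and returns false.
$broken = @preg_match('/^(https?://)?[a-z0-9.-]+$/i', $someUrl);

// Using "~" as the delimiter lets "/" appear unescaped inside the pattern.
$fixed = preg_match('~^(https?://)?[a-z0-9.-]+$~i', $someUrl);

var_dump($broken); // bool(false): the pattern failed to compile
var_dump($fixed);  // int(1): the pattern compiled and matched
```

Note that preg_match() returning false means an error, which is different from 0 (no match), so checking the result with `=== false` during debugging helps tell the two apart.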