
I have a simple form where a URL should be entered, and I would like to use whatever works best, a regex or anything else, to check whether it is a valid URL.

I know a zillion questions about this have already been posted, but most of them are very old, predating the new TLDs, and others fail for ftp:// and https:// URLs.

I am hoping for an answer that can really cover URL forms such as:

google.com
www.google.com
http://google.com
http://www.google.com
https://google.com
https://www.google.com
ftp://google.com

~ Thanks, and once again sorry for posting a duplicate question; I am just hoping for as up-to-date an answer as possible.

Mustofa Rizwan
Reham Fahmy

3 Answers


This might not be a job for regexes, but for existing tools in your language of choice. Regexes are not a magic wand you wave at every problem that happens to involve strings. You probably want to use existing code that has already been written, tested, and debugged.

In PHP, use the parse_url function.

Perl: URI module.

Ruby: URI module.

.NET: 'Uri' class

Andy Lester

The usage for parse_url() is below, but @wrikken brings up a much better way to simply validate if a URL is 'valid' or not with filter_var(). parse_url() simply parses a specified URL string into its component parts, and will apparently not return a false value unless the URL is catastrophically broken.

filter_var() is sensitive enough that it will detect something minor like an underscore used in a domain name.

var_dump(
  filter_var(
    'http://stack-overflow.com/questions/19437105/using-regx-how-to-validate-url?noredirect=1#comment28819663_19437105',
     FILTER_VALIDATE_URL
  )
);

//output: string(113) "http://stack-overflow.com/questions/19437105/using-regx-how-to-validate-url?noredirect=1#comment28819663_19437105"

var_dump(
  filter_var(
    'http://stack_overflow.com/questions/19437105/using-regx-how-to-validate-url?noredirect=1#comment28819663_19437105',
    FILTER_VALIDATE_URL
  )
);

//output: bool(false)
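
Because FILTER_VALIDATE_URL expects a scheme, bare inputs such as google.com or www.google.com from the question would be rejected as they stand. A minimal sketch of one way around that, assuming it is acceptable to default to http when no scheme is given (the function name and the default scheme are hypothetical, not part of PHP):

```php
<?php
// Hypothetical helper: returns the validated URL string, or null if invalid.
// Assumption: input without a "scheme://" prefix is meant as an http URL.
function normalize_and_validate_url(string $input, string $defaultScheme = 'http'): ?string
{
    $candidate = trim($input);

    // Prepend the default scheme when the input has no "scheme://" prefix.
    if (!preg_match('#^[a-z][a-z0-9+.\-]*://#i', $candidate)) {
        $candidate = $defaultScheme . '://' . $candidate;
    }

    // filter_var() returns the URL on success and false on failure.
    $result = filter_var($candidate, FILTER_VALIDATE_URL);

    return $result === false ? null : $result;
}

// Try a few of the forms listed in the question.
foreach (['google.com', 'www.google.com', 'https://google.com', 'ftp://google.com'] as $url) {
    var_dump(normalize_and_validate_url($url));
}
```

This sketch makes no attempt to repair malformed prefixes (a missing colon, for example), so inputs like that should be tested separately before relying on it.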

parse_url() would be better left to extracting portions of a URL that you already know is valid:

var_dump(parse_url('http://stackoverflow.com/questions/19437105/using-regx-how-to-validate-url?noredirect=1#comment28819663_19437105'));

Output:

array(5) {
  ["scheme"]=>
  string(4) "http"
  ["host"]=>
  string(17) "stackoverflow.com"
  ["path"]=>
  string(50) "/questions/19437105/using-regx-how-to-validate-url"
  ["query"]=>
  string(12) "noredirect=1"
  ["fragment"]=>
  string(24) "comment28819663_19437105"
}

Or how about:

Sammitch

Regexes are handy but expensive; still, if you want one for validating URLs:

^((ht|f)tp(s?)\:\/\/|~\/|\/)?([\w]+:\w+@)?([a-zA-Z]{1}([\w\-]+\.)+([\w]{2,5}))(:[\d]{1,5})?\/?(\w+\.[\w]{3,4})?((\?\w+=\w+)?(&\w+=\w+)*)?
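
To try the pattern from PHP, it can be passed to preg_match(). The sketch below is only an illustration: the function name is hypothetical, and the choice of `#` as the delimiter is an assumption made because the pattern itself contains both `/` and `~`.

```php
<?php
// Hypothetical wrapper around the pattern above, using "#" as the delimiter.
function matches_url_pattern(string $url): bool
{
    $pattern = '#^((ht|f)tp(s?)\:\/\/|~\/|\/)?([\w]+:\w+@)?([a-zA-Z]{1}([\w\-]+\.)+([\w]{2,5}))(:[\d]{1,5})?\/?(\w+\.[\w]{3,4})?((\?\w+=\w+)?(&\w+=\w+)*)?#';

    // preg_match() returns 1 on a match, 0 on no match, false on error.
    return preg_match($pattern, $url) === 1;
}

foreach (['google.com', 'ftp://google.com', 'http//google.com', 'google.com/<script>'] as $url) {
    var_dump(matches_url_pattern($url));
}
```

Two things are worth checking before relying on it. The pattern has no trailing `$`, so it only requires a valid-looking prefix and will accept input with arbitrary text after the host. And because `\w` includes the underscore, a host such as stack_overflow.com matches here even though filter_var() rejects it.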
revo