
I need a regular expression that would let me validate a URL. I found this one:

"^(http(?:s)?\:\/\/[a-zA-Z0-9\-]+(?:\.[a-zA-Z0-9\-]+)*\.[a-zA-Z]{2,6}(?:\/?|(?:\/[\w\-]+)*)(?:\/?|\/\w+\.[a-zA-Z]{2,4}(?:\?[\w]+\=[\w\-]+)?)?(?:\&[\w]+\=[\w\-]+)*)$"

My problem is that, besides regular URLs, I need to allow .cgi paths without "http(s)", like this:

123.45.678.543:30/cgi-bin/blah/blah.cgi

Could you please help me figure out how I could add this to the expression above? Thanks.

David Shochet
  • Your example URL is invalid, but I think I've seen it on CSI Miami. The regex also rejects legitimate URLs with IPv6 literals. – Flexo Jul 02 '12 at 18:39
  • But it works for me somehow... I mean, for validating URLs only. Can you tell me what is wrong with it? – David Shochet Jul 02 '12 at 18:41
  • You can't have an IP address with an octet > 255. The point I was trying to make is that using a regex is really hard and there are lots of bad ones out there. Even if you make one that works, someone could still give you an invalid domain or a link that returns 404/403 etc. If you really care about seeing if a URL is valid, make a call out. You can make a real HEAD request to check that it works. – Flexo Jul 02 '12 at 18:45
  • Oh, I thought you were talking about the regular expression. Yes, I just made up that IP address, so it is not real. I think for me it would be enough to check if the format is valid, not if the address really exists. – David Shochet Jul 02 '12 at 18:49
  • You may want to take a look at a page I wrote: [Regular Expression URI Validation](http://jmrware.com/articles/2009/uri_regexp/URI_regex.html). Be sure to try double-clicking on some of the regex patterns to get language-specific code snippets! Also, you may want to look at a specific URL validation function presented in [my answer](http://stackoverflow.com/a/5268056/433790) to the validation question cited earlier. – ridgerunner Jul 02 '12 at 19:07

1 Answer


It does not seem to me that the RegExp you posted is general enough to validate arbitrary URLs. Have a look at "What is the best regular expression to check if a string is a valid URL?" (especially @eyelidlessness's answer) to find a more elaborate one.

The problem with your URL is not that it points to a cgi file. The problem is that, at least in the case you are probably talking about, it should include the scheme:

http://123.45.678.543:30/cgi-bin/blah/blah.cgi

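Besides that, each number (octet) in an IPv4 address can be at most 255. Even with these corrections, the URL would still not match the RegExp you posted: the pattern has no provision for a port such as :30, and it requires the host to end in a 2 to 6 letter alphabetic label, which a numeric IP address does not have. That is one more hint that the RegExp is faulty.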
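If a format-only check is enough, one option is to keep the regex loose and validate the pieces separately. The following is only a minimal sketch in Python, not a complete URL validator: the pattern, the name `looks_like_url`, and the decision to treat the scheme as optional and to require at least two host labels are all assumptions made for illustration. It hands the octet check to the standard `ipaddress` module instead of encoding the 0-255 rule in the regex.

```python
import ipaddress
import re

# Hypothetical, deliberately loose pattern (an assumption, not a standard):
# optional http(s) scheme, host, optional port, optional path, optional query.
URL_RE = re.compile(
    r"(?:https?://)?"              # scheme is optional, as the question asks
    r"(?P<host>[A-Za-z0-9.-]+)"    # hostname or dotted IPv4 literal
    r"(?::(?P<port>[0-9]{1,5}))?"  # optional port, e.g. :30
    r"(?:/[\w./-]*)?"              # optional path, e.g. /cgi-bin/blah/blah.cgi
    r"(?:\?[\w=&-]*)?"             # optional simple query string
)

# One hostname label: letters, digits, inner hyphens.
LABEL_RE = re.compile(r"[A-Za-z0-9](?:[A-Za-z0-9-]*[A-Za-z0-9])?")


def looks_like_url(text):
    """Return True if text has a plausible URL format (no network access)."""
    m = URL_RE.fullmatch(text)
    if m is None:
        return False

    port = m.group("port")
    if port is not None and not 0 < int(port) <= 65535:
        return False

    host = m.group("host")
    if re.fullmatch(r"[0-9.]+", host):
        # Purely numeric host: must be a valid IPv4 address (octets <= 255).
        try:
            ipaddress.IPv4Address(host)
        except ValueError:
            return False
        return True

    # Otherwise require dot-separated labels ending in an alphabetic label.
    labels = host.split(".")
    return (
        len(labels) >= 2
        and all(LABEL_RE.fullmatch(label) for label in labels)
        and labels[-1].isalpha()
    )
```

With this sketch, an address such as 123.45.67.54:30/cgi-bin/blah/blah.cgi is accepted with or without the http:// prefix, while the made-up 123.45.678.543 is rejected because of its octets. It does not handle IPv6 literals, user info, or fragments, and it does not check that the address actually exists.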

Jakob S.