Possible Duplicate:
Regular expression for URL validation (in JavaScript)

I've seen many similar questions and answers but can't find a solution that fits my specific needs.

I'm terrible at regexes and am struggling to write a simple regex that validates the following URL forms:

domain.com
domain.com/folder
subdomain.domain.com
subdomain.domain.com/folder

Validating an optional http:// or http://www. prefix would also be super helpful. Thanks!

mike
  • What's to say `somedomain.sometld` is a valid URL? Be careful with this. The reason you're not finding much is that this is a difficult problem with a tricky balance between linking to too much vs. too little. If it were me, I'd cast a wide net, construct a valid URL, and hit those URLs server-side to see if they exist before auto-linking. – Brad Jan 31 '12 at 04:57
  • *"can't find a solution that fits my specific needs"* ... How exactly do your needs differ from the millions of examples on the interwebs (including stack overflow) demonstrating how to validate URLs? –  Jan 31 '12 at 04:57
  • I think I can make this easier. How about a regex that doesn't allow more than one . or / between alphanumeric characters? E.g. domain..com should fail. I also agree with Brad that there is no definitive solution, since URLs have very broad rules. I'm just trying to find basic validation for entering URLs. – mike Jan 31 '12 at 08:55

1 Answer

The closest I can get is:

/[a-z]+:\/\/(([a-z0-9][a-z0-9-]+\.)*[a-z][a-z]+|(0x[0-9A-F]+)|[0-9.]+)\/.*/

Note that your question hasn't limited URLs to a set of protocols, TLDs or character sets.
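To see what that pattern accepts, here is a small sketch that runs it against a few inputs. The variable names and the sample list are only illustrative. Note that, as written, the pattern requires a scheme and a slash after the host, and it is case-sensitive, so the bare forms from the question do not match.

```javascript
// The pattern from the answer, unchanged.
const anyUrl = /[a-z]+:\/\/(([a-z0-9][a-z0-9-]+\.)*[a-z][a-z]+|(0x[0-9A-F]+)|[0-9.]+)\/.*/;

// Illustrative inputs: forms from the question plus variants with a scheme.
const samples = [
  "domain.com",                   // no scheme: no match
  "subdomain.domain.com/folder",  // no scheme: no match
  "http://domain.com",            // no slash after the host: no match
  "http://domain.com/",           // match
  "http://www.domain.com/folder", // match
  "gopher://localhost/",          // match
];

for (const s of samples) {
  console.log(s, anyUrl.test(s));
}
```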

Something like skype://18005551212 or gopher://localhost is a valid URL. Heck, depending on what you're using to browse, the following might all be valid ways to get to the same server (though not quite the same virtualhost):

They all work for me in Firefox.

If you want further restrictions, determine WHAT they are. Are you willing to sacrifice valid protocols? Are you really only interested in one or two protocols?
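As one possible narrowing, the sketch below assumes you only care about http and https, that the scheme is optional, and that hostnames are plain ASCII labels ending in an alphabetic TLD. The name `isBasicWebUrl` and these restrictions are assumptions for illustration, not something the question pins down. The path is only checked loosely (anything without whitespace).

```javascript
// Hypothetical pattern: optional http:// or https://, one or more dotted
// hostname labels, an alphabetic TLD of two or more letters, optional path.
const basicWebUrl =
  /^(https?:\/\/)?([a-z0-9]([a-z0-9-]*[a-z0-9])?\.)+[a-z]{2,}(\/\S*)?$/i;

// Hypothetical helper: trims surrounding whitespace before testing.
function isBasicWebUrl(input) {
  return basicWebUrl.test(input.trim());
}

console.log(isBasicWebUrl("domain.com"));                   // true
console.log(isBasicWebUrl("subdomain.domain.com/folder"));  // true
console.log(isBasicWebUrl("http://www.domain.com"));        // true
console.log(isBasicWebUrl("domain..com"));                  // false
console.log(isBasicWebUrl("gopher://localhost"));           // false
```

This deliberately rejects things like IP addresses, `localhost`, and other protocols, which is exactly the trade-off you need to decide on.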

A more specific question will get you a more specific answer.

ghoti
  • Thanks for the initial answer. Just common alphanumeric URLs will be enough. This is just for customer input regarding their website. Thanks. – mike Jan 31 '12 at 08:30
  • I also made a comment above about possibly making this regex simpler. Maybe only check for at least one period, and don't allow more than one period or forward slash in a row. The truth is we've recommended that our customers leave off the http:// and www parts, but you never know what a user will enter. Thanks again. – mike Jan 31 '12 at 18:05
  • Interesting. We always recommend leaving the http:// in place to ensure proper matching of strings in email clients, blog software, etc. Leaving off "www" usually isn't a problem these days; most places will at least provide a redirect if the IP record on the domain itself is not the same as the web server. (Note that the Address record on the domain may also be used for SMTP if MX hosts can't be found.) – ghoti Jan 31 '12 at 18:24
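
The looser rule mike describes in the comments (at least one period, and never two periods or slashes in a row) could be sketched roughly as follows. The function name `looksLikeUrl` is hypothetical, and stripping an optional http:// or https:// first is an assumption made so that the scheme's own "//" does not trip the check.

```javascript
// Hypothetical helper for the simplified rule from the comments.
function looksLikeUrl(input) {
  // Remove an optional http:// or https:// prefix (assumption: only these
  // two schemes are expected from customers).
  const rest = input.trim().replace(/^https?:\/\//i, "");

  // Reject empty input and anything containing whitespace.
  if (rest === "" || /\s/.test(rest)) return false;

  const hasPeriod = rest.includes(".");
  // Any run of two or more periods and/or slashes, e.g. "..", "//", "./".
  const hasRepeat = /[.\/]{2,}/.test(rest);

  return hasPeriod && !hasRepeat;
}

console.log(looksLikeUrl("http://www.domain.com/folder")); // true
console.log(looksLikeUrl("domain..com"));                  // false
```

This is basic input sanity checking rather than real URL validation: it says nothing about whether the host exists.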