I want to validate domain names using a regex. Here is my regex so far:

/^(([a-zA-Z]{1})|([a-zA-Z]{1}[a-zA-Z]{1})|([a-zA-Z]{1}[0-9]{1})|([0-9]{1}[a-zA-Z]{1})|([a-zA-Z0-9][a-zA-Z0-9-]{1,61}[a-zA-Z0-9])\.[a-zA-Z]{2,3})$/;

The expression above also allows consecutive dashes in the name, such as this--is.com, which is obviously wrong. How can this be fixed?

I only need to validate the simplest form of a domain, like google.com; protocols, subdomains, etc. are not allowed.

So these will be ok:

google.com
yahoo-new.com
newdomain.travel

And not OK:

http://google.com
www.yahoo.com
http://www.blah.com
behz4d
  • possible duplicate of [Regular Expression for validating DNS label ( host name)](http://stackoverflow.com/questions/2063213/regular-expression-for-validating-dns-label-host-name) – Barmar Apr 19 '14 at 18:02
  • @Barmar have you read the question? it has nothing to do with that.. – behz4d Apr 19 '14 at 18:03
  • The answer there provides the regexp for everything between the `.` characters. So all you have to do is allow repetition with `.` between them. – Barmar Apr 19 '14 at 18:05
  • @Barmar how about strings starting with `dash`!? it's not a duplicate, I've already read that – behz4d Apr 19 '14 at 18:07
  • Why do you think multiple (successive) dashes are "obviously wrong"? – Bergi Apr 19 '14 at 18:07
  • @Bergi Am I wrong? `te----------st.com` is a valid domain? – behz4d Apr 19 '14 at 18:08
  • Your regexp doesn't allow strings beginning with numbers. – Barmar Apr 19 '14 at 18:08
  • @Barmar just tested `13-f.com` which it allowed – behz4d Apr 19 '14 at 18:10
  • Your regexp is very confusing, I don't understand why you have so many different alternatives for the first and second characters. – Barmar Apr 19 '14 at 18:15
  • And you could simplify it a lot if you used the `i` modifier to make it case insensitive, instead of repeating `A-Za-z` throughout. – Barmar Apr 19 '14 at 18:15
  • @behz4d: While it is definitely uncommon (unless in [IDNA](https://en.wikipedia.org/wiki/Internationalized_domain_name#ToASCII_and_ToUnicode)), [RFC 1035](https://tools.ietf.org/html/rfc1035#page-8) does allow them. I only found that requirement in [this question](http://stackoverflow.com/q/20954756/1048572) as well, where it was commented on to be wrong. – Bergi Apr 19 '14 at 18:26
  • @Bergi I've been surfing the internet for almost a decade and haven't seen a single domain with repeated dashes, that's why I say it's obviously wrong... have you seen any!? – behz4d Apr 19 '14 at 18:34
  • @behz4d: While I've been surfing the internet longer than you, I haven't seen a domain that consists largely of digits, yet *it is still valid*. Btw, an example for dashes: http://xn--h-0gaa.idn.swznet.de (German test site, with an `ö` in the name) – Bergi Apr 19 '14 at 18:43

1 Answer

behz4d, here is a simple expression that does what you want. But we may want to tweak it (see below.)

^[a-zA-Z]+(?:-?[a-zA-Z\d])+\.[a-zA-Z]{2,6}$

It matches

google.com
yahoo-new.com
newdomain.travel
this-is-it.com

But not

this--it.com [per your requirement]
http://google.com
www.yahoo.com
http://www.blah.com
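
The two lists above can be checked with a few lines of JavaScript. This is only a sketch: the pattern is the expression given above, while the names `domainPattern`, `shouldMatch` and `shouldFail` are made up for illustration.

```javascript
// The expression from this answer; variable names are illustrative only.
const domainPattern = /^[a-zA-Z]+(?:-?[a-zA-Z\d])+\.[a-zA-Z]{2,6}$/;

const shouldMatch = ["google.com", "yahoo-new.com", "newdomain.travel", "this-is-it.com"];
const shouldFail = ["this--it.com", "http://google.com", "www.yahoo.com", "http://www.blah.com"];

// One boolean per input: true means the input behaved as the lists above say.
const matched = shouldMatch.map((name) => domainPattern.test(name));
const rejected = shouldFail.map((name) => !domainPattern.test(name));

console.log(matched.every(Boolean), rejected.every(Boolean));
```

The key part is `(?:-?[a-zA-Z\d])+`: every dash is optional but must be followed immediately by a letter or digit, so two dashes in a row, or a dash right before the dot, cannot match.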

JavaScript does not support (?i) to turn on case-insensitivity inline, so I fully specified the letters as [a-zA-Z]. Another option is to turn on case-insensitivity with the i flag in the regex call.
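
As a rough sketch of the case-insensitive option, the same expression can be written with the `i` flag, either as a literal or through the `RegExp` constructor. The variable names here are hypothetical.

```javascript
// Same pattern with the `i` flag, so [a-z] also covers upper-case letters.
const domainPatternLiteral = /^[a-z]+(?:-?[a-z\d])+\.[a-z]{2,6}$/i;

// Constructor form; backslashes must be doubled inside the string.
const domainPatternCtor = new RegExp("^[a-z]+(?:-?[a-z\\d])+\\.[a-z]{2,6}$", "i");

console.log(domainPatternLiteral.test("Google.COM"));
console.log(domainPatternCtor.test("Yahoo-New.com"));
```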

Please note that the {2,6} at the end means that we only match TLDs that have 6 characters at the most, to allow your "travel" TLD. Originally, you had {2,3}, which would allow "com" but not "travel". However, there are longer TLDs, and I would suggest either going to something longer such as {2,20} or simply not limiting the TLD size:

^[a-zA-Z]+(?:-?[a-zA-Z\d])+\.[a-zA-Z]+$

Also, originally, you did not allow for digits in the first character. But digits are allowed. So you could revise that to

^[a-zA-Z\d]+(?:-?[a-zA-Z\d])+\.[a-zA-Z]+$
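
For use in code, the last expression could be wrapped in a small helper. This is a minimal sketch: the function name `isSimpleDomain` and the string type check are assumptions of mine, not part of the question.

```javascript
// The last expression from this answer: digits allowed anywhere in the name,
// no consecutive dashes, TLD of any length made of letters.
const simpleDomainPattern = /^[a-zA-Z\d]+(?:-?[a-zA-Z\d])+\.[a-zA-Z]+$/;

// Hypothetical helper: returns true only for strings that match the pattern.
function isSimpleDomain(input) {
  return typeof input === "string" && simpleDomainPattern.test(input);
}

console.log(isSimpleDomain("13-f.com"));     // true
console.log(isSimpleDomain("this--it.com")); // false
```

As written, the pattern needs at least two characters before the dot, so a one-character name such as x.com is rejected. It also does not enforce the 63-character limit on a label. Both may need adjusting depending on what you want to accept.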
zx81