
My friends, I would like to ask for your help with a regex requirement. I need a regex in JS that validates domains according to the client's conventions:

  • at most 63 characters. The 63 characters do not include the protocol identifier (https://) or the domain extension (such as .com or .org on a subdomain)
  • no hyphen at the start or end of the string, but hyphens can be used inside a domain, as in text-example.com
  • no special characters are allowed
  • periods cannot be used when registering a domain name; however, periods can be used in subdomains
  • domains can have numbers
  • client can provide multiple domains separated with semicolons - each domain should be validated separately

In the beginning, I thought it would be OK to have a separate regex for each point, but I think that would be a time-consuming operation. Could someone help me with this topic?

Omri Attiya

1 Answer


My answer to the question "https://stackoverflow.com/questions/6449367/c-sharp-email-address-validation/6459786" might help you: https://stackoverflow.com/a/6459786/467473

It implements an email address validator in C#. You don't care about the "local part" (the bit left of the @ in email addresses) or the @ itself. The bit that's interesting to you is the fqdn production, and that should be fairly straightforward to map to a JavaScript regular expression.

On further consideration, it does sound like what you are describing is a DNS label, a single segment of a DNS name.

If what you want is to validate an RFC-compliant DNS label, then this regular expression ought to do it:

const rxDnsLabel = /^[A-Z]([A-Z0-9-]{0,61}[A-Z0-9])?$/i;

Breaking it down:

  • ^ — anchors the start of the match to start-of-text, followed by
  • [A-Z] — a US-ASCII letter, followed by
  • ( — an optional group, consisting of
    • [A-Z0-9-] — a US-ASCII letter, decimal digit or hyphen (-)
      • {0,61} — repeated 0 to 61 times, followed by
    • [A-Z0-9] a single US-ASCII letter or decimal digit
  • )? — the whole of which is optional
  • $ — anchors the end of the match to end-of-text
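
To see the breakdown above in action, here is a minimal sketch that runs the pattern against a few labels. The helper name isDnsLabel and the sample labels are made up for illustration. Note that the pattern as written requires a leading letter, so a label that starts with a digit is rejected.

```javascript
const rxDnsLabel = /^[A-Z]([A-Z0-9-]{0,61}[A-Z0-9])?$/i;

// Hypothetical helper, not part of any library: true if s is a single valid label.
const isDnsLabel = (s) => rxDnsLabel.test(s);

// Hypothetical sample labels, chosen only to exercise each rule.
const samples = [
  "example",       // plain letters
  "text-example",  // interior hyphen
  "a",             // single letter (the optional group is skipped)
  "-example",      // leading hyphen
  "example-",      // trailing hyphen
  "exa_mple",      // character outside letters, digits and hyphen
  "a".repeat(63),  // exactly 63 characters
  "a".repeat(64),  // 64 characters, one over the limit
];

for (const label of samples) {
  console.log(label.length, isDnsLabel(label));
}
```

The 63-character cap falls out of the quantifiers: one leading letter, up to 61 middle characters, and one trailing letter or digit.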

If you need to match a DNS name consisting of multiple labels, it's not that much more complicated. Just need to allow the optional extra segments:

const rxDnsName = /^([A-Z]([A-Z0-9-]{0,61}[A-Z0-9])?)([.][A-Z]([A-Z0-9-]{0,61}[A-Z0-9])?)*$/i;

The only difference here is that the initial label is allowed to be followed by zero or more additional labels, each separated with a period/full stop (.).
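
Since the question mentions several domains separated by semicolons, one possible way to apply the multi-label pattern is to split the input and test each entry on its own. This is only a sketch: the function name validateDomainList is hypothetical, the pattern is repeated under the name rxDnsName so the snippet is self-contained, and skipping empty entries (for example after a trailing semicolon) is an assumption about the desired behavior.

```javascript
// Multi-label pattern from the answer, repeated here so the snippet runs on its own.
const rxDnsName = /^([A-Z]([A-Z0-9-]{0,61}[A-Z0-9])?)([.][A-Z]([A-Z0-9-]{0,61}[A-Z0-9])?)*$/i;

// Hypothetical helper: split on semicolons, trim whitespace, validate each entry separately.
function validateDomainList(input) {
  return input
    .split(";")
    .map((entry) => entry.trim())
    .filter((entry) => entry.length > 0) // assumption: ignore empty entries
    .map((domain) => ({ domain, valid: rxDnsName.test(domain) }));
}

// Made-up input: two well-formed names and one with a leading hyphen.
const results = validateDomainList("text-example.com; sub.example.org;-bad.com");
console.log(results);
```

Returning a per-domain result, rather than a single boolean, makes it possible to tell the client which entry failed.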

Edited to note: If you require support for internationalized (punycode) domain names, I can't guarantee that this will match them, mostly because I have never needed to do so and so haven't tested this against them. For details on internationalized (punycode) domain names, see the relevant RFCs.

Nicholas Carey