2

What should a regex for validating a domain name look like if it has to fulfill the following criteria?

  1. Each label is at least 1 and at most 63 characters long.
  2. It contains only numbers, letters, and '-', but
  3. it should not start or end with '-'.
  4. The whole domain name is at least 1 and at most 255 characters long.

For example, some valid combinations:

a
a.com
aa-bb.b

I created this: ^(([a-z0-9]){1,63}\.?){1,255}$

But currently it does not validate the '-' part as required (it is missing).

Is there any way to do this?

Please correct me if I am wrong.

Nikhil Rupanawar

8 Answers

3

I found a solution, with the restriction that the name must end with '.':

"^(((([A-Za-z0-9]+){1,63}\.)|(([A-Za-z0-9]+(\-)+[A-Za-z0-9]+){1,63}\.))+){1,255}$"
Nikhil Rupanawar
  • It doesn't have to end with a period. Mind explaining? A period normally comes in the last 2-4 characters of the domain, before the domain extension. – User Aug 18 '14 at 16:59
  • Yes, the period at the end is optional. The regex needs to be improved accordingly. – Nikhil Rupanawar Aug 19 '14 at 10:16
  • 1
    I decided to go with this: http://stackoverflow.com/questions/2532053/validate-a-hostname-string – User Aug 19 '14 at 20:22
2

This expression should meet all the requirements: ^(?=.{1,255}$)(?!-)[A-Za-z0-9\-]{1,63}(\.[A-Za-z0-9\-]{1,63})*\.?(?<!-)$

  • uses lookahead for total character length
  • domain can optionally end with a .
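Note that the expression above rejects a '-' only at the very start and very end of the whole string, so a name such as a-.com still matches. If the hyphen rule is meant per label, a minimal Python sketch could apply the lookarounds to every label instead. The names `LABEL`, `DOMAIN_RE`, and `is_valid_domain` are made up for illustration, and `re.fullmatch` with `\Z` is used here so that a trailing newline is not accepted.

```python
import re

# One label: 1-63 characters of letters, digits, or '-',
# not starting with '-' (lookahead) and not ending with '-' (lookbehind).
LABEL = r"(?!-)[A-Za-z0-9-]{1,63}(?<!-)"

# Whole name: 1-255 characters in total, labels separated by single dots,
# with an optional trailing dot.
DOMAIN_RE = re.compile(
    r"(?=.{1,255}\Z)" + LABEL + r"(?:\." + LABEL + r")*\.?"
)


def is_valid_domain(name: str) -> bool:
    """Return True if name satisfies the criteria listed in the question."""
    return DOMAIN_RE.fullmatch(name) is not None


if __name__ == "__main__":
    for candidate in ["a", "a.com", "aa-bb.b", "-a.com", "a-.com", "a..com"]:
        print(candidate, is_valid_domain(candidate))
```

This only checks the syntax rules from the question; it says nothing about whether the name exists or uses a real top-level domain.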
Steve Goossens
2

You can use a library, e.g. validators. Or you can copy their code:

Installation

pip install validators

Usage

import validators
if validators.domain('example.com'):
    print('this domain is valid')

In the unlikely case you find a mistake, you can fix and report the error.

toto_tico
1

Maybe this:

^(([a-zA-Z0-9\-]{1,63}\.?)+(\-[a-zA-Z0-9]+)){1,255}$
adam
0

Don't use regex for parsing domain names, use urllib.parse.

If you need to find valid domain names in HTML then split the text of the page with a regex [ <>] and then parse each resulting string with urllib.parse.
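A minimal sketch of what this could look like, using `urllib.parse.urlsplit` from the standard library. The helper name `extract_host` is made up for illustration. Keep in mind that this only extracts the host part of a URL and does not check it against the label rules from the question.

```python
from urllib.parse import urlsplit


def extract_host(url: str):
    """Return the host part of url, or None if the parser finds no network location."""
    # .hostname is the netloc without port and credentials, lowercased.
    return urlsplit(url).hostname


if __name__ == "__main__":
    samples = [
        "http://example.com/path",   # ordinary URL
        "example.com",               # no scheme, so no host is recognized
        "http://localhost:8000/",    # parses fine, although it is not a domain name
        "http://-bad-.com",          # parses fine, although the label breaks the hyphen rule
    ]
    for sample in samples:
        print(sample, "->", extract_host(sample))
```

Because the parser accepts hosts such as localhost or -bad-.com, the extracted string would still need a separate syntax check if strict validation is required.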

piokuc
  • 4
    urllib.parse will not ensure a valid domain name. The `netloc` could contain "localhost" or a false positive from a malformed URL (e.g. "http://example", "http://malformed"). – Jonathan Vanasco Jul 11 '14 at 23:08
0

Use the | operator in your regex, followed by the '-'. Make sure you escape the literal '-' with \

user2878309
0

Instead of using a regex, take a look at urlparse:

https://docs.python.org/3/library/urllib.parse.html

It's fairly simple to learn, and a lot better and more comfortable to use.

Dropout
-1

Try this:

^(([a-z0-9]\-*[a-z0-9]*){1,63}\.?){1,255}$
Arash Hatami
juankysmith