I am trying to write a regular expression that will match domains in a sentence.

I found this post, which was very useful and helped me create the following expression to match domains. Unfortunately, it also matches IP addresses, which I do not want:

((?!-))(xn--)?[a-z0-9][a-z0-9-_]{0,61}[a-z0-9]{0,1}\.(xn--)?([a-z0-9\._-]{1,61}|[a-z0-9-]{1,30})

I want to update my expression so that the following can still be found, whether in a sentence, between brackets, etc.:

www.example.com
subdomain.example.com
subdomain.example.co.uk

But not:

192.168.0.0
127.0.0.1

Is there a way to do this?

Twiggy

2 Answers

We could prepend a simple negative lookahead meant to exclude tokens made up only of digits and dots: (?![\d.]+)

(?![\d.]+)((?!-))(xn--)?[a-z0-9][a-z0-9-_]{0,61}[a-z0-9]{0,1}\.(xn--)?([a-z0-9\._-]{1,61}|[a-z0-9-]{1,30})
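
To see what the lookahead changes, here is a minimal Python sketch that runs both patterns against whole candidate strings. The names TAIL, ORIGINAL, WITH_LOOKAHEAD and check are illustrative and not from the thread, and using fullmatch on an isolated token is an assumption made to keep the comparison simple. Because (?![\d.]+) fails as soon as the next character is a digit or a dot, it also rejects names that begin with a digit.

```python
import re

# The expression from the question, copied unchanged.
TAIL = r"((?!-))(xn--)?[a-z0-9][a-z0-9-_]{0,61}[a-z0-9]{0,1}\.(xn--)?([a-z0-9\._-]{1,61}|[a-z0-9-]{1,30})"

ORIGINAL = re.compile(TAIL)
# Same expression with the negative lookahead prepended.
WITH_LOOKAHEAD = re.compile(r"(?![\d.]+)" + TAIL)


def check(candidate):
    """Return (matches_original, matches_with_lookahead) for one whole token."""
    return (
        ORIGINAL.fullmatch(candidate) is not None,
        WITH_LOOKAHEAD.fullmatch(candidate) is not None,
    )


if __name__ == "__main__":
    for token in ["www.example.com", "192.168.0.0", "123reg.com"]:
        print(token, check(token))
```
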

wp78de

The answer from @wp78de is correct; however, it will not detect domains that start with digits, e.g. 123reg.com.

So remove the first group in the regex, like this:

((?!-))(xn--)?[a-z0-9][a-z0-9-_]{0,61}[a-z0-9]{0,1}\.(xn--)?([a-z0-9\._-]{1,61}|[a-z0-9-]{1,30})
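
Note that with the lookahead removed entirely, the expression is the same as the one in the question, so it accepts 192.168.0.0 again. One possible middle ground, sketched below in Python, is a lookahead that rejects a token only when it consists of digits and dots all the way to its end. This is an assumption-laden sketch rather than a tested production pattern: DOMAIN_RE and find_domains are hypothetical names, the leading lookbehind (which stops matches from starting mid-token) is an addition not present in either answer, and trailing sentence punctuation is stripped afterwards.

```python
import re

# Hypothetical refinement, not taken from either answer.
DOMAIN_RE = re.compile(
    r"(?<![a-z0-9._-])"             # assumed: only start at the beginning of a token
    r"(?![\d.]+(?![a-z0-9._-]))"    # reject tokens that are digits and dots up to their end
    r"(xn--)?[a-z0-9][a-z0-9_-]{0,61}[a-z0-9]{0,1}"
    r"\.(xn--)?([a-z0-9._-]{1,61}|[a-z0-9-]{1,30})"
)


def find_domains(text):
    """Return domain-like tokens found in text, without trailing dots."""
    # The final character class allows '.', so a sentence-ending period
    # can be captured; strip it from each match.
    return [m.group(0).rstrip(".") for m in DOMAIN_RE.finditer(text)]


if __name__ == "__main__":
    sample = (
        "see www.example.com, subdomain.example.co.uk and (123reg.com) "
        "but not 192.168.0.0 or 127.0.0.1"
    )
    print(find_domains(sample))
```

The pattern is lowercase-only, as in the thread; passing re.IGNORECASE to re.compile would be one way to accept mixed-case input.
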
Sahil