Is it possible to validate a string with a single regex so that it matches a valid IPv4 address, a valid IPv6 address, or a hostname with no special characters other than the dot (.)?

I have separate regexes for IPv4 and IPv6, but how do I combine them to do the work described above?

Thanks in advance,

Dan

dan
  • Just do: `(regexIPv4)|(regexIPv6)` – Bart Kiers Apr 17 '12 at 12:42
  • That's gotta be a *huge* regex, at least if it matches every valid hostname and rejects every invalid one. – Niklas B. Apr 17 '12 at 12:42
  • Thanks Bart, but what about hostnames? – dan Apr 17 '12 at 12:44
  • You might want to check out http://www.java2s.com/Code/Java/Regular-Expressions/RegexforIPv6Address.htm in conjunction with what @BartKiers said. Also, [this question](http://stackoverflow.com/questions/106179/regular-expression-to-match-hostname-or-ip-address). – Gareth Latty Apr 17 '12 at 12:44
  • possible duplicate of [ip address validation in python using regex](http://stackoverflow.com/questions/10086572/ip-address-validation-in-python-using-regex) – Steven Rumbalski Apr 17 '12 at 12:45
  • @Niklas I want to keep the hostname one simple, with only regular characters. I think it's more challenging to combine the separate regexes without them breaking each other, e.g. I don't want the hostname regex to break the IPv6 part – dan Apr 17 '12 at 12:46
  • @StevenRumbalski That isn't a duplicate. This asks for IPv4, 6 and hostnames. – Gareth Latty Apr 17 '12 at 12:47
  • @dan: Just use alternation: `|` if you don't care about proper matching. If you do care, don't use regex for the job. – Niklas B. Apr 17 '12 at 12:47
  • @Lattyware: Understood. However, between http://stackoverflow.com/questions/106179/regular-expression-to-match-hostname-or-ip-address and http://stackoverflow.com/questions/319279/how-to-validate-ip-address-in-python the question need not have been asked. – Steven Rumbalski Apr 17 '12 at 12:55

3 Answers

You could use a single regex, but it's going to be ugly as hell. Either:

  • Create separate regexes as strings, then combine them. Far more legible. Or,
  • Test each regex separately. Also much clearer.

Perl-ish example:

if ( $foo =~ /$ipv4_re/ or $foo =~ /$ipv6_re/ or $foo =~ /$hostname_re/ ) {
    ...
}
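For the first option (separate regex strings, then combined), a rough Python sketch could look like the following. The three sub-patterns and the names `IPV4_RE`, `IPV6_RE`, `HOSTNAME_RE` and `classify` are placeholders made up for illustration; they are deliberately loose and not complete validators. Wrapping each piece in its own named group keeps the alternation from leaking between the parts, and `lastgroup` reports which one matched.

```python
import re

# Placeholder sub-patterns (assumptions for illustration, intentionally loose).
IPV4_RE = r"(?:\d{1,3}\.){3}\d{1,3}"
IPV6_RE = r"[0-9A-Fa-f:]*:[0-9A-Fa-f:]*:[0-9A-Fa-f:.]*"
HOSTNAME_RE = r"[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)*"

# Each piece gets its own named group, so the "|" only separates whole patterns.
COMBINED = re.compile(
    "(?P<ipv4>{})|(?P<ipv6>{})|(?P<hostname>{})".format(
        IPV4_RE, IPV6_RE, HOSTNAME_RE
    )
)


def classify(text):
    """Return 'ipv4', 'ipv6', 'hostname', or None for the whole string."""
    match = COMBINED.fullmatch(text)
    return match.lastgroup if match else None
```

The order of the alternatives matters here: the hostname placeholder would also accept `192.168.0.1`, so the IPv4 group has to come first.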

Having said that, there are probably libraries in Python that will validate these things for you, and personally I'd rather rely on them.
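As one possible illustration of the library route: Python 3.3 and later ship an `ipaddress` module in the standard library, which can handle the two IP cases and leave only the hostname to a regex. This is a minimal sketch, not a definitive implementation; `HOSTNAME_RE` and `kind_of_host` are hypothetical names, and the hostname rule (letters, digits, hyphens, dots) is an assumption based on the question's "no special characters but dot".

```python
import ipaddress
import re

# Assumed "simple hostname" rule: dot-separated labels of letters, digits, hyphens.
HOSTNAME_RE = re.compile(r"[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)*")


def kind_of_host(text):
    """Return 'ipv4', 'ipv6', 'hostname', or None."""
    try:
        # ip_address() accepts both IPv4 and IPv6 and raises ValueError otherwise.
        return "ipv{}".format(ipaddress.ip_address(text).version)
    except ValueError:
        pass
    if not HOSTNAME_RE.fullmatch(text):
        return None
    # All-digit dotted strings that failed above are malformed IPv4, not hostnames.
    if text.replace(".", "").isdigit():
        return None
    return "hostname"
```

The all-digit check is a design choice: without it, something like `999.1.1.1` would be rejected as an IP address but then accepted as a hostname.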

Rory Hunter

Try:

(?:\d{1,3}\.){3}\d{1,3}|                    (?# IPv4 address)
[:a-fA-F0-9]*:[:a-fA-F0-9]*:[:a-fA-F0-9.]*| (?# IPv6 address)
[-a-z0-9A-Z]+\.[-a-z0-9A-Z]*                (?# domain name)

Of course, you're free to substitute the individual expressions with more complex ones.
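Because the pattern above is spread over several lines with comments, it needs verbose mode to compile as written. Here is a sketch of how that might look in Python, with the same three alternatives and `#` comments under `re.VERBOSE`; the names `LOOSE` and `looks_like_host` are made up for this example. Using `fullmatch` makes the whole string count, not just a prefix.

```python
import re

# Same three alternatives as above; whitespace and "#" comments are ignored
# because of re.VERBOSE.
LOOSE = re.compile(
    r"""
    (?:\d{1,3}\.){3}\d{1,3}                    # IPv4 address
  | [:a-fA-F0-9]*:[:a-fA-F0-9]*:[:a-fA-F0-9.]* # IPv6 address
  | [-a-z0-9A-Z]+\.[-a-z0-9A-Z]*               # domain name
    """,
    re.VERBOSE,
)


def looks_like_host(text):
    """True if the entire string matches one of the three loose patterns."""
    return LOOSE.fullmatch(text) is not None
```

Keep in mind how loose these expressions are: `999.999.999.999` passes as IPv4, while the domain-name part allows exactly one dot, so `www.example.com` and `localhost` are both rejected when the whole string has to match.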

phihag
(?P<ihost>\\[(?:(?:[0-9A-F]{1,4}:){6}(?:[0-9A-F]{1,4}:[0-9A-F]{1,4}|(?:(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)))|::(?:[0-9A-F]{1,4}:){5}(?:[0-9A-F]{1,4}:[0-9A-F]{1,4}|(?:(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)))|[0-9A-F]{1,4}?::(?:[0-9A-F]{1,4}:){4}(?:[0-9A-F]{1,4}:[0-9A-F]{1,4}|(?:(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)))|(?:(?:[0-9A-F]{1,4}:)?[0-9A-F]{1,4})?::(?:[0-9A-F]{1,4}:){3}(?:[0-9A-F]{1,4}:[0-9A-F]{1,4}|(?:(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)))|(?:(?:[0-9A-F]{1,4}:){,2}[0-9A-F]{1,4})?::(?:[0-9A-F]{1,4}:){2}(?:[0-9A-F]{1,4}:[0-9A-F]{1,4}|(?:(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)))|(?:(?:[0-9A-F]{1,4}:){,3}[0-9A-F]{1,4})?::(?:[0-9A-F]{1,4}:)(?:[0-9A-F]{1,4}:[0-9A-F]{1,4}|(?:(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)))|(?:(?:[0-9A-F]{1,4}:){,4}[0-9A-F]{1,4})?::(?:[0-9A-F]{1,4}:[0-9A-F]{1,4}|(?:(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)))|(?:(?:[0-9A-F]{1,4}:){,5}[0-9A-F]{1,4})?::[0-9A-F]{1,4}|(?:(?:[0-9A-F]{1,4}:){,6}[0-9A-F]{1,4})?::|v[0-9A-F]+\\.(?:[a-zA-Z0-9_.~-]|[!$&'()*+,;=]|:)+)\\]|(?:(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?))|(?:(?:[a-zA-Z0-9._~-]|[\xa0-\ud7ff\uf900-\ufdcf\ufdf0-\uffef\U00010000-\U0001fffd\U00020000-\U0002fffd\U00030000-\U0003fffd\U00040000-\U0004fffd\U00050000-\U0005fffd\U00060000-\U0006fffd\U00070000-\U0007fffd\U00080000-\U0008fffd\U00090000-\U0009fffd\U000a0000-\U000afffd\U000b0000-\U000bfffd\U000c0000-\U000cfffd\U000d0000-\U000dfffd\U000e1000-\U000efffd])|%[0-9A-F][0-9A-F]|[!$&'()*+,;=])*)

(from rfc3987; this is the `ihost` rule, in which an IPv6 address must be enclosed in square brackets)

Daniel Gerber