I am trying to write code that uses a regular expression to check whether an IPv4 address is valid, and I can't work out what the problem is.

import re
pattern=re.compile('([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5]\.){3}([0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])')
ip=['17.255.16.45','255.255.255.255','0.0.0.0','0.14.255.14','2555.2564.0.3','0.3.255']

for i in range (len(ip)):
    if re.search(pattern,ip[i]):
        print(ip[i],'ok')
    else:
        print(ip[i],"nope")
  • Check out this regex tester: https://regex101.com/ – glotchimo Nov 28 '19 at 18:53
  • This is a brave effort, but I think you may be putting too much load onto regular expressions here. If you want to be strict about the format, I would use regex just to make sure numbers do not start with a zero (except when it's only 0) and then check values are fine in code later (the question is still valid on its own though). – jdehesa Nov 28 '19 at 19:02
  • Regexps are cool and nice to learn, but this is reinventing the wheel, and an "XY" problem: you need to validate the IP strings, not to "fix the regexp" – jsbueno Nov 28 '19 at 19:33

3 Answers

I think the problem is that your \. is part of the last alternative only (25[0-5]\.), when it should follow whichever option matched. You can fix it by wrapping the alternatives in their own pair of parentheses. It is also advisable to use raw strings for regular expressions to avoid issues with escaping.

import re
pattern=re.compile(r'(([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])\.){3}([0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])')
ip=['17.255.16.45','255.255.255.255','0.0.0.0','0.14.255.14','2555.2564.0.3','0.3.255']

for i in range (len(ip)):
    if re.search(pattern,ip[i]):
        print(ip[i],'ok')
    else:
        print(ip[i],"nope")

Output:

17.255.16.45 ok
255.255.255.255 ok
0.0.0.0 ok
0.14.255.14 ok
2555.2564.0.3 nope
0.3.255 nope
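
To make the precedence issue concrete, here is a small sketch comparing the two patterns on two of the test strings. The names OCTET, ungrouped and grouped are introduced here for illustration only; they are not from the question.

```python
import re

# Alternatives for a single octet, as in the question (name is illustrative)
OCTET = r'[0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5]'

# Without an inner group, "\." belongs only to the last alternative, 25[0-5]
ungrouped = re.compile(r'(' + OCTET + r'\.){3}(' + OCTET + r')')
# With an inner group, "\." must follow whichever alternative matched
grouped = re.compile(r'((' + OCTET + r')\.){3}(' + OCTET + r')')

for candidate in ['0.0.0.0', '2555.2564.0.3']:
    print(candidate,
          bool(ungrouped.search(candidate)),
          bool(grouped.search(candidate)))
# 0.0.0.0 False True
# 2555.2564.0.3 True False
```

The ungrouped pattern accepts 2555.2564.0.3 because four bare digits in a row satisfy it, and rejects 0.0.0.0 because a dot is only allowed after 25[0-5].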
jdehesa

I don't even know what was going wrong, but as soon as I'd refactored it into this it seemed to work:

import re

ip_num_pat = r"[0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5]"

pattern = re.compile(r'(?:({0})\.){{3}}({0})'.format(ip_num_pat))
ip_addrs = [
    '17.255.16.45', '255.255.255.255', '0.0.0.0', '0.14.255.14',
    '2555.2564.0.3', '0.3.255']

for ip in ip_addrs:
    if pattern.match(ip):
        print(ip, 'ok')
    else:
        print(ip, 'nope')

In general, it is easier to keep track of patterns like this by breaking them up into smaller parts. I suspect the last block of the original pattern was where it went wrong.

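Also, note that I have changed your code to use .match, not .search. This is crucial, because .search looks for a match anywhere in the string, so it would accept things like 01.2.3.4 (the valid address 1.2.3.4 starts at the second character).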
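
One caveat worth spelling out: .match only anchors the start of the string, so trailing characters can still slip through. A possible sketch, assuming you want the whole string to be the address, uses re.fullmatch instead. The names octet and is_ipv4 are hypothetical helpers, not part of the code above.

```python
import re

# One octet, 0-255, without leading zeros (illustrative name)
octet = r"(?:25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9][0-9]|[0-9])"
pattern = re.compile(r"(?:{0}\.){{3}}{0}".format(octet))

def is_ipv4(text):
    """Return True only if the entire string is a dotted-quad address."""
    return pattern.fullmatch(text) is not None

for candidate in ['17.255.16.45', '01.2.3.4', '1.2.3.456']:
    print(candidate, is_ipv4(candidate))

# .match alone would still accept the prefix "1.2.3.45" of this string:
print(pattern.match('1.2.3.456') is not None)
```
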

But, as others have said, a much easier approach would look something like this:

ip_addrs = [
    '17.255.16.45', '255.255.255.255', '0.0.0.0', '0.14.255.14',
    '2555.2564.0.3', '0.3.255', '03.1.2.3']

def is_ip(addr):
    try:
        component_strings = addr.split(".")
        if any(i.startswith("0") and i != "0" for i in component_strings):
            raise ValueError("Components cannot start with 0")
        components = [int(i) for i in component_strings]
        if len(components) != 4:
            raise ValueError("Need 4 parts for an IPv4 address")
        if any(not 0 <= i < 256 for i in components):
            raise ValueError("Components should be in range 0, ..., 255")
        return True
    except ValueError:
        return False

for ip in ip_addrs:
    if is_ip(ip):
        print(ip, 'ok')
    else:
        print(ip, 'nope')
Izaak van Dongen

While working on the regexp may be interesting for educational purposes, if the intention is to put this into real code, you'd be better off using Python's ipaddress module instead of reinventing the wheel.

It has been part of the Python standard library since Python 3.3, and all you need to do to use it is:

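import ipaddress

# Sample addresses from the question
ip = ['17.255.16.45', '255.255.255.255', '0.0.0.0', '0.14.255.14', '2555.2564.0.3', '0.3.255']

# The "for ... in range(len(...))" pattern is not really needed in Python;
# the native for loop can walk your sequence elements directly:
for address in ip:
    try:
        ipaddress.ip_address(address)
    except ValueError:
        print(address, "Nope")
    else:
        print(address, "ok")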

The obvious advantages, beyond avoiding subtle bugs in a regular expression, are that it can also parse IPv6 addresses (if those are not wanted, the protocol can easily be checked via the .version attribute), and that the ip_address call above returns an object that gives, for free, a host of information on the IP, including but not limited to:

 'is_link_local',
 'is_loopback',
 'is_multicast',
 'is_private',
 'is_reserved',
 'is_unspecified',
 'max_prefixlen',
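
As a rough sketch of how the .version check and these attributes might be used together (the helper name is_ipv4_address is made up for this example):

```python
import ipaddress

def is_ipv4_address(text):
    """Return True if text parses as an IP address and is IPv4."""
    try:
        return ipaddress.ip_address(text).version == 4
    except ValueError:
        # Raised for anything that is neither valid IPv4 nor valid IPv6
        return False

for candidate in ['17.255.16.45', '0.3.255', '::1']:
    print(candidate, is_ipv4_address(candidate))

# The parsed object exposes the attributes listed above
addr = ipaddress.ip_address('127.0.0.1')
print(addr.version, addr.is_loopback, addr.is_multicast)
```
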
jsbueno