I would like to validate IP addresses from a list that may contain incorrectly formatted addresses or other garbage. If a field does not contain a properly formatted address, it should simply be skipped.

Per "How to validate IP address in Python?" it seems that there are two methods to accomplish this: a regex or socket.inet_aton().

Below is an attempt to use socket.inet_aton() to parse a CSV file and check whether each field is an IPv4 address. Currently it prints the garbage or improperly formatted IP addresses. Any tips on printing the inverse, i.e. only the proper IP addresses?

Update

Numeric fields that are not in discrete octet notation are still printed, e.g. 12345 prints. How could non-octet notation be filtered out?

import socket

# import_text is the asker's own CSV helper (not shown here)
for data in import_text('data.csv', ','):
    try:
        socket.inet_aton(data)
    except socket.error:
        continue
    print(data)
Astron
  • What exactly is wrong that you are trying to get right? – kristaps Mar 30 '12 at 18:59
  • I would like the `print` statement to only return valid IP addresses. Currently, it only returns the invalid IP addresses or garbage. – Astron Mar 30 '12 at 19:05
  • Do you want to print it, or return it? Those are two totally different things. – kindall Mar 30 '12 at 19:24
  • Print. Question updated from the feedback below, though I now realize that `socket.inet_aton()` will match 111, 111.111, or 1111.111.111 111.111.111.111. Need to make sure it's a valid IP, not shorthand. – Astron Mar 30 '12 at 19:26
  • `12345` is a perfectly valid IP address, it's merely written as a single 32-bit integer rather than as discrete octets. – kindall Mar 30 '12 at 20:24
  • Then I need address in discrete octet form. – Astron Mar 30 '12 at 20:27

4 Answers

The print statement is in the "except" block, so it is only invoked when there is an error parsing the passed string as IP address.

Change the contents of the for loop to this:

try:
    socket.inet_aton(data)
except socket.error:
    continue

print (data)
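
For anyone who wants to try the loop without the asker's helper, here is a minimal self-contained sketch. `iter_fields` and `SAMPLE_CSV` are hypothetical stand-ins (they are not from the question); the sketch assumes `import_text` simply yields every field of every row. Note that this still accepts shorthand forms, since it relies on `socket.inet_aton()`.

```python
import csv
import io
import socket

# Hypothetical sample data standing in for data.csv.
SAMPLE_CSV = "192.168.1.1,not-an-ip,10.0.0.256\n8.8.8.8,hello world,172.16.0.1\n"


def iter_fields(text, delimiter=","):
    """Yield every field of every row (assumed behaviour of import_text)."""
    for row in csv.reader(io.StringIO(text), delimiter=delimiter):
        for field in row:
            yield field.strip()


accepted = []
for data in iter_fields(SAMPLE_CSV):
    try:
        socket.inet_aton(data)
    except OSError:  # socket.error is an alias of OSError in Python 3
        continue     # skip garbage
    accepted.append(data)
    print(data)
```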
kristaps

The else clause of a try/except block is executed if no exception occurred.

try:
    socket.inet_aton(data)
except socket.error:
    pass
else:
    print(data)

But since you require it expressed as a discrete octet, your best approach is not regex, not socket.inet_aton, but a simple validation function:

def valid_ip(addr):
    try:
        addr = addr.strip().split(".")
    except AttributeError:
        return False
    try:
        return len(addr) == 4 and all(octet.isdigit() and int(octet) < 256
                                      for octet in addr)
    except ValueError:
        return False

Then it's just:

if valid_ip(data):
    print(data)
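
To see what that function lets through, here is a small sketch that repeats `valid_ip` so it runs on its own and applies it to a few made-up sample values (the `candidates` list is purely illustrative):

```python
def valid_ip(addr):
    """Return True only for four dot-separated decimal octets in 0..255."""
    try:
        parts = addr.strip().split(".")
    except AttributeError:  # not a string (e.g. None)
        return False
    try:
        return len(parts) == 4 and all(
            octet.isdigit() and int(octet) < 256 for octet in parts
        )
    except ValueError:  # digit-like characters that int() cannot parse
        return False


# Illustrative inputs: proper dotted quads mixed with shorthand and garbage.
candidates = [
    "192.168.1.1", "0.0.0.0", "255.255.255.255",
    "12345", "111.111", "256.1.1.1", "1.2.3.4.5", "1.2.3.", "a.b.c.d", "", None,
]

dotted_quads = [c for c in candidates if valid_ip(c)]
print(dotted_quads)
```

Only the first three entries survive; shorthand such as `12345` and `111.111` is rejected because it does not split into exactly four parts.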
kindall

You should print right after the inet_aton() call:

for data in import_text('data.csv', ','):
    try:
        socket.inet_aton(data)
        # data is ok, otherwise a socket.error would have been raised
        print(data)
    except socket.error:
        continue # if you don't care about "garbage"

Whenever inet_aton is fed anything that is not a valid IP, socket.error is raised and control goes to the except block.

michele b

According to the manual, inet_aton accepts strings with fewer than three dots:

inet_aton() also accepts strings with less than three dots; see the Unix manual page inet(3) for details.

That might be part of what's happening to you here.
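
A quick way to see this is to round-trip a few shorthand strings through `socket.inet_aton()` and `socket.inet_ntoa()`. This is only a sketch; the exact parsing is delegated to the platform's C library, so results could differ on unusual systems. On a typical Linux system, `12345` should come back as `0.0.48.57`.

```python
import socket

# Shorthand inputs that inet_aton() accepts even though they are not
# written as four discrete octets.
samples = ["12345", "111", "111.111", "1.2.3", "111.111.111.111"]

results = {}
for text in samples:
    packed = socket.inet_aton(text)            # 4 packed bytes
    results[text] = socket.inet_ntoa(packed)   # back to dotted-quad form
    print(text, "->", results[text])
```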

AKX