Here is a solution that avoids regular expressions. It can give more accurate diagnostics, if you care about them, and it validates the IP addresses more precisely than what you had. It only handles a whole line at a time, though, which may not be what you want.
You want to match strings like this (with variable whitespace in most places):

    id XXX.XXX.XXX.XXX, data XXX.XXX.XXX.XXX, Type Transit XX

    def extract_ip_addresses(line):
        '''
        Extract the 'id' and 'data' IP addresses from lines of the form::

            ' id X.X.X.X, data X.X.X.X, Type Transit X'

        The number following Type Transit must be a number less than 100 but is
        not returned. Whitespace between tokens is flexible, but the line must
        start with a space.
        '''
        try:
            # Unpacking raises ValueError if any comma-separated part has the
            # wrong number of whitespace-separated tokens.
            (id_, id_ip), (data_, data_ip), (type_, transit_, type_transit) = [
                s.split() for s in line.split(',')]
            if (not line.startswith(' ') or id_ != 'id' or data_ != 'data'
                    or type_ != 'Type' or transit_ != 'Transit'):
                raise ValueError()
        except ValueError:
            raise ValueError("String in wrong format")

        if len(type_transit) > 2 or not type_transit.isdigit():
            raise ValueError("Type Transit is not a one- or two-digit number.")

        # Each address must be four dot-separated decimal octets in 0..255.
        octets = id_ip.split('.')
        if len(octets) != 4 or not all(c.isdigit() and 0 <= int(c) < 256 for c in octets):
            raise ValueError("Invalid IP address for 'id'.")

        octets = data_ip.split('.')
        if len(octets) != 4 or not all(c.isdigit() and 0 <= int(c) < 256 for c in octets):
            raise ValueError("Invalid IP address for 'data'.")

        return id_ip, data_ip

Sample usage:

    ip, data = extract_ip_addresses(' id 123.45.67.89, data 98.76.54.210, Type Transit 53')
    assert ip == '123.45.67.89'
    assert data == '98.76.54.210'

    try:
        # Note the leading space: without it the function reports
        # "String in wrong format" before it ever looks at the addresses.
        extract_ip_addresses(' id 1234.5.67.89, data 98.76.54.210, Type Transit 12')
    except ValueError as e:  # Invalid IP address for 'id'.
        print('Failed as expected, %s' % e)

You could also return None instead of raising a ValueError, depending on how you want to use it. Then you would check whether extract_ip_addresses(line) is None instead of wrapping the call in try/except.
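As a rough illustration of that None-returning alternative, here is a condensed, self-contained sketch. The names `is_ipv4` and `extract_ip_addresses_or_none` are made up for this example, and it assumes the same rules as the function above (leading space, the four fixed labels, a one- or two-digit number). It gives up the specific error messages in exchange for a simpler call site.

```python
DIGITS = '0123456789'


def is_ipv4(text):
    """Return True if text is four dot-separated decimal octets in 0..255."""
    parts = text.split('.')
    if len(parts) != 4:
        return False
    for part in parts:
        # Reject empty parts and anything that is not a plain ASCII digit.
        if not part or any(ch not in DIGITS for ch in part):
            return False
        if int(part) > 255:
            return False
    return True


def extract_ip_addresses_or_none(line):
    """Hypothetical variant: return (id, data), or None for a malformed line."""
    # Mirrors the leading-space requirement of the function above.
    if not line.startswith(' '):
        return None
    fields = [s.split() for s in line.split(',')]
    if [len(f) for f in fields] != [2, 2, 3]:
        return None
    (id_label, id_ip), (data_label, data_ip), (type_label, transit_label, number) = fields
    if (id_label, data_label, type_label, transit_label) != ('id', 'data', 'Type', 'Transit'):
        return None
    if len(number) > 2 or any(ch not in DIGITS for ch in number):
        return None
    if not (is_ipv4(id_ip) and is_ipv4(data_ip)):
        return None
    return id_ip, data_ip


lines = [
    ' id 123.45.67.89, data 98.76.54.210, Type Transit 53',
    ' id 1234.5.67.89, data 98.76.54.210, Type Transit 12',
]
for line in lines:
    result = extract_ip_addresses_or_none(line)
    if result is None:
        print('skipped: %r' % line)
    else:
        print('id=%s data=%s' % result)
```

The trade-off is that the caller can no longer tell why a line was rejected, so this form suits bulk filtering better than debugging a single bad line.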