Well, if you're able to use the newer regex module in Python, you can define subpatterns and use the following approach:
- Define subpatterns for the IP address in the beginning
- ... as well as the Incoming and Outgoing Interface
- Parse the Interfaces separately
- See a demo on regex101.com.
Define subpatterns
Define subpatterns for the Incoming and Outgoing Interface strings, the IP address and the end.
(?(DEFINE)
(?<ips>[^()]+)
(?<incoming>Incoming\ Interface \ List)
(?<outgoing>Outgoing\ Interface \ List)
(?<end>^$|\Z)
)
Put the regex together
Anchor the IP part to the beginning of the line and use a tempered greedy token with negative lookaheads for the incoming/outgoing part.
^\((?P<ip>(?&ips))\)
(?:(?!(?&incoming))[\s\S]+?)
(?&incoming)[\r\n]
(?P<in>(?!(?&outgoing))[\s\S]+?) # tempered greedy token
(?&outgoing)[\r\n]
(?P<out>(?!^$)[\s\S]+?)
(?&end)
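The pattern above checks each lookahead once and then relies on the lazy quantifier to stop at the next header. For comparison, here is a minimal sketch of the stricter form of the tempered greedy token, where the lookahead is re-checked before every consumed character. It uses only the standard re module; the toy input and the START/STOP markers are made up for illustration.

```python
import re

# Hypothetical toy input: two blocks delimited by START ... STOP
text = "START a b STOP START c STOP"

# Plain greedy matching runs to the LAST STOP and swallows both blocks
greedy = re.findall(r"START([\s\S]*)STOP", text)

# Tempered: consume a character only if STOP does not begin at that position,
# so even a greedy quantifier cannot cross the first STOP
tempered = re.findall(r"START((?:(?!STOP)[\s\S])*)STOP", text)

print(greedy)    # [' a b STOP START c ']
print(tempered)  # [' a b ', ' c ']
```

The tempered form costs one lookahead per character, which is usually fine for short blocks like the ones in this output.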
Parse the incoming/outgoing parts
As you only need the interface types/names, a simple pattern is enough:
TenGig\S+ # TenGig, followed by anything NOT a whitespace
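As a quick check, this small sketch applies that pattern to one captured block with the standard re module. The block text is copied from the sample data in this answer; the variable names are only for illustration.

```python
import re

# One captured interface block, as the main regex would hand it over
block = """TenGigE0/0/0/0 Flags: A, Up: 4w1d
TenGigE0/1/0/0 Flags: A, Up: 4w1d
"""

rxiface = re.compile(r"TenGig\S+")

# findall returns every non-overlapping match as a list of strings
names = rxiface.findall(block)
print(names)  # ['TenGigE0/0/0/0', 'TenGigE0/1/0/0']
```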
Hints
You do not really need to define subpatterns, but then you'll need to repeat yourself a lot (because of the negative lookaheads). So if you need to stick with the original re module, you can use the same approach with the subpatterns written out inline.
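A rough sketch of that fallback with the standard re module might look like the following. To limit the repetition, the two header strings are kept in ordinary Python variables and interpolated into the pattern; the names INCOMING, OUTGOING, rx_plain and the group name inc are my own choices for this sketch, not part of the pattern above.

```python
import re

# The two header strings that the DEFINE block would otherwise hold
INCOMING = r"Incoming\ Interface\ List"
OUTGOING = r"Outgoing\ Interface\ List"

rx_plain = re.compile(rf"""
    ^\((?P<ip>[^()]+)\)                # address pair at the start of a line
    (?:(?!{INCOMING})[\s\S]+?)         # skip ahead to the incoming header
    {INCOMING}[\r\n]
    (?P<inc>(?!{OUTGOING})[\s\S]+?)    # incoming block
    {OUTGOING}[\r\n]
    (?P<out>(?!^$)[\s\S]+?)            # outgoing block
    (?:^$|\Z)                          # empty line or end of input
""", re.MULTILINE | re.VERBOSE)

sample = """(192.168.1.1,232.0.6.8) RPF nbr: 55.44.23.1 Flags: RPF
Up: 4w1d
Incoming Interface List
TenGigE0/0/0/1 Flags: A, Up: 4w1d
Outgoing Interface List
TenGigE0/0/0/10 Flags: A, Up: 4w1d
"""

m = rx_plain.search(sample)
parsed = (m.group("ip"),
          re.findall(r"TenGig\S+", m.group("inc")),
          re.findall(r"TenGig\S+", m.group("out")))
print(parsed)
# ('192.168.1.1,232.0.6.8', ['TenGigE0/0/0/1'], ['TenGigE0/0/0/10'])
```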
Glued together
All glued together in code, this will be:
import regex as re
string = """
(192.168.1.1,232.0.6.8) RPF nbr: 55.44.23.1 Flags: RPF
Up: 4w1d
Incoming Interface List
TenGigE0/0/0/1 Flags: A, Up: 4w1d
Outgoing Interface List
TenGigE0/0/0/10 Flags: A, Up: 4w1d

(192.168.55.3,232.0.10.69) RPF nbr: 66.76.44.130 Flags: RPF
Up: 4w1d
Incoming Interface List
TenGigE0/0/0/0 Flags: A, Up: 4w1d
TenGigE0/1/0/0 Flags: A, Up: 4w1d
TenGigE0/2/0/0 Flags: A, Up: 4w1d
Outgoing Interface List
TenGigE0/0/0/10 Flags: A, Up: 4w1d
TenGigE0/3/0/0 Flags: A, Up: 4w1d
TenGigE0/4/0/0 Flags: A, Up: 4w1d
"""
rx = re.compile(r"""
(?(DEFINE)
(?<ips>[^()]+)
(?<incoming>Incoming\ Interface \ List)
(?<outgoing>Outgoing\ Interface \ List)
(?<end>^$|\Z)
)
^\((?P<ip>(?&ips))\)
(?:(?!(?&incoming))[\s\S]+?)
(?&incoming)[\r\n]
(?P<in>(?!(?&outgoing))[\s\S]+?)
(?&outgoing)[\r\n]
(?P<out>(?!^$)[\s\S]+?)
(?&end)
""", re.MULTILINE|re.VERBOSE)
rxiface = re.compile(r'TenGig\S+')
result = dict()
for match in rx.finditer(string):
    # one match per entry: the address pair is the key
    key = match.group('ip')
    incoming = rxiface.findall(match.group('in'))
    outgoing = rxiface.findall(match.group('out'))
    result[key] = {'incoming': incoming, 'outgoing': outgoing}
print(result)
# {'192.168.1.1,232.0.6.8': {'incoming': ['TenGigE0/0/0/1'], 'outgoing': ['TenGigE0/0/0/10']}, '192.168.55.3,232.0.10.69': {'incoming': ['TenGigE0/0/0/0', 'TenGigE0/1/0/0', 'TenGigE0/2/0/0'], 'outgoing': ['TenGigE0/0/0/10', 'TenGigE0/3/0/0', 'TenGigE0/4/0/0']}}