I want to parse specific tcpdump patterns, using optional groups to account for the parts that may be absent (regex101 demo):
10:14:48.983541 IP 10.242.136.232.34266 > 10.81.163.129.9200: Flags [S], seq 2294574211, win 29200, options [mss 1460,sackOK,TS val 22536912 ecr 0,nop,wscale 7], length 0
10:14:48.983541 IP 10.242.136.232 > 10.81.163.129.9200: fictional stuff
10:14:48.983541 IP 10.242.136.232 > 10.81.163.129: also fictional stuff
The general structure of each line is "something, IP address, optional port, the > sign, IP address, optional port, colon, something", separated by whitespace. My pattern for that is
.+(?P<src_ip>\d*\.\d*\.\d*\.\d*)(?:\.(?P<src_port>\d*))?.>.(?P<dst_ip>\d*\.\d*\.\d*\.\d*)(?:\.(?P<dst_port>\d*))?:\.*
In the demo above, the match seems to be found from the right (mostly correctly), but then something goes wrong on the way to the left, and the first octet of the source IP (the first \d* in the pattern) is never captured. Why?
Note: the last two "tcpdump outputs" are technically incorrect; I only wanted to show some variations around the optional elements.
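To reproduce this outside regex101, here is a minimal sketch. It assumes Python's re module (the (?P<name>...) group syntax suggests that flavor); the names PATTERN, SAMPLES and parse are only for illustration, and the pattern itself is copied unchanged from above.

```python
import re

# Pattern copied verbatim from the question (split over two lines for readability).
PATTERN = re.compile(
    r".+(?P<src_ip>\d*\.\d*\.\d*\.\d*)(?:\.(?P<src_port>\d*))?.>."
    r"(?P<dst_ip>\d*\.\d*\.\d*\.\d*)(?:\.(?P<dst_port>\d*))?:\.*"
)

# The three sample lines from the question.
SAMPLES = [
    "10:14:48.983541 IP 10.242.136.232.34266 > 10.81.163.129.9200: "
    "Flags [S], seq 2294574211, win 29200, "
    "options [mss 1460,sackOK,TS val 22536912 ecr 0,nop,wscale 7], length 0",
    "10:14:48.983541 IP 10.242.136.232 > 10.81.163.129.9200: fictional stuff",
    "10:14:48.983541 IP 10.242.136.232 > 10.81.163.129: also fictional stuff",
]


def parse(line):
    """Return the named groups as a dict, or None if the line does not match."""
    match = PATTERN.match(line)
    return match.groupdict() if match else None


for line in SAMPLES:
    print(parse(line))
```

For the second sample line this reports src_ip as ".242.136.232", so the leading 10 ends up outside the group, which is the behavior I am asking about.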