I am very new to both Python and Stack Overflow.

I want to extract the destination port:

2629  >  0 [SYN] Seq=0 Win=512 Len=100
0  >  2629 [RST, ACK] Seq=1 Ack=1 Win=0 Len=0
0  >  2633 [RST, ACK] Seq=1 Ack=1 Win=0 Len=0

I want to retrieve the destination port from every line ('0', '2629', '2633') using a Python regex and ignore the rest. The destination port is the number that appears after '>' and before '['.

So far I have tried:

re.findall(r"\d\d\d\d\d|\d\d\d\d|\d\d\d|\d\d|\d", str)

but this is too generic: it matches every number in the line. What is the best regex for this scenario?

furas
  • You're trying to parse the output of some program. Why not do the packet capture in Python directly? E.g. https://stackoverflow.com/questions/4948043/how-to-parse-packets-in-a-python-library – Jonathon Reinhart Nov 23 '19 at 01:26
  • If you have the string, split it on whitespace and take the third element: `line.split()[2]` – furas Nov 23 '19 at 01:33

2 Answers

You could use the string `split` method for this specific case. A quick implementation would be:

dest_ports = []
lines = [
    "2629  >  0 [SYN] Seq=0 Win=512 Len=100",
    "0  >  2629 [RST, ACK] Seq=1 Ack=1 Win=0 Len=0",
    "0  >  2633 [RST, ACK] Seq=1 Ack=1 Win=0 Len=0",
]

for line in lines:
    # keep what follows '>  ', then cut it off at the first ' ['
    dest_ports.append(line.split('>  ')[1].split(' [')[0])

This yields:

dest_ports = ['0', '2629', '2633']
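
Note that the chained `split` raises an `IndexError` on any line that does not contain `'>  '`. If the input might include such lines, a more forgiving variant could look like the sketch below. The helper name `dest_port` is made up for illustration, and it assumes the destination port is the first whitespace-separated token after `>`.

```python
def dest_port(line):
    """Return the destination port as a string, or None if it cannot be found."""
    # partition() never raises: if '>' is absent, the separator part is empty
    _, sep, rest = line.partition('>')
    if not sep:
        return None
    # split() with no argument splits on any run of whitespace
    fields = rest.split()
    if not fields or not fields[0].isdigit():
        return None
    return fields[0]


lines = [
    "2629  >  0 [SYN] Seq=0 Win=512 Len=100",
    "0  >  2629 [RST, ACK] Seq=1 Ack=1 Win=0 Len=0",
    "0  >  2633 [RST, ACK] Seq=1 Ack=1 Win=0 Len=0",
]

dest_ports = [dest_port(line) for line in lines]
print(dest_ports)  # ['0', '2629', '2633']
```

Because it splits on any whitespace, this also tolerates a different number of spaces around `>`.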

Julien Roullé

You could use a regex like this:

import io
import re

dff = io.StringIO("""2629  >  0 [SYN] Seq=0 Win=512 Len=100
0  >  2629 [RST, ACK] Seq=1 Ack=1 Win=0 Len=0
0  >  2622  [RST, ACK] Seq=1 Ack=1 Win=0 Len=0
0  >  2633 [RST, ACK] Seq=1 Ack=1 Win=0 Len=0""")

for line in dff:
    # group 1 is the source port plus '>', group 2 is the destination port
    print(re.search(r'(^\d+\s+>\s+)(\d+)', line).groups()[1])

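
One caveat: `re.search` returns `None` for a line that does not match, so calling `.groups()` on it raises an `AttributeError`. If the capture may contain other lines, one option is to compile the pattern once and run `findall` over the whole text, which simply skips non-matching lines. This is only a sketch: the name `PORT_RE` and the sample text (including the unrelated line) are made up for illustration.

```python
import re

# Source port, '>', then the destination port (captured) just before '['.
# [ \t]* is used instead of \s* so a match can never span two lines.
PORT_RE = re.compile(r'^[ \t]*\d+[ \t]*>[ \t]*(\d+)[ \t]*\[', re.MULTILINE)

text = """2629  >  0 [SYN] Seq=0 Win=512 Len=100
0  >  2629 [RST, ACK] Seq=1 Ack=1 Win=0 Len=0
some unrelated line
0  >  2633 [RST, ACK] Seq=1 Ack=1 Win=0 Len=0"""

# With a single capture group, findall returns a list of the captured strings
dest_ports = PORT_RE.findall(text)
print(dest_ports)  # ['0', '2629', '2633']
```
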
oppressionslayer