I have the text file below and want to extract the last part of each line when it ends in a digit (here, the IP addresses):

4:16:09PM - xx yy DOS activity from 10.0.0.45
9:43:44PM - xx yy 1A disconnected from server
2:40:28AM - xx yy 1A connected
1:21:52AM - xx yy DOS activity from 192.168.123.4

My Code

import re

with open(r'C:\Users\Desktop\test.log') as f:
    for line in f:
        # \d matches one digit at a time, so each digit becomes its own list item
        dos = re.findall(r'\d', line.split()[-1])
        print(list(dos))

My output

['1', '0', '0', '0', '4', '5']
[]
[]
['1', '9', '2', '1', '6', '8', '1', '2', '3', '4']

Expected output

['10.0.0.45','192.168.123.4']

2 Answers

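An expression such as

(?m)(?:\d+\.){3}\d+$

should extract the desired IPs: it matches three groups of digits each followed by a dot, then a final group of digits, anchored at the end of a line. The (?m) flag makes $ match at the end of every line rather than only at the end of the string.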

Test

import re

string = '''
4:16:09PM - xx yy DOS activity from 10.0.0.45
9:43:44PM - xx yy 1A disconnected from server
2:40:28AM - xx yy 1A connected
1:21:52AM - xx yy DOS activity from 192.168.123.4
'''

expression = r'(?m)(?:\d+\.){3}\d+$'


print(re.findall(expression, string))

Output

['10.0.0.45', '192.168.123.4']
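
As a small sketch of what the (?m) flag contributes, the same pattern can be compiled with re.MULTILINE instead of the inline flag and compared against a run without it. The names LOG_TEXT and IP_AT_LINE_END are made up for this illustration; the sample text is the one from the question.

```python
import re

# Sample log text copied from the question.
LOG_TEXT = """\
4:16:09PM - xx yy DOS activity from 10.0.0.45
9:43:44PM - xx yy 1A disconnected from server
2:40:28AM - xx yy 1A connected
1:21:52AM - xx yy DOS activity from 192.168.123.4
"""

# Same pattern as above, with the inline (?m) replaced by re.MULTILINE.
IP_AT_LINE_END = re.compile(r'(?:\d+\.){3}\d+$', re.MULTILINE)

# With MULTILINE, $ matches at the end of every line.
with_flag = IP_AT_LINE_END.findall(LOG_TEXT)

# Without it, $ matches only at the end of the string (or just before a
# trailing newline), so only the last line can produce a match.
without_flag = re.findall(r'(?:\d+\.){3}\d+$', LOG_TEXT)

print(with_flag)     # ['10.0.0.45', '192.168.123.4']
print(without_flag)  # ['192.168.123.4']
```

Note that the pattern only checks the shape of the text (digits and dots); it does not verify that each number is in the 0 to 255 range.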

If you wish to simplify, update, or explore the expression, regex101.com explains it in its top-right panel. Its debugger also lets you watch, step by step, how a regex engine consumes sample input and performs the match.


RegEx Circuit

jex.im visualizes regular expressions.

Emma
You could also take this approach, which additionally checks whether the last character of the line is a digit:

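import re

with open('test.log') as f:
    for line in f:
        stripped = line.strip()
        # skip blank lines; only process lines that end in a digit
        if stripped and stripped[-1].isdigit():
            # escape the dots so they match a literal '.', not any character
            dos = re.findall(r'[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+', stripped)
            print(dos)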

output:

['10.0.0.45']
['192.168.123.4']

To put them into one list, define an empty list before the loop and add each line's matches to it.
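
A minimal sketch of that single-list variant, assuming the same log format as in the question. The helper name collect_ips and the names IP_PATTERN and sample are made up for this example, and the dots in the pattern are escaped so that they match a literal '.'.

```python
import re

# Digits and literal dots in the shape of an IPv4 address (no range check).
IP_PATTERN = re.compile(r'[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+')

def collect_ips(lines):
    """Return all IP-like matches from lines whose last character is a digit."""
    found = []
    for line in lines:
        stripped = line.strip()
        # Skip blank lines and lines that do not end in a digit.
        if stripped and stripped[-1].isdigit():
            # extend() keeps the result flat instead of a list of lists.
            found.extend(IP_PATTERN.findall(stripped))
    return found

sample = [
    "4:16:09PM - xx yy DOS activity from 10.0.0.45\n",
    "9:43:44PM - xx yy 1A disconnected from server\n",
    "2:40:28AM - xx yy 1A connected\n",
    "1:21:52AM - xx yy DOS activity from 192.168.123.4\n",
]

print(collect_ips(sample))  # ['10.0.0.45', '192.168.123.4']
```

Because the function accepts any iterable of lines, an open file object can be passed directly, for example collect_ips(f) inside a with open('test.log') as f: block.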

Derek Eden