I have this log text file:
2018-11-06 16:52:01.782| on thread[140447603222272 c0]| IP[192.168.0.244:5000]| master| 192.168.0.244| omer| (stmt : 0) | admin| Connection id - 0
2018-11-06 16:52:01.782| on thread[140447603222272 c0]| IP[192.168.0.244:5000]| master| 192.168.0.244| omer| (stmt : 0) | admin| Start Time - 2018-11-06 16:52:01
2018-11-06 16:52:01.782| on thread[140447603222272 c0]| IP[192.168.0.244:5000]| master| 192.168.0.244| omer| (stmt : 0) | admin| Statement create or replace table amit (x date);
2018-11-06 16:52:01.817| on thread[140447603222272 c0s0]| IP[192.168.0.244:5000]| master| 192.168.0.244| omer| (stmt : 0)| admin| Connection id - 0 - Executing - create or replace table amit (x date);
2018-11-06 16:52:01.901| on thread[140447603222272 c0s0]| IP[192.168.0.244:5000]| master| 192.168.0.244| omer| (stmt : 0) | admin| Connection id - 0
2018-11-06 16:52:01.901| on thread[140447603222272 c0s0]| IP[192.168.0.244:5000]| master| 192.168.0.244| omer| (stmt : 0) | admin| End Time - 2018-11-06 16:52:01
2018-11-06 16:52:01.901| on thread[140447603222272 c0s0]| IP[192.168.0.244:5000]| master| 192.168.0.244| omer| (stmt : 0) | admin| SQL - create or replace table amit (x date);
2018-11-06 16:52:01.901| on thread[140447603222272 c0s0]| IP[192.168.0.244:5000]| master| 192.168.0.244| omer| (stmt : 0) | admin| Success
2018-11-06 16:52:14.917| on thread[140447603222272 c0s0]| IP[192.168.0.244:5001]| master| 192.168.0.244| admin| (stmt : 1) | admin| Connection id - 0
2018-11-06 16:52:14.917| on thread[140447603222272 c0s0]| IP[192.168.0.244:5001]| master| 192.168.0.244| admin| (stmt : 1) | admin| Start Time - 2018-11-06 16:52:14
2018-11-06 16:52:14.918| on thread[140447603222272 c0s0]| IP[192.168.0.244:5001]| master| 192.168.0.244| admin| (stmt : 1) | admin| Statement create or replace table amit (x int, y int);
2018-11-06 16:52:14.925| on thread[140447603222272 c0s1]| IP[192.168.0.244:5001]| master| 192.168.0.244| admin| (stmt : 1)| admin| Connection id - 0 - Executing - create or replace table amit (x int, y int);
2018-11-06 16:52:15.160| on thread[140447603222272 c0s1]| IP[192.168.0.244:5001]| master| 192.168.0.244| admin| (stmt : 1) | admin| Connection id - 0
2018-11-06 16:52:15.160| on thread[140447603222272 c0s1]| IP[192.168.0.244:5001]| master| 192.168.0.244| admin| (stmt : 1) | admin| End Time - 2018-11-06 16:52:15
2018-11-06 16:52:15.160| on thread[140447603222272 c0s1]| IP[192.168.0.244:5001]| master| 192.168.0.244| admin| (stmt : 1) | admin| SQL - create or replace table amit (x int, y int);
3:25.925| on thread[140447603222272 c10s14]| IP[192.168.0.244:5000]| master| 192.168.0.244| Guy| (stmt : 15) | admin| Connection id - 10
2018-11-06 16:53:25.925| on thread[140447603222272 c10s14]| IP[192.168.0.244:5000]| master| 192.168.0.244| Guy| (stmt : 15) | admin| Start Time - 2018-11-06 16:53:25
2018-11-06 16:53:25.925| on thread[140447603222272 c10s14]| IP[192.168.0.244:5000]| master| 192.168.0.244| Guy| (stmt : 15) | admin| Statement select 1;
2018-11-06 16:53:25.954| on thread[140447603222272 c10s15]| IP[192.168.0.244:5000]| master| 192.168.0.244| Guy| (stmt : 15)| admin| Connection id - 10 - Executing - select 1;
2018-11-06 16:53:26.25| on thread[140447603222272 c10s15]| IP[192.168.0.244:5000]| master| 192.168.0.244| Guy| (stmt : 15) | admin| Connection id - 10
2018-11-06 16:53:26.25| on thread[140447603222272 c10s15]| IP[192.168.0.244:5000]| master| 192.168.0.244| Guy| (stmt : 15) | admin| End Time - 2018-11-06 16:53:26
2018-11-06 16:53:26.25| on thread[140447603222272 c10s15]| IP[192.168.0.244:5000]| master| 192.168.0.244| Guy| (stmt : 15) | admin| SQL - select 1;
2018-11-06 16:53:26.25| on thread[140447603222272 c10s15]| IP[192.168.0.244:5000]| master| 192.168.0.244| Guy| (stmt : 15) | admin| ResultSet Num Of Rows - 1
2018-11-06 16:53:26.25| on thread[140447603222272 c10s15]| IP[192.168.0.244:5000]| master| 192.168.0.244| Guy| (stmt : 15) | admin| Processed Num Of Rows - 2
2018-11-06 16:53:26.25| on thread[140447603222272 c10s15]| IP[192.168.0.244:5000]| master| 192.168.0.244| Guy| (stmt : 15) | admin| Success
2018-11-06 16:52:38.761| on thread[140447603222272 c0s7]| IP[192.168.0.244:5000]| master| 192.168.0.244| admin| (stmt : 10) | admin| Failed
2018-11-06 16:52:54.103| on thread[140447603222272 c0s7]| IP[192.168.0.244:5000]| master| 192.168.0.244| Gilc| (stmt : 11) | admin| Connection id - 0
2018-11-06 16:52:54.103| on thread[140447603222272 c0s7]| IP[192.168.0.244:5000]| master| 192.168.0.244| Gilc| (stmt : 11) | admin| Start Time - 2018-11-06 16:52:54
2018-11-06 16:52:54.103| on thread[140447603222272 c0s7]| IP[192.168.0.244:5000]| master| 192.168.0.244| Gilc| (stmt : 11) | admin| Statement delete from amit where y = 111111;
2018-11-06 16:52:54.178| on thread[140447603222272 c0s11]| IP[192.168.0.244:5000]| master| 192.168.0.244| Gilc| (stmt : 11)| admin| Connection id - 0 - Executing - delete from amit where y = 111111;
2018-11-06 16:52:54.217| on thread[140447603222272 c0s11]| IP[192.168.0.244:5000]| master| 192.168.0.244| Gilc| (stmt : 11) | admin| Connection id - 0
This text file is divided to several parts the delimeter is '|' the different parts are as follow:
- Datetime of statement. (example: 2018-11-06 16:52:01.782)
- Thread number. (example: 140447603222272)
- IP of the user running the statement + port.(example: 192.168.0.244:5000)
- The database the user ran the statement to.(example: master)
- IP of the user running the statement.(example: 192.168.0.244)
- User name of the user running the statement.(example: omer)
- Statement id of statement the user executed. (example: 15)
- Service name.(sqream)
- Info column includes - success of statement, the statement itself, connection id, start/end time of the statement, number of rows returned. (changes dynamically)
I want to analyze the text file in a way that I can filter out this kind of messages: How many successful statements were sent by the user? (each statement which passed successfully by 'Success'
How many failed/successful statements sent each user to the server? 'Success' / 'Failed'
How many statements in total were sent by the user according to log
I have implmented the following code:
def parse_log_file(log_file):
print(len(""))
my_path = os.path.abspath(os.path.dirname(__file__))
path = os.path.join(my_path, log_file)
with open(path, 'r') as f:
lines = f.readlines()[1:]
for line in lines:
elements = line.strip().split('|')
print(elements, len(elements))
And I am trying to continue analyze the file but in efficient way I am new to python and I try to understand as I mentioned above. I was thinking about using maybe regex or using tuples in order to keep these data in a key/value design pattern.