0

I have this log text file:

2018-11-06 16:52:01.782| on thread[140447603222272 c0]| IP[192.168.0.244:5000]| master| 192.168.0.244| omer| (stmt : 0) | admin|  Connection id - 0
2018-11-06 16:52:01.782| on thread[140447603222272 c0]| IP[192.168.0.244:5000]| master| 192.168.0.244| omer| (stmt : 0) | admin|  Start Time - 2018-11-06 16:52:01
2018-11-06 16:52:01.782| on thread[140447603222272 c0]| IP[192.168.0.244:5000]| master| 192.168.0.244| omer| (stmt : 0) | admin|  Statement create or replace table amit (x date);
2018-11-06 16:52:01.817| on thread[140447603222272 c0s0]| IP[192.168.0.244:5000]| master| 192.168.0.244| omer| (stmt : 0)| admin|  Connection id - 0 - Executing - create or replace table amit (x date);
2018-11-06 16:52:01.901| on thread[140447603222272 c0s0]| IP[192.168.0.244:5000]| master| 192.168.0.244| omer| (stmt : 0) | admin|  Connection id - 0
2018-11-06 16:52:01.901| on thread[140447603222272 c0s0]| IP[192.168.0.244:5000]| master| 192.168.0.244| omer| (stmt : 0) | admin|  End Time - 2018-11-06 16:52:01
2018-11-06 16:52:01.901| on thread[140447603222272 c0s0]| IP[192.168.0.244:5000]| master| 192.168.0.244| omer| (stmt : 0) | admin|  SQL - create or replace table amit (x date);
2018-11-06 16:52:01.901| on thread[140447603222272 c0s0]| IP[192.168.0.244:5000]| master| 192.168.0.244| omer| (stmt : 0) | admin|  Success
2018-11-06 16:52:14.917| on thread[140447603222272 c0s0]| IP[192.168.0.244:5001]| master| 192.168.0.244| admin| (stmt : 1) | admin|  Connection id - 0
2018-11-06 16:52:14.917| on thread[140447603222272 c0s0]| IP[192.168.0.244:5001]| master| 192.168.0.244| admin| (stmt : 1) | admin|  Start Time - 2018-11-06 16:52:14
2018-11-06 16:52:14.918| on thread[140447603222272 c0s0]| IP[192.168.0.244:5001]| master| 192.168.0.244| admin| (stmt : 1) | admin|  Statement create or replace table amit (x int, y int);
2018-11-06 16:52:14.925| on thread[140447603222272 c0s1]| IP[192.168.0.244:5001]| master| 192.168.0.244| admin| (stmt : 1)| admin|  Connection id - 0 - Executing - create or replace table amit (x int, y int);
2018-11-06 16:52:15.160| on thread[140447603222272 c0s1]| IP[192.168.0.244:5001]| master| 192.168.0.244| admin| (stmt : 1) | admin|  Connection id - 0
2018-11-06 16:52:15.160| on thread[140447603222272 c0s1]| IP[192.168.0.244:5001]| master| 192.168.0.244| admin| (stmt : 1) | admin|  End Time - 2018-11-06 16:52:15
2018-11-06 16:52:15.160| on thread[140447603222272 c0s1]| IP[192.168.0.244:5001]| master| 192.168.0.244| admin| (stmt : 1) | admin|  SQL - create or replace table amit (x int, y int);
3:25.925| on thread[140447603222272 c10s14]| IP[192.168.0.244:5000]| master| 192.168.0.244| Guy| (stmt : 15) | admin|  Connection id - 10
2018-11-06 16:53:25.925| on thread[140447603222272 c10s14]| IP[192.168.0.244:5000]| master| 192.168.0.244| Guy| (stmt : 15) | admin|  Start Time - 2018-11-06 16:53:25
2018-11-06 16:53:25.925| on thread[140447603222272 c10s14]| IP[192.168.0.244:5000]| master| 192.168.0.244| Guy| (stmt : 15) | admin|  Statement select 1;
2018-11-06 16:53:25.954| on thread[140447603222272 c10s15]| IP[192.168.0.244:5000]| master| 192.168.0.244| Guy| (stmt : 15)| admin|  Connection id - 10 - Executing - select 1;
2018-11-06 16:53:26.25| on thread[140447603222272 c10s15]| IP[192.168.0.244:5000]| master| 192.168.0.244| Guy| (stmt : 15) | admin|  Connection id - 10
2018-11-06 16:53:26.25| on thread[140447603222272 c10s15]| IP[192.168.0.244:5000]| master| 192.168.0.244| Guy| (stmt : 15) | admin|  End Time - 2018-11-06 16:53:26
2018-11-06 16:53:26.25| on thread[140447603222272 c10s15]| IP[192.168.0.244:5000]| master| 192.168.0.244| Guy| (stmt : 15) | admin|  SQL - select 1;
2018-11-06 16:53:26.25| on thread[140447603222272 c10s15]| IP[192.168.0.244:5000]| master| 192.168.0.244| Guy| (stmt : 15) | admin|  ResultSet Num Of Rows - 1
2018-11-06 16:53:26.25| on thread[140447603222272 c10s15]| IP[192.168.0.244:5000]| master| 192.168.0.244| Guy| (stmt : 15) | admin|  Processed Num Of Rows - 2
2018-11-06 16:53:26.25| on thread[140447603222272 c10s15]| IP[192.168.0.244:5000]| master| 192.168.0.244| Guy| (stmt : 15) | admin|  Success
2018-11-06 16:52:38.761| on thread[140447603222272 c0s7]| IP[192.168.0.244:5000]| master| 192.168.0.244| admin| (stmt : 10) | admin|  Failed
2018-11-06 16:52:54.103| on thread[140447603222272 c0s7]| IP[192.168.0.244:5000]| master| 192.168.0.244| Gilc| (stmt : 11) | admin|  Connection id - 0
2018-11-06 16:52:54.103| on thread[140447603222272 c0s7]| IP[192.168.0.244:5000]| master| 192.168.0.244| Gilc| (stmt : 11) | admin|  Start Time - 2018-11-06 16:52:54
2018-11-06 16:52:54.103| on thread[140447603222272 c0s7]| IP[192.168.0.244:5000]| master| 192.168.0.244| Gilc| (stmt : 11) | admin|  Statement delete from amit where y = 111111;
2018-11-06 16:52:54.178| on thread[140447603222272 c0s11]| IP[192.168.0.244:5000]| master| 192.168.0.244| Gilc| (stmt : 11)| admin|  Connection id - 0 - Executing - delete from amit where y = 111111;
2018-11-06 16:52:54.217| on thread[140447603222272 c0s11]| IP[192.168.0.244:5000]| master| 192.168.0.244| Gilc| (stmt : 11) | admin|  Connection id - 0

This text file is divided to several parts the delimeter is '|' the different parts are as follow:

  1. Datetime of statement. (example: 2018-11-06 16:52:01.782)
  2. Thread number. (example: 140447603222272)
  3. IP of the user running the statement + port.(example: 192.168.0.244:5000)
  4. The database the user ran the statement to.(example: master)
  5. IP of the user running the statement.(example: 192.168.0.244)
  6. User name of the user running the statement.(example: omer)
  7. Statement id of statement the user executed. (example: 15)
  8. Service name.(sqream)
  9. Info column includes - success of statement, the statement itself, connection id, start/end time of the statement, number of rows returned. (changes dynamically)

I want to analyze the text file in a way that I can filter out this kind of messages: How many successful statements were sent by the user? (each statement which passed successfully by 'Success'

How many failed/successful statements sent each user to the server? 'Success' / 'Failed'

How many statements in total were sent by the user according to log

I have implmented the following code:

def parse_log_file(log_file):
    print(len(""))
    my_path = os.path.abspath(os.path.dirname(__file__))
    path = os.path.join(my_path, log_file)
    with open(path, 'r') as f:
        lines = f.readlines()[1:]
        for line in lines:
            elements = line.strip().split('|')
            print(elements, len(elements))

And I am trying to continue analyze the file but in efficient way I am new to python and I try to understand as I mentioned above. I was thinking about using maybe regex or using tuples in order to keep these data in a key/value design pattern.

Sowjanya R Bhat
  • 1,128
  • 10
  • 19
tupac shakur
  • 658
  • 1
  • 12
  • 29

0 Answers0