
I've just started playing with Python and I'm running some tests in my environment... the idea is to create a simple script that finds the recurrence of errors over a given period of time.

Basically, I want to count the number of times a server fails in my daily logs. If the failure happens more than a given number of times (let's say 10) over a given period of time (let's say 30 days), I should be able to raise an alert in a log. But I'm not trying to just count the repetition of errors over a 30-day interval... What I actually want is to count the number of times the error happened, recovered, and then happened again; this way I avoid reporting more than once when the problem persists for several days.

For instance, let's say :

file_2016_Oct_01.txt@hostname@YES
file_2016_Oct_02.txt@hostname@YES
file_2016_Oct_03.txt@hostname@NO
file_2016_Oct_04.txt@hostname@NO
file_2016_Oct_05.txt@hostname@YES
file_2016_Oct_06.txt@hostname@NO
file_2016_Oct_07.txt@hostname@NO

Given the scenario above, I want the script to interpret it as 2 failures instead of 4, because sometimes a server may keep the same status for days before recovering, and I want to identify the recurrence of the problem rather than just count the total number of failures.
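To make the desired counting rule concrete, here is a minimal sketch against the sample lines above; the assumption that the server was healthy before the window (so a leading NO counts as a new incident) is mine:

```python
# Count incidents: a NO only starts a new incident if the previous
# status was YES (i.e. only YES -> NO transitions are counted).
lines = [
    "file_2016_Oct_01.txt@hostname@YES",
    "file_2016_Oct_02.txt@hostname@YES",
    "file_2016_Oct_03.txt@hostname@NO",
    "file_2016_Oct_04.txt@hostname@NO",
    "file_2016_Oct_05.txt@hostname@YES",
    "file_2016_Oct_06.txt@hostname@NO",
    "file_2016_Oct_07.txt@hostname@NO",
]

failures = 0
prev_status = "YES"  # assumption: healthy before the window starts
for line in lines:
    status = line.strip().split("@")[-1]
    if status == "NO" and prev_status == "YES":
        failures += 1  # a fresh YES -> NO transition: new incident
    prev_status = status

print(failures)  # -> 2
```

This counts the two NO runs (Oct 03-04 and Oct 06-07) as two failures, not four.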

For the record, this is how I'm going through the files:

import datetime
import glob
import os

# Creates an empty list
history_list = []

# Function to find the files from the last 30 days
def f_findfiles():
    # First define the cut-off day, i.e. how many days back
    # the script will consider for the analysis
    cut_off_day = datetime.datetime.now() - datetime.timedelta(days=30)

    # We'll now loop through all history files from the last 30 days
    for filename in glob.iglob("/opt/hc/*.txt"):
        filetime = datetime.datetime.fromtimestamp(os.path.getmtime(filename))
        if filetime > cut_off_day:
            history_list.append(filename)

# Just included the function below to show how I'm going
# through the files; this is where I got stuck...
def f_openfiles(arg):
    for filename in arg:
        with open(filename, "r") as handle:
            for line in handle:
                clean_line = line.strip().split("@")

# Main function
def main():
    f_findfiles()
    f_openfiles(history_list)

I'm opening the files using 'with' and reading all the lines from all the files in a 'for' loop, but I'm not sure how to navigate through the data to compare the value from one file with the values from older files.

I've tried putting all the data in a dictionary, in a list, or just enumerating and comparing, but I've failed with all these methods :-(

Any tips on what would be the best approach here? Thank you!

silveiralexf
  • I'm confused a bit here.... do the lines in the log look like `file_2016_Oct_01.txt@hostname@YES` or are you saying there are files called `file_2016_Oct_01.txt` that have something inside them? Part of the solution is to make sure the records are read from oldest to newest so that state can be tracked. – tdelaney Oct 13 '16 at 01:16
  • There are multiple files; each file has a line for each server (around 400 servers) with the status of the day. – silveiralexf Oct 13 '16 at 01:39
  • Okay... so are the files named like `file_2016_Oct_01.txt`? Are the lines in the file like `hostname@YES\n`? I want to know if there is a good way to read the files by date. I can't figure out what you mean by `file_2016_Oct_01.txt@hostname@YES` and actually breaking that out or telling us that the full thing is a filename would be helpful. – tdelaney Oct 13 '16 at 01:52
  • Yes, all files have the date as part of their name, such as: "hc..<date>..txt". E.g.: hc.10.12.16.txt – silveiralexf Oct 13 '16 at 01:58

1 Answer


I'd rather handle this with shell utilities (e.g. uniq), but since you prefer Python:

With minimal effort, you can handle it by creating an appropriate dict object with strings (like 'file_2016_Oct_01.txt@hostname@YES') as the keys. Iterating over the log, you'd check whether the corresponding key exists in the dictionary (with `if 'file_2016_Oct_01.txt@hostname@YES' in my_log_dict`), then assign or increment the dict value accordingly.

A short sample:

data_log = {}

lookup_string = 'foobar'
if lookup_string in data_log:
    data_log[lookup_string] += 1
else:
    data_log[lookup_string] = 1

Alternatively, as a one-liner (it tends to look ugly in Python; I've added line breaks for readability):

data_log[lookup_string] = data_log[lookup_string] + 1 \
    if lookup_string in data_log \
    else 1
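For the recurrence counting itself (rather than plain occurrence counting), one option is tracking the last status seen per host across the day files; a sketch under my assumptions (the `count_recurrences` name and the `host@STATUS` line layout are illustrative, adjust the parsing to the real format):

```python
def count_recurrences(paths):
    """Count distinct YES -> NO incidents per host across day files."""
    last_status = {}  # host -> last status seen ("YES" / "NO")
    failures = {}     # host -> number of distinct incidents
    for path in sorted(paths):  # sort so files are read oldest -> newest
        with open(path) as handle:
            for line in handle:
                parts = line.strip().split("@")
                if len(parts) < 2:
                    continue  # skip blank or malformed lines
                host, status = parts[-2], parts[-1]
                # A NO counts as a new incident only if the host was
                # previously healthy (default: healthy before the window)
                if status == "NO" and last_status.get(host, "YES") == "YES":
                    failures[host] = failures.get(host, 0) + 1
                last_status[host] = status
    return failures
```

An alert could then be raised for any host whose count exceeds the threshold, e.g. `[h for h, n in count_recurrences(history_list).items() if n > 10]`. Sorting by filename only works if the names sort chronologically; otherwise sort by the parsed date.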
agg3l
  • I've tried your suggestion, but the problem is exactly how to navigate the dictionary in order to compare the values from one file with the values from another... `if recurrence["status"] == "NO": recurrence.update({"last_status": counter + 1})` `print(recurrence["file"], recurrence["host"], recurrence["status"], recurrence["last_status"])` I've tried searching for the status and incrementing it, but I ended up creating a repeated entry in the dictionary, unable to compare with the last one... – silveiralexf Oct 13 '16 at 00:42
  • here is some dummy code as a sample: `dict[some_key] = dict[some_key] + 1 if some_key in dict else 1` – agg3l Oct 13 '16 at 00:44
  • Seems I got your idea/issue wrong. Please provide some details in that case. Your sample doesn't clarify it much, unfortunately (why are there 2 errors rather than 4?) – agg3l Oct 13 '16 at 00:52
  • Couldn't it just be a matter of locating status toggles? (i.e. YES->NO / NO->YES transitions) – agg3l Oct 13 '16 at 00:56
  • The thing is, I'm trying to count the number of times the error appears in the daily log marked with the "NO" string, and in case the error persists for several days in a row I still want to consider it one occurrence. – silveiralexf Oct 13 '16 at 00:58
  • In case the error recovers (that means, in the next log file it's marked with the "YES" string), and then fails again in any of the next files with the "NO" string, then I would consider it another failure... The idea is to only count these recurrences, instead of just counting occurrences of the string... – silveiralexf Oct 13 '16 at 01:00
  • Yes, I've realized my initial answer was not really related to your question. I'm not sure discussing it in comments is the best idea on SO, but have you considered maintaining something like two tuples/objects/dicts/whatever, for good and bad results, e.g. (date, count), so you can compare how date/status differ for each log line you parse – agg3l Oct 13 '16 at 01:08