I have a directory of log files, all ending with .log. Is it possible to read all the files, combine them into one big file, and start a new line only where a line begins with a date? The log files look something like this:

2019-04-15 21:58:07 bla bla bla
2019-04-15 21:58:08 bla bla bla bla
2019-04-15 21:58:09 bla bla bla
test1
test2
test3
2019-04-15 21:59:02 bla bla
2019-04-15 21:59:05 bla bla bla
test
now
go

Now I would like to split this file into lines wherever a date is found, so that it looks like this:

2019-04-15 21:58:07 bla bla bla
2019-04-15 21:58:08 bla bla bla bla
2019-04-15 21:58:09 bla bla bla test1 test2 test3
2019-04-15 21:59:02 bla bla
2019-04-15 21:59:05 bla bla bla test now go

Can somebody help me with this?

Kind regards

Trey K
  • What have you tried so far? If so what errors are you running into? – R. Arctor Sep 13 '19 at 16:01
  • I have tried to do something with regex, like searching for a pattern: pattern = re.compile("([12]\d{3}-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01]))") # yyyy-mm-dd, but I can't seem to write code that produces the above result – Trey K Sep 13 '19 at 17:50
  • import os, re log = open("log.log", "r") text = log.read() lon = re.compile("([12]\d{3}-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01]))", re.MULTILINE) lon = lon.search(text).group(1) print(lon) – Trey K Sep 13 '19 at 17:57
  • @R. Arctor, can you help me please? – Trey K Sep 13 '19 at 18:04
  • So for clarity's sake, you want to 1. concatenate all logs in some directory into a single file. 2. Move lines without the leading date string to the last line with a leading date string. 3. Write modified and concatenated logs to a "master" log file. Is that right? – R. Arctor Sep 13 '19 at 18:21
  • Yeah! You're right! That's the wish – Trey K Sep 13 '19 at 18:23

1 Answer

It's not pretty and it could probably be more efficient, but this works:

import os, re

# change this to be wherever you keep all those log files
work_dir = '/home/ubuntu/workspace/bin/tmp'

# load the full path for all files in the work_dir (I'm not checking if file is a .log file)
logs = [os.path.join(work_dir, file) for file in os.listdir(work_dir) if os.path.isfile(os.path.join(work_dir, file))]


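def process_list(in_list):
    # timestamp such as "2019-04-15 21:58:07"; re.match anchors it to the start of the line
    date_patt = r'\d{4}-\d{2}-\d{2}\s+\d{2}:\d{2}:\d{2}'
    last_good_idx = None
    for idx in range(len(in_list)):
        if re.match(date_patt, in_list[idx]):
            last_good_idx = idx
        elif last_good_idx is not None and in_list[idx].strip():
            # append non-blank continuation lines to the last dated line
            in_list[last_good_idx] += f' {in_list[idx].strip()}'

    return in_list

def clean_list(in_list):
    date_patt = r'\d{4}-\d{2}-\d{2}\s+\d{2}:\d{2}:\d{2}'
    # iterate over a copy so elements can be removed from the original list
    for elem in in_list[:]:
        if not re.match(date_patt, elem):
            in_list.remove(elem)
    return in_list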

# write master log to working directory file called master.log
with open(os.path.join(work_dir, 'master.log'), 'w') as out:
    for file in logs:
        with open(file, 'r') as f:
            file_text = f.read()
            text_list = file_text.split('\n')
            text_list = process_list(text_list)
            text_list = clean_list(text_list)

            for line in text_list:
                out.write(line + '\n')

If you only want to use files ending in .log, add that check to the list comprehension that assigns the logs variable.
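
A minimal sketch of that filter, written as a small helper instead of a one-line comprehension. The name list_log_files and the skip_name parameter are hypothetical; skipping master.log is an added assumption so that a second run does not read the output file back in as input.

```python
import os


def list_log_files(work_dir, skip_name='master.log'):
    """Return full paths of the regular *.log files directly inside work_dir."""
    paths = []
    for name in sorted(os.listdir(work_dir)):
        full_path = os.path.join(work_dir, name)
        # keep only regular files ending in .log, except the output file itself
        if name.endswith('.log') and name != skip_name and os.path.isfile(full_path):
            paths.append(full_path)
    return paths
```

With this in place, the assignment would become logs = list_log_files(work_dir).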

process_list handles moving lines which don't match the date_patt regex to the end of the string found at the last index where the date_patt was matched.

clean_list removes any element from the input list that doesn't match the date_patt.
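
The two passes described above could also be folded into a single loop. The following is only a sketch, not a drop-in replacement: merge_continuations and DATE_RE are hypothetical names, and it assumes that blank lines can be ignored, that surrounding whitespace can be stripped, and that any text before the first dated line can be dropped.

```python
import re

# same timestamp shape as date_patt, matched at the start of the line
DATE_RE = re.compile(r'\d{4}-\d{2}-\d{2}\s+\d{2}:\d{2}:\d{2}')


def merge_continuations(lines):
    """Join lines without a leading timestamp onto the previous dated line."""
    merged = []
    for line in lines:
        text = line.strip()
        if not text:
            continue  # ignore blank lines
        if DATE_RE.match(text):
            merged.append(text)  # a dated line starts a new entry
        elif merged:
            merged[-1] += ' ' + text  # continuation of the last dated line
        # text seen before the first dated line is dropped
    return merged


sample = """2019-04-15 21:58:09 bla bla bla
test1
test2
test3
2019-04-15 21:59:05 bla bla bla
test
now
go
"""

for entry in merge_continuations(sample.split('\n')):
    print(entry)
# prints:
# 2019-04-15 21:58:09 bla bla bla test1 test2 test3
# 2019-04-15 21:59:05 bla bla bla test now go
```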

R. Arctor
  • SUPER! perfect! – Trey K Sep 14 '19 at 17:17
  • No problem, let me know if there's a part you don't understand, or need help improving. – R. Arctor Sep 14 '19 at 17:21
  • There is one thing: is it possible to check every subdirectory of the work dir "./" for *.log files and make one master file from them? Right now it only checks the work dir and no further – Trey K Sep 14 '19 at 17:55
  • That sort of thing has been answered on here plenty of times. See if you can work it out yourself, if you can't post what you've tried and whatnot. Helpful link: https://stackoverflow.com/questions/3964681/find-all-files-in-a-directory-with-extension-txt-in-python – R. Arctor Sep 16 '19 at 00:31
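
For reference, one way the subdirectory search could look, using os.walk from the standard library. This is a sketch under assumptions rather than part of the answer above: find_log_files and skip_name are hypothetical names, and master.log is skipped so the output file is not picked up as input.

```python
import os


def find_log_files(root_dir, skip_name='master.log'):
    """Yield paths of *.log files in root_dir and every directory below it."""
    for dir_path, dir_names, file_names in os.walk(root_dir):
        dir_names.sort()  # sorting in place makes the traversal order stable
        for name in sorted(file_names):
            if name.endswith('.log') and name != skip_name:
                yield os.path.join(dir_path, name)
```

The logs variable could then be built with logs = list(find_log_files(work_dir)).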