I have a gzipped (.gz) log file, logfile.20221227.gz, and I am writing a Python script to process it. A test run on a file with 100 lines worked fine, but when I run the same script on the actual log file, which is almost 5 GB, it breaks. Note that I was able to process log files of up to 2 GB; the only files larger than that are 5 GB+ and 7 GB+, and the script fails on both. My code is below.
import gzip

count = 0
toomany = 0
max_hits = 5000
total_fatal = total_error = total_warn = 0
log_lines = []
logfile = '/foo/bar/logfile.20221228.gz'

with gzip.open(logfile, 'rt', encoding='utf-8') as page:
    for line in page:
        count += 1
        print("\nFor loop count is: ", count)
        string = line.split(' ', 5)
        if len(string) < 5:
            continue
        level = string[3]
        shortline = line[0:499]
        if level == 'FATAL':
            log_lines.append(shortline)
            total_fatal += 1
        elif level == 'ERROR':
            log_lines.append(shortline)
            total_error += 1
        elif level == 'WARN':
            log_lines.append(shortline)
            total_warn += 1
        if not toomany and (total_fatal + total_error + total_warn) > max_hits:
            toomany = 1

if len(log_lines) > 0:
    send_report(total_fatal, total_error, total_warn, toomany, log_lines, max_hits)
Output:
For loop count is: 1
.
.
For loop count is: 192227123
Killed
What does Killed mean here? It does not offer much to investigate with just this one keyword. Also, is there a limit on the file size, and is there a way to get around it?
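One thing I am considering, to get more to investigate than the bare Killed, is printing the process's memory use periodically inside the loop to see whether it climbs as log_lines grows. A minimal sketch, assuming a Linux box where the standard-library resource module is available (the rss_mb helper name is mine):

```python
import resource

def rss_mb():
    """Peak resident set size of this process in MiB (Unix only).

    Note: ru_maxrss is reported in kilobytes on Linux but in bytes on
    macOS, so the division below assumes Linux.
    """
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss / 1024

# Hypothetical usage inside the for loop: report memory every million
# lines, to correlate line count with memory growth before the kill.
if __name__ == "__main__":
    print(f"current peak RSS: {rss_mb():.1f} MiB")
```

I would call rss_mb() every millionth iteration (e.g. when count % 1_000_000 == 0) rather than every line, to keep the output readable.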
Thank you.