0

I have a big log file, and I want to read only the relevant part of it.

Every section starts with ###start log###, so I need to find the last occurrence of ###start log### and read the lines from there until the end of the file.

I have seen a solution that finds a line by its seek position (a number), but I don't know that position; I only know the content of the line.

What is the best solution for this case?

Or Smith
  • I'd say reading line by line, keeping track of the last "###start log###" read and when you encounter the EOF you use the index of the last encountered line. – tfrascaroli Sep 23 '14 at 14:40
  • If you happen to be able to use bash (say, on Linux), you could do this with this (tad awful) one-liner: `tail -n+$(grep -n '###start log###' logfile | tail -n1 | awk -F':' ' { print $1 }') logfile`. –  Sep 23 '14 at 14:54
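
The line-by-line idea from the first comment could look roughly like the sketch below. It is only an illustration, not code from any commenter: the function name `read_last_section` is hypothetical, the file is opened in binary mode so byte offsets can be counted directly, and the text is assumed to be UTF-8.

```python
def read_last_section(path, marker=b"###start log###"):
    """Return the lines after the last marker line, or None if there is no marker."""
    last_start = None  # byte offset just past the most recent marker line
    offset = 0         # running byte offset of the end of the current line
    with open(path, "rb") as handle:
        for line in handle:
            offset += len(line)
            if line.rstrip(b"\r\n") == marker:
                last_start = offset
        if last_start is None:
            return None
        # Jump back to the start of the last section and read to EOF.
        handle.seek(last_start)
        return handle.read().decode("utf-8").splitlines()  # UTF-8 is an assumption
```

This scans the whole file once but keeps only one line in memory at a time until the final read, so the cost is time rather than memory.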

3 Answers

1

I'd suggest reading the file backwards until the first occurrence of the start tag. You can do it in one of two ways. If the file fits into memory, try the approach from "Read a file in reverse order using python".

If the file is too large for that, you may find this link helpful: http://code.activestate.com/recipes/120686-read-a-text-file-backwards/
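
For the too-large case, one possible sketch is to read fixed-size chunks from the end of the file and stop at the first chunk that contains the marker. This is not the linked recipe, just an illustration of the same idea. The names `find_last_marker` and `tail_after_last_marker` and the `chunk_size` parameter are hypothetical, the marker is matched as a byte substring rather than as a whole line, and UTF-8 text is assumed.

```python
import os

def find_last_marker(path, marker=b"###start log###", chunk_size=64 * 1024):
    """Return the byte offset of the last occurrence of marker, or -1."""
    with open(path, "rb") as handle:
        handle.seek(0, os.SEEK_END)
        pos = handle.tell()
        tail = b""  # leading bytes of the data already scanned (later in the file)
        while pos > 0:
            step = min(chunk_size, pos)
            pos -= step
            handle.seek(pos)
            # Append the overlap so a marker straddling a chunk boundary is found.
            block = handle.read(step) + tail
            hit = block.rfind(marker)
            if hit != -1:
                return pos + hit
            tail = block[:len(marker) - 1]
    return -1

def tail_after_last_marker(path, marker=b"###start log###", chunk_size=64 * 1024):
    """Return the lines after the last marker line, or None if no marker exists."""
    start = find_last_marker(path, marker, chunk_size)
    if start == -1:
        return None
    with open(path, "rb") as handle:
        handle.seek(start + len(marker))
        handle.readline()  # skip the remainder of the marker line
        return handle.read().decode("utf-8").splitlines()  # UTF-8 is an assumption
```

Only the part of the file from the last marker onward is ever read, which is the point of searching from the end when the relevant section is small compared to the whole log.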

Community
Max
1

Given the size of the file, you basically need to read it in reverse order. There are some posts on how to read a file in reverse order in Python. If you are on a Unix system, you can also use the `tac` command: read its output through a pipe and stop when you hit the start of the log:

>>> from subprocess import PIPE, Popen
>>> from itertools import takewhile
>>> with Popen(['tac', 'tmp.txt'], stdout=PIPE) as proc:
...     # tac emits the file last line first; stop at the first marker seen
...     tail_lines = takewhile(lambda line: line != b'###start log###\n', proc.stdout)
...     lines = list(tail_lines)

Then the last log lines in correct order would be:

>>> list(reversed(lines))
behzad.nouri
0
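# Read the whole file into memory (only suitable if it fits).
with open(filename) as handle:
    text = handle.read()
lines = text.splitlines()
lines.reverse()
# In the reversed list, the first marker found is the last one in the file.
i = next(i for i, line in enumerate(lines) if line == '###start log###')
relevant_lines = lines[:i]
# Restore the original order of the lines after the marker.
relevant_lines.reverse()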
user2085282
  • You can just do `lines = handle.readlines()` in the `with` block. – Cody Piersall Sep 23 '14 at 14:53
  • True, but it was more for explanatory purposes – user2085282 Sep 23 '14 at 14:54
  • Right... but it's a better explanation the other way, you know what I mean? The way you've written it, you're reading the entire file in, then iterating over all the contents of the file. That's twice as slow for no benefit, methinks. And you might encourage bad habits in the OP! – Cody Piersall Sep 23 '14 at 15:03