
The short(ish) version of this question is: when you open a file in a text editor and search for a term, you can, after locating the term, move around in the file and see as much context as you need. As a direct example, if you have a log file you could open it with less mylog.log and search with /SALLY. This takes you to the first occurrence of 'SALLY' in the log file. Then, using the normal navigation keys (up and down arrows, Page Up/Down, etc.), you can see what happened before and after 'SALLY' appeared. I would like to leverage a tool that gives this same behavior, but none of the tools I've looked into seem quite right. It currently looks as though the only option is to write my own methods for doing this, but surely that's not right.

Long version of this question: I have a bunch of log files scattered all over the place. Part of my normal workflow involves searching for values in these log files and getting information from the context around those values. (It is worth noting that I cannot assume the context is within a specific set of lines, nor do I know what the important context is until I see it.) Manually going everywhere to get these log files is gross. I want to tell my code 'look for SALLY', and the code should give me a list of places (from a list of known places where log files reside) where 'SALLY' appears. I then select the log file I want and it opens at the first occurrence of 'SALLY', with the ability to navigate in the file from that point.

I know how to do most of this and, in fact, I can and have implemented everything but the last bit. Using basic IO operations I can:

  • Find and access all the potential log files
  • Find log files with 'SALLY' in them
  • Give the user a list with all the log files with 'SALLY' in them
  • Given a selected logfile display the line(s) that contain 'SALLY'
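
The search steps in the list above can be sketched roughly as follows. This is only an illustration, not the code the question refers to: find_matches and log_dirs are hypothetical names, and it assumes the logs are plain text files small enough to scan line by line.

```python
import os

def find_matches(term, log_dirs):
    """Return {path: [line_index, ...]} for every file under log_dirs that contains term.

    Hypothetical helper: log_dirs is assumed to be a list of known log directories.
    """
    matches = {}
    for log_dir in log_dirs:
        for root, _dirs, files in os.walk(log_dir):
            for name in files:
                path = os.path.join(root, name)
                try:
                    # errors="replace" keeps odd bytes from aborting the scan
                    with open(path, errors="replace") as fp:
                        hits = [i for i, line in enumerate(fp) if term in line]
                except OSError:
                    continue  # unreadable file: skip it
                if hits:
                    matches[path] = hits
    return matches
```

The returned line indices are zero-based, so the first hit for a file can be handed directly to whatever viewer displays the context.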

What I can't do is figure out how to give the user the ability to smoothly navigate the log file, moving up and down so they can see context. I could, and have, placed a call to 'less' (assuming it's on a *nix system) and used its search behavior, but that's really not the behavior I'd like. I'd like to do this all using Python.

I've looked at Elastic Search (which seems to be way beyond what I want) and several log parsing libraries (parsing the logs is pretty straightforward), and I've tried to find others' solutions to a similar problem. I've been unable to find anyone with a similar problem, let alone a solution, which, given the Python community, seems unlikely.
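
For reference, the 'less' workaround mentioned above might look something like the sketch below. It relies on the +/pattern startup option of less, which opens the file at the first match. The function names are made up for illustration, and the search term is interpreted by less as a regular expression.

```python
import subprocess

def less_command(path, term):
    # "+/term" asks less to run the search /term as soon as the file is open
    return ["less", "+/" + term, path]

def view_in_less(path, term):
    # Hands the terminal over to less; returns its exit status when the user quits
    return subprocess.call(less_command(path, term))
```

Calling view_in_less("mylog.log", "SALLY") would behave like typing less mylog.log followed by /SALLY.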

I'm currently considering implementing some sort of custom file viewer. This seems silly. What can I leverage to implement this sort of functionality?

Nahkki
  • Are the logfiles so large that it would be unreasonable to read the file the user requests into memory? If not, why not just have a `list(enumerate(file.readlines()))`, keep a variable for the current line displayed, and display line+-1 if the user presses down/up? – timgeb Jul 01 '14 at 16:29
  • Do you use a commandline interface or a GUI? If you use a GUI, which framework? – wastl Jul 01 '14 at 16:29
  • @timgeb - this is actually what I'm doing right now but it feels clunky. I'd like to not have to manually deal with the ends of the file, reloading the data each time seems unpythonic, and some of the files are quite large (which is not a big deal if I run through them once or, as I'm doing now, only load the last # lines, which is less than ideal). – Nahkki Jul 01 '14 at 16:59
  • @wastl - I'm currently using a commandline interface. I'm not opposed to migrating to something else if it has the functionality I need. – Nahkki Jul 01 '14 at 17:00
  • @Nahkki: How many lines up and down do you generally need, or does this number vary a lot? – wastl Jul 01 '14 at 17:05
  • @wastl - it can vary a huge amount. Not only from log file to log file but also within different searches. The logs contain all sorts of data including some pretty hairy JSON/XML data from restful api calls. – Nahkki Jul 01 '14 at 17:23
  • @Nahkki: Would it be acceptable to load the whole file and keep it in memory, or are the files too big? – wastl Jul 01 '14 at 17:41
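
If the files turn out to be too large to hold in memory, one possible approach (an assumption, not something proposed in the thread) is to scan each file once, remember the byte offset of every line, and then seek to whatever window the user asks for. LineIndex and window are hypothetical names.

```python
class LineIndex:
    """Remembers where each line starts so any window can be read with one seek."""

    def __init__(self, path):
        self.path = path
        self.offsets = []
        pos = 0
        with open(path, "rb") as fp:
            for line in fp:
                self.offsets.append(pos)  # byte offset of the start of this line
                pos += len(line)

    def __len__(self):
        return len(self.offsets)

    def window(self, center, before, after):
        """Return lines center-before .. center+after (zero-based), clipped to the file."""
        start = max(0, center - before)
        stop = min(len(self.offsets), center + after + 1)
        if start >= stop:
            return []
        with open(self.path, "rb") as fp:
            fp.seek(self.offsets[start])
            return [
                fp.readline().decode("utf-8", errors="replace").rstrip("\r\n")
                for _ in range(stop - start)
            ]
```

Only the offsets (one integer per line) stay in memory, and clipping to the ends of the file happens in one place instead of in the navigation code.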

1 Answer


So, after playing around a bit I found something that worked pretty well for me; I hope it will work for you too. The basic idea is that we have some kind of iterator (not a real one, but because I lack imagination I called it an iterator) that keeps track of the range of lines you are looking at and returns that section.

It is just a quick and dirty solution, but I hope it does the job.

from subprocess import call

def main():
    # Read the whole file into memory; the with block closes it for us
    with open('path/to/your/file') as fp:
        f = fp.readlines()
    myIter = MyIterator(f, 12)
    #                      ^ replace with the actual index of the line you want to look at
    for line in myIter.current():
        print(line)
    cmd = input()

    # Input is not optimal, but this is beyond the scope of your question

    while cmd != "quit":
        call(["clear"])
        if cmd == "u":
            myIter.previous()
        elif cmd == "d":
            myIter.next()
        for line in myIter.current():
            print(line)
        cmd = input()

class MyIterator():
    def __init__(self, f, index):
        # Strip the newlines; otherwise you would have a blank line between every line
        self.f = [line.replace('\n', '') for line in f]
        # current() returns f[lower_index:upper_index], so start with just the target line
        self.lower_index = index
        self.upper_index = index + 1

    def hasNext(self):
        # True while the section can still grow toward the end of the file
        return self.upper_index < len(self.f)

    def hasPrevious(self):
        # True while the section can still grow toward the start of the file
        return self.lower_index > 0

    def next(self):
        # Extend the section down by one line, stopping at the end of the file
        if self.hasNext():
            self.upper_index += 1
        return self.current()

    def previous(self):
        # Extend the section up by one line, stopping at the start of the file
        if self.hasPrevious():
            self.lower_index -= 1
        return self.current()

    def current(self):
        return self.f[self.lower_index:self.upper_index]

if __name__ == "__main__":
    main()

Note that 'u' extends the displayed section up by one line and 'd' extends it down by one line. The problem is that you also have to press Enter afterwards. An implementation of getch() in Python would remove that extra keypress.
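
As a rough idea of what such a getch() could look like, here is a minimal Unix-only sketch based on the standard termios and tty modules. It assumes stdin is a real terminal and will not work on Windows or when input is piped.

```python
import sys
import termios
import tty

def getch():
    """Read a single keypress from the terminal without waiting for Enter (Unix only)."""
    fd = sys.stdin.fileno()
    old_settings = termios.tcgetattr(fd)  # remember the current terminal mode
    try:
        tty.setraw(fd)                    # raw mode: keys are delivered immediately
        return sys.stdin.read(1)
    finally:
        # Always restore the terminal, even if the read is interrupted
        termios.tcsetattr(fd, termios.TCSADRAIN, old_settings)
```

In the loop above, the calls that read a whole line could then be swapped for getch(), with a single key such as 'q' standing in for typing "quit".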

wastl