3

I use Python to open a large log file like this:

Thu Oct  4 23:14:40 2012 [pid 16901] CONNECT: Client "66.249.74.228"
Thu Oct  4 23:14:40 2012 [pid 16900] [ftp] OK LOGIN: Client "66.249.74.228", anon     password "googlebot@google.com"
Thu Oct  4 23:17:42 2012 [pid 16902] [ftp] FAIL DOWNLOAD: Client "66.249.74.228",   "/pub/10.5524/100001_101000/100039/Assembly-2011/Pa9a_assembly_config4.scafSeq.gz",  14811136 bytes, 79.99Kbyte/sec
Fri Oct  5 00:04:13 2012 [pid 25809] CONNECT: Client "66.249.74.228"
Fri Oct  5 00:04:14 2012 [pid 25808] [ftp] OK LOGIN: Client "66.249.74.228", anon password "googlebot@google.com"
Fri Oct  5 00:07:16 2012 [pid 25810] [ftp] FAIL DOWNLOAD: Client "66.249.74.228", "/pub/10.5524/100001_101000/100027/Raw_data/PHOlcpDABDWABPE/090715_I80_FC427DJAAXX_L8_PHOlcpDABDWABPE_1.fq.gz", 14811136 bytes, 79.99Kbyte/sec
Fri Oct  5 00:13:19 2012 [pid 27354] CONNECT: Client "1.202.186.53"
Fri Oct  5 00:13:19 2012 [pid 27353] [ftp] OK LOGIN: Client "1.202.186.53", anon password "mozilla@example.com"

I want to read lines from the end of the file, like the tail command does, to get the records from the most recent 7 days.

Here is my code. How can I change it?

import time
f = open("/opt/CLiMB/Storage1/log/vsftp.log")

def OnlyRecent(line):
    if time.strptime(line.split("[")[0].strip(), "%a %b %d %H:%M:%S %Y") > time.gmtime(time.time() - (60*60*24*7)):
        return True
    return False

filename = time.strftime('%Y%m%d') + '.log'
f1 = open(filename, 'w')
for line in f:
    if OnlyRecent(line):
        print(line)
        f1.write(line)
f.close()
f1.close()
AntiGMO

3 Answers

3

Use file.seek() to jump to an offset relative to the end of the file. For example, to print roughly the last 1 KB of a file without reading everything before it, do this:

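import os

# Binary mode: Python 3 only allows end-relative seeks on binary files
with open("/opt/CLiMB/Storage1/log/vsftp.log", "rb") as f:
    f.seek(-1000, os.SEEK_END)  # fails if the file is shorter than 1000 bytes
    print(f.read().decode("utf-8", "replace"))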
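To connect this to the 7-day question: one possible approach is to seek back a fixed number of bytes, throw away the first (probably cut-off) line, and filter the rest by timestamp. This is only a sketch for Python 3. The names recent_lines, chunk_size and now are made up for illustration, and it treats the log timestamps as UTC, which is what the gmtime comparison in the question implies.

```python
import calendar
import os
import time

LOG_FORMAT = "%a %b %d %H:%M:%S %Y"  # timestamp layout of the log in the question


def recent_lines(path, days=7, chunk_size=1024 * 1024, now=None):
    """Return lines from the last chunk_size bytes that are newer than `days` days."""
    now = time.time() if now is None else now
    cutoff = now - days * 24 * 60 * 60
    with open(path, "rb") as f:
        f.seek(0, os.SEEK_END)
        start = max(0, f.tell() - chunk_size)  # never seek before the start
        f.seek(start)
        lines = f.read().decode("utf-8", "replace").splitlines()
    if start > 0:
        lines = lines[1:]  # the first line may be cut in half, so drop it
    result = []
    for line in lines:
        stamp = line.split("[")[0].strip()
        try:
            # Assumption: timestamps are UTC, so timegm is the matching conversion
            when = calendar.timegm(time.strptime(stamp, LOG_FORMAT))
        except ValueError:
            continue  # skip lines that do not start with a timestamp
        if when >= cutoff:
            result.append(line)
    return result
```

If the oldest line returned is still inside the 7-day window, the chunk was probably too small, so you would call it again with a larger chunk_size.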
Raymond Hettinger
0

I didn't test this; I just reformatted your code:

  1. less verbose imports from the time module
  2. dropwhile instead of for..if (this assumes the log is in chronological order)
  3. with contexts to open/close the files
  4. PEP8
  5. misc. cleanups


from time import time, gmtime, strptime, strftime
from itertools import dropwhile

deadline = gmtime(time() - (60*60*24*7))
formatting = "%a %b %d %H:%M:%S %Y"

def not_recent(line):
    return strptime(line.split("[")[0].strip(), formatting) <= deadline

with open("/opt/CLiMB/Storage1/log/vsftp.log") as f:
    filename = strftime('%Y%m%d') + '.log'
    with open(filename, 'w') as f1:
        # Skip lines until the first recent one, then copy everything after it
        for line in dropwhile(not_recent, f):
            print(line)
            f1.write(line)
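One thing worth knowing about swapping for..if for dropwhile: dropwhile stops testing the predicate after the first item that fails it, so it only matches the original filter when the input is sorted by time. A small sketch with made-up (day, message) records, not real log lines, shows the difference:

```python
from itertools import dropwhile

# Hypothetical, simplified records: (day, message). One entry is out of order.
records = [(1, "old"), (2, "old"), (8, "new"), (3, "out of order"), (9, "new")]


def not_recent(record, cutoff=5):
    """True for records at or before the cutoff day."""
    return record[0] <= cutoff


# dropwhile skips the leading old records, then yields everything that follows
with_dropwhile = list(dropwhile(not_recent, records))

# for..if style filtering tests every record individually
with_filter = [r for r in records if not not_recent(r)]

print(with_dropwhile)  # [(8, 'new'), (3, 'out of order'), (9, 'new')]
print(with_filter)     # [(8, 'new'), (9, 'new')]
```

For an append-only log like vsftp.log the two should agree, and dropwhile avoids parsing the timestamp of every line after the cutoff.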
Alexey Kachayev
0

Another implementation, considering that you are dealing with huge log files:

import os

def tail(fname, n):
    fin = os.open(fname, os.O_RDONLY)  # Get an open file descriptor
    size = os.fstat(fin).st_size  # Get the size from the stat
    fin = os.fdopen(fin, 'rb')  # Convert fd to a binary file object
    count = 0
    fin.seek(size)  # Seek to the end of the file
    while count < n:  # Loop until n newlines have been found
        pos = fin.tell() - 2  # Step backward
        if pos < 0:  # Reached the beginning of the file
            fin.seek(0)  # Rewind so the first line is returned whole
            break
        fin.seek(pos)
        if fin.read(1) == b'\n':  # Check whether this byte is a newline
            count += 1  # Maintaining the count
    return fin

Usage

for e in tail("Test.log", 10):
    print(e.decode().rstrip('\n'))
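Stepping back one byte at a time means one seek and one read per character, which can get slow when the lines are long. A variation is to read fixed-size blocks from the end until enough newlines have been collected. The sketch below assumes Python 3; tail_lines and block_size are names I made up for illustration.

```python
import os


def tail_lines(path, n, block_size=4096):
    """Return the last n lines of a file as a list of strings."""
    if n <= 0:
        return []
    with open(path, "rb") as f:
        f.seek(0, os.SEEK_END)
        pos = f.tell()
        data = b""
        # Keep prepending blocks until more than n newlines are buffered
        # (so the first kept line is complete) or the start of the file is hit.
        while pos > 0 and data.count(b"\n") <= n:
            step = min(block_size, pos)
            pos -= step
            f.seek(pos)
            data = f.read(step) + data
    return [line.decode("utf-8", "replace") for line in data.splitlines()[-n:]]
```

Unlike tail() above, this returns a list and closes the file itself, so the caller has nothing to clean up.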
Abhijit