1

I'm trying to read a server log file, file.out, but I only need the latest data within a datetime range.

Is it possible to read a file in reverse using with open() and one of its modes?

The a+ mode gives access to the end of the file:

    ``a+''  Open for reading and writing.  The file is created if it does not
      exist. The stream is positioned at the end of the file. Subsequent writes
      to the file will always end up at the then current end of the file, 
      irrespective of any intervening fseek(3) or similar.

Is there a way to use a+ or another mode to access the end of the file and read a specific range?

The regular r mode reads the file from the beginning:

    with open('file.out','r') as file:

I have tried using reversed():

    for line in reversed(list(open('file.out').readlines())):

but it returns no rows for me.

Or are there other ways to read a file in reverse? Any help is appreciated.

EDIT

What I got so far:

import os
import time
from datetime import datetime as dt

start_0 = dt.strptime('2019-01-27','%Y-%m-%d')
stop_0 = dt.strptime('2019-01-27','%Y-%m-%d')
start_1 = dt.strptime('09:34:11.057','%H:%M:%S.%f')
stop_1 = dt.strptime('09:59:43.534','%H:%M:%S.%f')

os.system("touch temp_file.txt")
process_start = time.perf_counter()  # time.clock() was removed in Python 3.8
count = 0
print("reading data...")
# Note: list(open(...)) loads the entire file into memory before reversing it.
for line in reversed(list(open('file.out'))):
    try:
        th = dt.strptime(line.split()[0],'%Y-%m-%d')
        tm = dt.strptime(line.split()[1],'%H:%M:%S.%f')

        if (th == start_0) and (th <= stop_0):
            if (tm > start_1) and (tm < stop_1):
                count += 1
                print("%d occurrences" % (count))
                os.system("echo '"+line.rstrip()+"' >> temp_file.txt")
        # Reading backwards: once a line is older than the start time, stop.
        if (th == start_0) and (tm < start_1):
            break
    except KeyboardInterrupt:
        print("\nLast line before interrupt:%s" % (str(line)))
        break
    except IndexError:
        # Line has fewer than two whitespace-separated fields; skip it.
        continue
    except ValueError:
        # Line does not start with a parsable date and time; skip it.
        continue
process_finish = time.perf_counter()
print("Done:" + str(process_finish - process_start) + " seconds.")

I'm adding these limits so that once the matching rows are found, the script at least prints the occurrences and then stops reading the file.

The problem is that it reads the file, but it's far too slow.

EDIT 2

(2019-04-29 9.34am)

All the answers I received work well for reading logs in reverse, but in my case (and maybe other people's), where the log is many GB in size, Rocky's answer below suited me best.

The code that works for me:

(I only added a for loop to Rocky's code):

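import collections

number_of_rows = 1000  # how many lines to keep from the end of the file

log_lines = collections.deque()
with open("file.out", "r") as log_file:
    for line in log_file:
        # Newest line goes on the left; drop the oldest once over the limit.
        log_lines.appendleft(line)
        if len(log_lines) > number_of_rows:
            log_lines.pop()

# log_lines now holds the last number_of_rows lines, newest first.
log_lines = list(log_lines)
for line in log_lines:
    print(line.rstrip("\n"))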

Thanks, everyone. All the answers work.

-lpkej

lpkej
  • 2
    See https://stackoverflow.com/questions/11696472/seek-function – adrtam Apr 26 '19 at 13:55
  • 1
    Your last `reversed()` method should work although it's very memory inefficient (if your file is large) – Rocky Li Apr 26 '19 at 14:05
  • Duplicate of https://stackoverflow.com/questions/12523044/how-can-i-tail-a-log-file-in-python – brunns Apr 26 '19 at 14:08
  • If your attempt to use `reversed` returned no data, that's because the file was empty when you called `readlines` (which isn't necessary; `list` will iterate over the file object itself). – chepner Apr 26 '19 at 14:16
  • @RockyLi yes it is very memory inefficient – lpkej Apr 26 '19 at 14:21
  • @chepner the file I want to read is 500GB+ so it's not empty – lpkej Apr 26 '19 at 14:21
  • Your `reversed(list(open(FILE)))` will not work since you said you had 500 GB of file. it would work if you had 500GB of memory .. check my answer. – Rocky Li Apr 26 '19 at 15:21

4 Answers

1

There's no way to do it with open() parameters, but if you want to read the last part of a large file without loading the whole file into memory (which is what reversed(list(fp)) does), you can use a two-pass solution:

FILEPATH = "file.out"  # path from the question; adjust as needed
LINES_FROM_END = 1000
with open(FILEPATH, "r") as fin:
    # First pass: count the lines without storing them.
    s = 0
    while fin.readline(): # fixed typo, readlines() will read everything...
        s += 1
    # Second pass: rewind and keep only the last LINES_FROM_END lines.
    fin.seek(0)
    mylines = []
    for i, e in enumerate(fin):
        if i >= s - LINES_FROM_END:
            mylines.append(e)

This won't keep the whole file in memory. You can also reduce it to one pass by using collections.deque:

# one pass (a lot faster):
import collections

mylines = collections.deque()
with open(FILEPATH, "r") as fin:
    for line in fin:
        # Newest line goes on the left; drop the oldest once over the limit.
        mylines.appendleft(line)
        if len(mylines) > LINES_FROM_END:
            mylines.pop()

mylines = list(mylines)
# mylines now holds the last LINES_FROM_END lines of the file, newest first.
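
As a possible variant of the one-pass approach, `collections.deque` accepts a `maxlen` argument and discards the oldest item by itself once it is full, so the manual length check can be dropped. The sketch below is only an illustration: the function name `tail` is hypothetical, and unlike the code above it returns the lines in file order (oldest first).

```python
import collections


def tail(path, n):
    """Return the last n lines of the file at path, in file order."""
    with open(path, "r") as f:
        # A bounded deque drops its oldest entry automatically when full,
        # so only n lines are held in memory at any time.
        return list(collections.deque(f, maxlen=n))
```

Calling `reversed(tail("file.out", 1000))` would give the same newest-first order as the loop above. The whole file is still read once from the start, so this saves memory, not I/O.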
Rocky Li
  • ```collections.deque()``` worked for me, prints the lines smooth. Thanks Rocky you rock! – lpkej Apr 29 '19 at 06:31
0

Sure there is:

filename = 'data.txt'
for line in reversed(list(open(filename))):
    print(line.rstrip())

EDIT: As mentioned in comments this will read the whole file into memory. This solution should not be used with large files.
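
For large files, one alternative that avoids loading everything is to open the file in binary mode, seek to the end, and read fixed-size blocks backwards, splitting each block on newlines. The following is a rough sketch, not part of the original answer: the function name `reverse_readline`, the `block_size` default, and the UTF-8 decoding are all assumptions.

```python
import os


def reverse_readline(path, block_size=8192):
    """Yield the lines of a file from last to first, one block at a time."""
    with open(path, "rb") as f:
        f.seek(0, os.SEEK_END)
        size = position = f.tell()
        carry = b""  # start of a line whose beginning lies in an earlier block
        first_block = True
        while position > 0:
            read_size = min(block_size, position)
            position -= read_size
            f.seek(position)
            block = f.read(read_size) + carry
            if first_block:
                # Ignore one trailing newline so no spurious empty line appears.
                if block.endswith(b"\n"):
                    block = block[:-1]
                first_block = False
            lines = block.split(b"\n")
            # lines[0] may be incomplete; keep it until the next block is read.
            carry = lines[0]
            for line in reversed(lines[1:]):
                yield line.decode("utf-8", errors="replace")
        if size > 0:
            # Whatever is left is the first line of the file.
            yield carry.decode("utf-8", errors="replace")
```

Because this is a generator, a loop over it can `break` as soon as a line older than the wanted time range shows up, so only the tail of the file is ever read. Files with `\r\n` line endings would need an extra `rstrip("\r")` on each line.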

mlotz
  • 2
    That reads the entire file into memory, then *iterates* over the lines in reverse (and isn't substantially different to what the OP has already tried). – chepner Apr 26 '19 at 14:13
  • I've just fixed the issue in response to 'but it returns no rows for me.'. While it is not efficient it might be what the OP is after if he is not working with large files. – mlotz Apr 26 '19 at 14:21
  • @mlotz This is a one way to reverse read, but it's not good for my case. Thanks for answer. – lpkej Apr 26 '19 at 14:34
0

Another option is to map the file with mmap.mmap, use rfind from the end to search for the newlines, and then slice out the lines.
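
A minimal sketch of that idea, under some assumptions: the function name `reversed_lines` is made up for illustration, lines are assumed to end in `\n`, and the bytes are decoded as UTF-8. The operating system pages the file in on demand, so the whole file is not loaded up front.

```python
import mmap
import os


def reversed_lines(path):
    """Yield the lines of a file from last to first using mmap and rfind."""
    # mmap cannot map an empty file, so handle that case first.
    if os.path.getsize(path) == 0:
        return
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            end = len(mm)
            # Ignore one trailing newline so no spurious empty line appears.
            if mm[end - 1:end] == b"\n":
                end -= 1
            while end >= 0:
                # rfind returns -1 when no newline is left, so start becomes 0.
                start = mm.rfind(b"\n", 0, end) + 1
                yield mm[start:end].decode("utf-8", errors="replace")
                # Step over the newline that precedes this line.
                end = start - 1
```

A caller can iterate with `for line in reversed_lines("file.out"):` and `break` once a timestamp falls before the start of the wanted range, which touches only the end of the file.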

Dan D.
-1

Hey, I wrote this code and it works for me: I can read my file in reverse order. Hope it helps :) I start by creating a new text file, so I don't know how relevant that part is for you.

def main():
    # Create a sample file with ten numbered lines.
    with open("Textfile.txt", "w") as f:
        for i in range(10):
            f.write("line number %d\n" % (i + 1))

def readReversed():
    # Loads the whole file into memory, then iterates over the lines in reverse.
    for line in reversed(list(open("Textfile.txt"))):
        print(line.rstrip())

main()
readReversed()
  • This reverse read code also works, I have tried this. But not really good for me since I have huge log to parse. For other file reading purposes I'm sure this code fits well. Thanks for the answer. – lpkej Apr 29 '19 at 06:49