10

I want to open a file and read each line using f.seek() and f.tell():

test.txt:

abc
def
ghi
jkl

My code is:

f = open('test.txt', 'r')
last_pos = f.tell()  # get to know the current position in the file
last_pos = last_pos + 1
f.seek(last_pos)  # to change the current position in a file
text= f.readlines(last_pos)
print text

It reads the whole file.

smci
John
  • Yes, that's what `readlines` does. What's your question exactly? – Mark Ransom Mar 24 '13 at 03:29
  • I need to read line by line, save the last_pos somewhere, close the file, go and open the file, seek the last_pos, read the line, update the last_pos, close the file... – John Mar 24 '13 at 03:34
  • @John, if you're passing data between subprocesses, look at StringIO etc. Or consider using a database e.g. MySQL – smci Nov 29 '16 at 13:45

4 Answers

20

ok, you may use this:

f = open( ... )
f.seek(last_pos)     # jump to the offset saved on the previous run
line = f.readline()  # no 's' at the end of `readline()`
last_pos = f.tell()  # offset of the start of the next line
f.close()

Just remember that last_pos is not a line number in your file; it is a byte offset from the beginning of the file, so there is no point in incrementing or decrementing it.
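As a minimal sketch of the open, seek, read, tell, close cycle described above (assuming Python 3 and a UTF-8 text file; `read_next_line` is a hypothetical helper name, not part of the original answer), the file can be opened in binary mode so that `tell()` returns a plain byte offset:

```python
def read_next_line(path, last_pos=0):
    """Read one line starting at byte offset last_pos.

    Returns (line, new_pos); line is an empty string at end of file.
    Hypothetical helper, for illustration only.
    """
    # Binary mode keeps the values used by seek()/tell() as plain byte offsets.
    with open(path, 'rb') as f:
        f.seek(last_pos)
        raw = f.readline()
        new_pos = f.tell()
    # Assumption: the file is UTF-8 encoded text.
    return raw.decode('utf-8'), new_pos
```

Each call reopens and closes the file, so the returned offset can be stored anywhere between calls (a variable, a small state file) and passed back in as `last_pos` the next time.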

lenik
  • lenik: I could not understand the file reading process in your answer in http://stackoverflow.com/questions/15527617/read-each-line-of-a-text-file-using-cron-schedule. So I opened a new question here :) – John Mar 24 '13 at 03:44
  • ok, here's how it goes. you have a variable `last_pos`, which contains the current byte offset from the beginning of the file. you open the file, `seek()` to that offset, then read a line using `readline()`. the file pointer automatically advances to the beginning of the next line. then you use `tell()` to get the new offset and save it to `last_pos` to be used on the next iteration. please point out which part of this process is not clear, and I'll try to explain in more detail. – lenik Mar 24 '13 at 03:48
  • you're welcome! =) sorry I did not explain it well the first time – lenik Mar 24 '13 at 03:49
  • @lenik the 's' in `readlines` is not a typo, it's another implemented method ([doc](http://www.tutorialspoint.com/python/file_readlines.htm)) – SAAD Oct 15 '15 at 12:47
  • A great blog post on this method: http://www.blopig.com/blog/2016/08/processing-large-files-using-python/ – duhaime Jun 05 '18 at 00:23
  • This was really helpful. Thanks! – tpoker Jul 18 '18 at 17:44
  • Let me add this would not work if using codecs.open (would have saved me some time to know that!) – rgalhama Nov 01 '18 at 09:32
  • Thanks for mentioning file.tell(). I always appreciate direct answers as opposed to "why are you doing that? you should do it this way". When OPs are asking a question, they have to balance giving enough detail against giving too much detail, which would unnecessarily complicate the question. In my case, I want to call a function to grab the next "n" lines of a file for a batch process on huge files. I knew I could use the offset, but didn't know how to get the current offset, so thanks! – Jeremy Giaco Apr 24 '19 at 15:38
2

Is there any reason why you have to use f.tell and f.seek? The file object in Python is iterable - meaning that you can loop over a file's lines natively without having to worry about much else:

with open('test.txt', 'r') as file:
    for line in file:
        # work with line
        pass
Sean Johnson
0

A way to get the current position when you want to change a specific line of a file:

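# Assumed example value: the line to look for, including its trailing newline.
string_value = "ghi\n"
cp = 0  # current position: characters read so far

with open("my_file") as infile:
    # Looping over the file stops cleanly at end of file,
    # whereas a bare next(infile) would raise StopIteration.
    for ret in infile:
        cp += len(ret)
        if ret == string_value:
            break
# cp now points just past the matching line (or to the end of the file).
print(">> Current position: ", cp)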
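Summing line lengths counts characters, which matches the byte offset only when every character is one byte and no newline translation takes place. A hedged alternative sketch (assuming Python 3 and UTF-8 text; `find_line_offset` is a hypothetical name) asks `tell()` for the offset before each line is read, so the result can be passed straight to `seek()`:

```python
def find_line_offset(path, target):
    """Return the byte offset at which the first line equal to target
    begins, or None if no line matches. Hypothetical helper."""
    wanted = target.encode('utf-8')  # assumption: UTF-8 encoded file
    with open(path, 'rb') as f:
        while True:
            start = f.tell()      # offset of the line about to be read
            raw = f.readline()
            if not raw:           # empty bytes means end of file
                return None
            if raw.rstrip(b'\r\n') == wanted:
                return start
```

Unlike the loop above, this returns the position where the matching line starts, not where it ends, which is usually what is needed to overwrite that line.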
0

Skipping lines using islice works perfectly for me, and it looks closer to what you're looking for (jumping to a specific line in the file):

from itertools import islice

with open('test.txt', 'r') as f:
    # last_pos must already hold the number of lines to skip
    lines = islice(f, last_pos, None)
    for line in lines:
        # work with line
        pass

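Here last_pos is the number of the line where you stopped reading last time (counting from 1), so the iteration starts one line after last_pos. Note that islice does not seek: it still reads and discards the skipped lines.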
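To make the counting concrete, here is a small sketch (assuming Python 3; `lines_after` is a hypothetical helper name) that returns everything after the first `lines_done` lines:

```python
from itertools import islice


def lines_after(path, lines_done):
    """Return the lines of path that follow the first lines_done lines.

    Hypothetical helper, for illustration only.
    """
    with open(path, 'r') as f:
        # islice reads and discards lines_done lines, then yields the rest.
        return list(islice(f, lines_done, None))
```

With the four-line test.txt from the question, `lines_after('test.txt', 2)` skips `abc` and `def` and returns the lines `ghi` and `jkl`.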

lotif