A nice way to iterate through a large text file line by line in Python (starting from the beginning) without loading the whole file into memory is:

with open('textfile.txt') as infile:
    for line in infile:
        do_something(line)  # placeholder for whatever per-line processing is needed

Is there an easy way to do the same line-by-line iteration without loading the file into memory, but starting from the end of the file?

Ryan
  • I don't think there is an equally easy way. See for example this answer: http://stackoverflow.com/a/3568878/6614295 – jotasi Jul 30 '16 at 08:36
  • The only way to get the starting point of each line in a text file is to read the file from start to end. (Unless all lines are of equal length.) Even if you guess and start at `end - 100`, you'd be assuming lines are all < 100 characters, and you'd still need to scan *forward* to see if there are more lines after this point. – Jongware Jul 30 '16 at 09:01
  • You can _map_ a file using the [`mmap`](https://docs.python.org/3/library/mmap.html) module, which allows you to access the file _as if_ it were loaded into memory. However, this is restricted to files <4GB if you're using 32-bit Python, and it doesn't provide a simple way to iterate backwards over the file line by line. That said, see the answer by srohde in the linked duplicate question. – PM 2Ring Jul 30 '16 at 09:23
  • An argument can be made for this not being a duplicate of the cited question: the accepted answer in the alleged duplicate question loads the text file into memory, whereas a key requirement of a correct answer to my question is that the file is not loaded into memory. The OP of the alleged duplicate did not have this requirement, and accordingly accepted an answer that does not satisfy the requirements of this question. Although srohde may have provided an answer that would correctly answer the question I've asked, I don't think it can be said that this question and that question are 'exact duplicates'. – Ryan Jul 30 '16 at 10:19
  • @Ryan The "dupe target" of a question closed as a duplicate doesn't have to be an _exact_ duplicate, but one or more of the answers at the dupe target must be applicable to the new question. Also bear in mind that the accepted answer to a question is not necessarily the best answer on the page; after all, the OP may not have sufficient expertise in the topic to be able to judge the answers accurately. Note that srohde's answer has a higher score than the accepted answer. However, the scoring system isn't perfect, and an answer's score should be treated as a guide rather than as absolute data. – PM 2Ring Jul 30 '16 at 11:45
  • The SO philosophy is to concentrate answers to a given question in the one place rather than scattering them over multiple pages. So if someone has a better solution than srohde's answer they are free to post it at the linked page. – PM 2Ring Jul 30 '16 at 11:52
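
For readers who want something concrete, here is a minimal sketch of the idea hinted at in the comments: seek to the end of the file and read fixed-size blocks backwards, so only one block (plus any partial line) is held in memory at a time. This is not srohde's answer from the linked question. The function name `reverse_lines` and the `chunk_size` and `encoding` parameters are hypothetical, and the sketch assumes `\n` line endings and an encoding such as UTF-8 in which the newline byte never appears inside a multi-byte character.

```python
import os


def reverse_lines(path, chunk_size=8192, encoding="utf-8"):
    """Yield the lines of a file from last to first, without line terminators.

    Hypothetical helper: assumes '\n' line endings and an encoding (such as
    UTF-8) where the byte 0x0A only ever means a newline.
    """
    with open(path, "rb") as f:
        f.seek(0, os.SEEK_END)
        position = f.tell()
        tail = b""          # start of a line whose beginning lies in an earlier block
        first_block = True  # True until the block at the end of the file is read
        while position > 0:
            read_size = min(chunk_size, position)
            position -= read_size
            f.seek(position)
            block = f.read(read_size) + tail
            if first_block:
                # A single newline at the very end terminates the last line;
                # it does not start a new, empty one.
                if block.endswith(b"\n"):
                    block = block[:-1]
                first_block = False
            lines = block.split(b"\n")
            # lines[0] may be incomplete, so carry it over to the next block.
            tail = lines[0]
            for line in reversed(lines[1:]):
                yield line.decode(encoding)
        if not first_block:
            # Whatever remains is the first line of the file.
            yield tail.decode(encoding)
```

It can be used much like the forward loop in the question, e.g. `for line in reverse_lines('textfile.txt'): do_something(line)`. Memory use is roughly `chunk_size` plus the length of the longest line, so a file with one enormous line would still end up in memory.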
