2

I have a Python script which needs to read a section of a very large text file, starting at line N and ending at line N+X. I don't want to use `open('file')`, because that will load the entire file into memory, which will both take too long and waste too much memory. My script runs on a Unix machine, so I currently use the native `head` and `tail` commands, i.e.:

section = subprocess.check_output(f'tail -n +{N} {filePath} | head -n {X}', shell=True)

but it feels like there must be a smarter way of doing it. Is there a way to get lines N through N+X of a text file in Python without reading the entire file into memory?

Thanks!

shayelk
  • *“I don't want to use `open('file')`, because that will load the entire file into memory, which will both take too long and waste too much memory.”* That is not what `open` does; use it. – Ry- Feb 23 '17 at 08:41

2 Answers

3

The answer to your question can be found here: How to read large file, line by line in Python

with open('large_file.txt') as f:
    for line in f:
        # process each line here, one at a time
        ...

The `with` statement handles opening and closing the file, including when an exception is raised in the inner block. `for line in f` treats the file object `f` as an iterable, which automatically uses buffered I/O and memory management, so you don't have to worry about large files.
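A minimal sketch of how this same pattern could be limited to just lines N through N+X, using `enumerate` to count lines as they are read (the file name and the values of N and X below are only placeholders):

N = 100   # first line wanted (1-based)
X = 10    # how many lines after line N to include

selected = []
with open('large_file.txt') as f:
    for line_number, line in enumerate(f, start=1):
        if line_number < N:
            continue              # skip lines before the range
        if line_number > N + X:
            break                 # stop reading once past the range
        selected.append(line.rstrip('\n'))

print(selected)

Only one line is held in memory at a time while scanning, plus the lines that fall inside the requested range.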

Dwaxe
3

Python's `itertools.islice()` works well for this:

from itertools import islice

N = 2
X = 5

with open('large_file.txt') as f_input:
    # islice skips the first N-1 lines and stops after line N+X
    for row in islice(f_input, N - 1, N + X):
        print(row.strip())

This skips over the initial lines and yields only the lines you are interested in, reading the file line by line rather than loading it all into memory.
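If you need this in more than one place, a minimal sketch of how it could be wrapped in a small helper (the name `read_line_range` is just an illustration, not an existing function):

from itertools import islice

def read_line_range(path, start, count):
    # Return lines start .. start+count (1-based, inclusive) as a list,
    # reading the file lazily rather than loading it all at once.
    with open(path) as f:
        return [line.rstrip('\n') for line in islice(f, start - 1, start + count)]

lines = read_line_range('large_file.txt', 100, 10)   # lines 100 through 110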

Martin Evans