The basic way to iterate over the lines of a file looks like this:
with open(filename) as f:
    for line in f:
        do_stuff(line)
This keeps only the current line in memory, not the whole file (internally the file object reads ahead in buffered chunks, but your code only ever sees one line at a time). If you want fine-grained control over the buffer size, use io.open and pass an explicit buffering argument; this can be useful, for example, when all your lines have the same length.
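As a minimal sketch, assuming lines of roughly 4 KiB (the buffer size here is an assumption to tune, not a recommendation):

import io

# Match the buffer size to the typical line length; 4096 is an assumption.
with io.open(filename, "r", buffering=4096) as f:
    for line in f:
        do_stuff(line)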
If the operation on your data is CPU-bound rather than I/O-bound, multiprocessing can help:
import multiprocessing

pool = multiprocessing.Pool(8)  # tune the worker count for your machine
with open(filename) as f:
    pool.map(do_stuff, f)
This does not speed up the actual reading, but it can parallelize the CPU-bound processing of the lines.
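One caveat: Pool.map materializes its whole input before dispatching work, so the snippet above pulls the entire file into memory; Pool.imap streams the lines instead. A self-contained sketch (do_stuff and filename here are placeholders):

import multiprocessing

def do_stuff(line):
    # Placeholder for your CPU-bound work; it must be a module-level
    # function so the worker processes can pickle it.
    return len(line)

if __name__ == "__main__":  # required on platforms that spawn workers
    filename = "data.txt"  # hypothetical input file
    with multiprocessing.Pool(8) as pool, open(filename) as f:
        # imap streams lines to the workers; chunksize batches them
        # to reduce inter-process overhead.
        for result in pool.imap(do_stuff, f, chunksize=256):
            pass  # consume results as they arrive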