I am currently reading in a large csv file (around 100 million lines), using code along the lines of the example described in https://docs.python.org/2/library/csv.html, e.g.:
    import csv
    with open('eggs.csv', 'rb') as csvfile:
        spamreader = csv.reader(csvfile, delimiter=' ', quotechar='|')
        for row in spamreader:
            process_row(row)
This is proving rather slow; I suspect it is because each line is read in individually (requiring lots of read calls to the hard drive). Is there any way of reading the whole csv file in at once, and then iterating over it? Although the file itself is large (around 5 GB), my machine has sufficient RAM to hold it in memory.
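For example, a rough sketch of the kind of thing I have in mind is below (assuming Python 2, since that is the version of the docs I linked; process_row is my own function). I'm not sure whether this is actually the right or idiomatic way to do it:

    import csv

    # Read the entire file into memory in one go.
    with open('eggs.csv', 'rb') as csvfile:
        data = csvfile.read()

    # splitlines() gives an in-memory list of lines; csv.reader accepts any
    # iterable of lines, so no further disk reads happen while parsing.
    spamreader = csv.reader(data.splitlines(), delimiter=' ', quotechar='|')
    for row in spamreader:
        process_row(row)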