I have a 25GB file I need to process. Here is what I'm currently doing, but it takes an extremely long time just to read the file in before the loop even starts:
    import os

    collection_pricing = os.path.join(pricing_directory, 'collection_price')
    with open(collection_pricing, 'r') as f:
        collection_contents = f.readlines()
        length_of_file = len(collection_contents)
        for num, line in enumerate(collection_contents):
            print '%s / %s' % (num+1, length_of_file)
            cursor.execute(...)
How could I improve this?
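Would switching to iterating over the file object directly help? As far as I understand, that way only one line is held in memory at a time instead of the whole 25GB from readlines(). A rough sketch of what I have in mind (it reuses the same pricing_directory and cursor as above, and drops the "x / total" progress display, since getting the total line count would mean reading the whole file once just to count):

    import os

    collection_pricing = os.path.join(pricing_directory, 'collection_price')
    with open(collection_pricing, 'r') as f:
        # iterating over the file object reads one line at a time
        for num, line in enumerate(f, 1):
            print 'processing line %s' % num
            # cursor.execute(...)  # same statement as above, arguments omitted

If a progress figure is still important, I think comparing f.tell() against os.path.getsize(collection_pricing) would give a rough byte-based percentage without a second pass over the file.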