I have a very large text file (several GB in size) which I need to read into Python and then process line by line.
One approach would be simply to call data = f.readlines()
and then process the content. With that approach I know the total number of lines up front and can easily measure the progress of my processing. Given the file size, however, this is probably not ideal, since the whole file would be loaded into memory at once.
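For reference, this is roughly what I mean by the readlines() variant (the processing step is just a placeholder):

```python
def process_all_at_once(path):
    """Load the whole file into memory, then process with a known total."""
    with open(path) as f:
        data = f.readlines()          # entire file in memory at once
    total = len(data)
    for i, line in enumerate(data, 1):
        pass                          # process line here; i / total is the exact fraction done
    return total
```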
The alternative (and I think better) option would be to say:
    for line in f:
        # do something
But now I am not sure how to measure my progress anymore. Is there a good option that does not add a huge overhead? (One reason I want to know the progress is to have a rough indicator of the remaining time, as all lines in my file have similar sizes, and to check whether my script is still doing something or has gotten stuck somewhere.)
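One idea I had is to track the byte offset against the total file size instead of counting lines, something like the sketch below. I open the file in binary mode here because, as far as I know, in Python 3 f.tell() cannot be called while iterating over a text-mode file, so I sum the raw line lengths instead; the UTF-8 decoding is an assumption about my data:

```python
import os

def iter_with_progress(path, encoding="utf-8"):
    """Yield (line, fraction_done) pairs, with progress measured in bytes."""
    total = os.path.getsize(path)     # total size in bytes
    read = 0
    with open(path, "rb") as f:       # binary mode: byte counts match getsize()
        for raw in f:
            read += len(raw)          # bytes consumed so far, including the newline
            yield raw.decode(encoding), read / total
```

I could then report the fraction every N lines to keep the overhead low, but I am not sure whether this is the idiomatic way to do it.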