Consider a big file (~100 MB) that is line-based: a text file with relatively short lines (~80 characters).
If I use the built-in open()/file(), the file is read lazily, i.e. if I call aFile.readline(), only a chunk of the file resides in memory. Does urllib.urlopen() do something similar (for example, by using a cache on disk)?
How big is the difference in performance between urllib.urlopen().readline() and file().readline()? Assume the file is located on localhost, and that I open it once with urllib.urlopen() and once with file(). How big is the difference in performance and memory consumption when I loop over the file with readline()?
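To make the comparison concrete, here is a minimal sketch of how I would measure it. It is only an approximation of my setup: it reads the file through a file: URL instead of an HTTP server on localhost, the helper names (make_sample_file, count_lines, time_readline_loops) are made up for illustration, and the import fallback is there because urlopen() lives in urllib.request on Python 3 and in urllib on Python 2.

```python
import os
import tempfile
import time

try:
    from urllib.request import urlopen, pathname2url  # Python 3 location
except ImportError:
    from urllib import urlopen, pathname2url  # Python 2 location


def make_sample_file(line_count, line_length=80):
    """Create a temporary text file (hypothetical test data) and return its path."""
    fd, path = tempfile.mkstemp(suffix=".txt")
    line = b"x" * (line_length - 1) + b"\n"
    with os.fdopen(fd, "wb") as handle:
        for _ in range(line_count):
            handle.write(line)
    return path


def count_lines(stream):
    """Call readline() until it returns an empty value; return the line count."""
    count = 0
    while stream.readline():
        count += 1
    return count


def time_readline_loops(path):
    """Time a readline() loop over the same file opened both ways.

    Returns ((file_count, file_seconds), (url_count, url_seconds)).
    """
    start = time.time()
    with open(path, "rb") as local_file:
        file_count = count_lines(local_file)
    file_seconds = time.time() - start

    # Assumption: a file: URL stands in for the localhost server.
    url = "file:" + pathname2url(os.path.abspath(path))
    start = time.time()
    response = urlopen(url)
    try:
        url_count = count_lines(response)
    finally:
        response.close()
    url_seconds = time.time() - start

    return (file_count, file_seconds), (url_count, url_seconds)
```

Calling time_readline_loops(make_sample_file(1250000)) would exercise a file of roughly 100 MB. The timings from a file: URL will probably not match those of a real HTTP connection, so this only shows the shape of the measurement.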
What is the best way to process a file opened via urllib.urlopen()? Is it faster to process it line by line, or should I load a batch of lines (~50) into a list and then process the list?
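For reference, this is roughly what I mean by the batch variant. It is a sketch, not measured code: iter_batches is a hypothetical helper name, and it only assumes that the object passed in has a readline() method, which holds both for objects returned by open() and for the response from urlopen().

```python
def iter_batches(stream, batch_size=50):
    """Yield lists of up to batch_size lines read from a file-like object."""
    batch = []
    while True:
        line = stream.readline()
        if not line:  # empty string/bytes signals end of stream
            break
        batch.append(line)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:  # final, possibly shorter batch
        yield batch
```

The line-by-line variant would simply call readline() in a loop and process each line as it arrives. I have not verified which of the two is faster.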