How can I find out the location of the file cursor when iterating over a file in Python 3?
In Python 2.7 it's trivial: use tell(). In Python 3 that same call throws an OSError:
Traceback (most recent call last):
  File "foo.py", line 113, in check_file
    pos = infile.tell()
OSError: telling position disabled by next() call
My use case is making a progress bar for reading large CSV files. Computing a total line count is too expensive and requires an extra pass. An approximate value is plenty useful. I don't care about buffers or other sources of noise; I just want to know whether it'll take 10 seconds or 10 minutes.
Simple code to reproduce the issue. It works as expected on Python 2.7, but throws on Python 3:
import csv
import os

file_size = os.stat(path).st_size
with open(path, "r") as infile:
    reader = csv.reader(infile)
    for row in reader:
        pos = infile.tell()  # OSError: telling position disabled by next() call
        print("At byte {} of {}".format(pos, file_size))
This answer https://stackoverflow.com/a/29641787/321772 suggests that the problem is that the next() method disables tell() during iteration. The alternative is to read line by line manually instead, but that code is inside the csv module so I can't get at it. I also can't fathom what Python 3 gains by disabling tell().
So what is the preferred way to find out your byte offset while iterating over the lines of a file in Python 3?
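For reference, here is a minimal sketch of the "read line by line manually" idea that still lets csv.reader do the parsing. It assumes csv.reader can be fed any iterator of strings (the csv docs say it can) and that the file uses an ASCII-compatible encoding such as UTF-8, so splitting on raw newline bytes is safe. The names rows_with_offsets and decoded_lines and the encoding parameter are made up for illustration. This is not necessarily the preferred way, just one workaround under those assumptions.

```python
import csv


def rows_with_offsets(path, encoding="utf-8"):
    """Yield (row, bytes_consumed) pairs for the CSV file at `path`.

    Hypothetical helper: the file is opened in binary mode so each raw
    line's length in bytes can be counted before it is decoded and handed
    to csv.reader. Assumes an ASCII-compatible encoding (e.g. UTF-8).
    """
    consumed = 0  # total bytes handed to the CSV parser so far

    def decoded_lines(binary_file):
        nonlocal consumed
        for raw_line in binary_file:
            consumed += len(raw_line)
            yield raw_line.decode(encoding)

    with open(path, "rb") as infile:
        # csv.reader only needs an iterator of strings, not a real file.
        for row in csv.reader(decoded_lines(infile)):
            yield row, consumed
```

Dividing the second element of each pair by os.stat(path).st_size gives the fraction of the file read so far. A quoted field that spans several physical lines is counted when its row is complete, so the offset always points just past the row that was yielded.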