I am trying to use f.tell() in a normal text file during iteration:

with open('test.txt') as f:
    for line in f:
        print(f.tell())

I get the following error:

Traceback (most recent call last):
  File "<stdin>", line 3, in <module>
OSError: telling position disabled by next() call

Just to make sure, I checked that the same error occurs if I try skipping a line manually, discarding the iterator object (which is probably the file itself):

with open('test.txt') as f:
    next(f)
    print(f.tell())

My end goal is to find the length of the first line in a file in bytes, regardless of platform, and the following works just fine for that:

with open('test.txt') as f:
    f.readline()
    print(f.tell())
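
For comparison, here is a minimal sketch of a binary-mode variant. The Python documentation describes the value returned by tell() on a text stream as an opaque number that does not usually represent a byte count, so counting the bytes of the first line directly may be the more portable route. The function name `first_line_byte_length` is made up for this illustration.

```python
def first_line_byte_length(path):
    """Return the byte length of the first line, line ending included."""
    # Binary mode: no decoding and no newline translation, so the count
    # depends only on the file contents, not on the platform.
    with open(path, 'rb') as f:
        return len(f.readline())
```

Calling `first_line_byte_length('test.txt')` would then give the same kind of number as the snippet above, without relying on text-mode tell().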

I am curious as to why using tell is disabled during iteration. I can understand why seek would be disabled, given that most iterators don't like concurrent modification, but why tell? Does tell perform some state-change that impacts the iterator or something like that perhaps?

I should probably mention that I am running Python 3.6.2 in an Anaconda environment. I have observed this behavior on both Arch Linux and Red Hat 7.5.

Update

This issue seems to appear in Python 2.7 in a different form; see "file.tell() inconsistency". I wonder if the inconsistency caused by the buffering optimization is the reason tell is disabled completely in Python 3.

This actually brings up a deeper question: why does the OS-level file pointer get returned by tell at all, when the goal of the Python file interface is to abstract away from that? It's not as if the position of the Python-level pointer is ambiguous or mysterious, with or without buffering.
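
To make the buffering gap concrete, the following rough sketch compares the byte length of the first line with the position of the binary stream underneath the text wrapper after a single next() call. It assumes a UTF-8 file, and the helper name `positions_after_first_line` is hypothetical; `f.buffer` is the documented attribute that exposes the underlying binary buffer of a text stream.

```python
def positions_after_first_line(path, encoding='utf-8'):
    """Return (bytes in first line, position of the underlying binary buffer)."""
    # newline='' turns off newline translation, so the line keeps its
    # original ending and re-encoding it gives the on-disk byte count.
    with open(path, encoding=encoding, newline='') as f:
        first = next(f)
        # f.buffer is the binary stream beneath the text wrapper; it may
        # already have been read well past the end of the first line.
        underlying = f.buffer.tell()
    return len(first.encode(encoding)), underlying
```

On a small file I would expect the second number to be larger than the first, since the wrapper reads ahead in chunks, but the exact value is an implementation detail.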

Mad Physicist
  • I don't understand this: *My end goal is to find the length of line in a file containing lines of equal length*, would you please explain? Maybe examples would be good. – Hai Vu Jan 02 '18 at 06:29
  • Thanks for the inconsistency pointer -- I guess that helps me understand why Python 3 refuses to `seek` on a text-mode file. The obvious workaround is to open the file in binary mode, which also solves your case. Is this an acceptable solution? You can obviously `.decode('utf-8')` the bytes inside the loop yourself if you need to. – tripleee Jan 02 '18 at 06:41
  • For context, I guess this is related to https://stackoverflow.com/a/48055805/874188 – tripleee Jan 02 '18 at 06:41
  • @HaiVu. It's actually quite irrelevant what I'm trying to do. The question came up in the following answer: https://stackoverflow.com/a/48055805/2988730. I've changed this question to raise fewer distractions. – Mad Physicist Jan 02 '18 at 13:30
  • @tripleee. Yup, that's exactly how the question came up. Well correlated. – Mad Physicist Jan 02 '18 at 13:31
  • @tripleee. I haven't had any problems `seek`ing within a text file in Python 3. You run the risk of not landing on a character boundary, resulting in a `UnicodeDecodeError`, but that's normal and expected. Disabling `seek` during iteration is also sensible and somewhat expected since most iterators don't like concurrent modification. The surprise here is that `tell` is ever disabled. – Mad Physicist Jan 02 '18 at 13:39
  • Sorry for being imprecise. The [documentation](https://docs.python.org/3/tutorial/inputoutput.html#methods-of-file-objects) says *"In text files (those opened without a `b` in the mode string), only seeks relative to the beginning of the file are allowed (the exception being seeking to the very file end with `seek(0, 2)`) and the only valid offset values are those returned from the `f.tell()`, or zero. Any other offset value produces undefined behaviour."* – tripleee Jan 02 '18 at 13:45
  • Iterating over the lines with the ability to call `tell()` within the loop can be done like this: `for line in iter(f.readline, ''):` – BlackJack Jan 02 '18 at 16:27
  • @BlackJack. That is true (and very useful if you need to), but I am looking for the explanation as to why such a gimmick is necessary in the first place. – Mad Physicist Jan 02 '18 at 16:59
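
The two workarounds suggested in the comments can be sketched as follows. This is only an illustration under stated assumptions: the function names are invented here, the file is assumed to be UTF-8, and the binary variant decodes one line at a time, which is safe for UTF-8 but not for every encoding.

```python
def line_offsets_text(path, encoding='utf-8'):
    """Yield (position, line) pairs from a text-mode file.

    iter(f.readline, '') keeps calling readline() until it returns '',
    so the file's own iterator (and its next() call) is never used and
    tell() stays enabled.
    """
    with open(path, encoding=encoding) as f:
        for line in iter(f.readline, ''):
            # In text mode this value is an opaque position cookie.
            yield f.tell(), line


def line_offsets_binary(path, encoding='utf-8'):
    """Yield (byte offset, decoded line) pairs from a binary-mode file."""
    with open(path, 'rb') as f:
        for raw in f:
            # In binary mode tell() is a plain byte offset, and it is
            # not disabled by iteration.
            yield f.tell(), raw.decode(encoding)
```

Neither sketch explains why the text-mode iterator disables tell(); they only show how to avoid the error.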

0 Answers