10

This is almost the same question as How to solve "OSError: telling position disabled by next() call". While the older question has received a few answers with useful workarounds, the meaning of the error is not clear. I wonder if anybody can comment on this.

I am learning Python and loosely following a tutorial. I entered the following interactively on Fedora 23:

$ python3
Python 3.4.3 (default, Aug  9 2016, 15:36:17)
[GCC 5.3.1 20160406 (Red Hat 5.3.1-6)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> with open("myfile","r") as file:
...     for l in file:
...         print("Next line: \"{}\"".format(l))
...         print("Current position {:4d}".format(file.tell()))

myfile contains a few lines of text. The output:

Next line: "This is line number 0
"
Traceback (most recent call last):
  File "<stdin>", line 4, in <module>
OSError: telling position disabled by next() call

Googling for this error yields a whopping 6 results. The same happens with Python 3.6.4 on Cygwin on Windows 10.

Edit:

The tell() method for text files is documented as follows:

Return the current stream position as an opaque number. The number does not usually represent a number of bytes in the underlying binary storage.

"Opaque number" seems to indicate that I can't just print it. So, I replaced the second print() call with pos = file.tell(). Same result.

martineau
berndbausch
  • 4
    The "opaqueness" of `tell()` on a file opened in text mode simply means that the result may not match the number of characters you've received from the file, due to end-of-line conversions and whatever other platform-specific differences there might be between text and binary files. All it's good for is `seek()`ing back to the same position later. However, the value is still an ordinary integer, there's nothing keeping you from printing it out if you want. – jasonharper Apr 12 '18 at 00:03

2 Answers

26

The message means exactly what it says: because you have called next() on the file, the use of tell() on that file has been disabled.

It might not look like you've called next, but the for loop calls it implicitly. A for loop:

for element in thing:
    do_stuff_with(element)

is syntactical sugar for

iterator = iter(thing) # the real implementation doesn't use a variable
while True:
    try:
        element = next(iterator) # here's the next() call
    except StopIteration:
        break
    do_stuff_with(element)

For a file, iter(file) returns the file, and the loop calls next on the file.
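To see this without a for loop, here is a minimal sketch that calls next() by hand on a text file and then tries tell(). The temporary file and the variable names (same_object, first, error_message) are made up for the demonstration, and the exact error text is what CPython reports, so treat it as implementation-specific.

```python
import os
import tempfile

# Throwaway demo file; content and names are hypothetical.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("line 0\nline 1\n")
    path = tmp.name

with open(path, "r") as f:
    same_object = iter(f) is f   # a text file acts as its own iterator
    first = next(f)              # what the for loop does behind the scenes
    try:
        f.tell()
        error_message = None
    except OSError as exc:
        error_message = str(exc)

os.remove(path)

print(same_object)
print(repr(first))
print(error_message)
```

The explicit next(f) here has the same effect as the first pass through `for l in file:` in the question.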


As for why calling next disables tell(): it is an optimization. It only happens for text files (specifically io.TextIOWrapper), which have to do a bunch of extra work to support tell; turning off tell support lets them skip that work. The original commit message for the change that made next disable tell is "Speed up next() by disabling snapshot updating then.", which confirms the motivation.

For historical context, previous Python versions used a hidden buffer for next that tell and other file methods didn't account for, causing tell (and other file methods) to produce not-very-meaningful results during iteration over a file. The current IO implementation would be able to support tell() during iteration, but io.TextIOWrapper prevents such calls anyway. The historical incompatibility between next and other methods likely contributed to why it was considered reasonable to disable parts of file functionality during iteration.


You didn't ask for workarounds, but for the benefit of people who end up on this page looking for a workaround, I'll mention that

for line in iter(file.readline, ''):
    ...

will let you iterate over the lines of a text file without disabling tell. (You can use for line in iter(file.readline, b'') for binary files, but there's not much point, because the tell disabling mechanism isn't there for binary files.)
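A slightly fuller sketch of that workaround, assuming a small temporary file created just for the demonstration (the names lines, positions and after_seek are hypothetical). It records tell() after every line and then uses one of the recorded values for seek(), which is the only use the documentation promises for the number.

```python
import os
import tempfile

# Throwaway demo file; the content and names are made up for illustration.
with tempfile.NamedTemporaryFile("wb", suffix=".txt", delete=False) as tmp:
    tmp.write(b"line 0\nline 1\nline 2\n")
    path = tmp.name

lines = []
positions = []
with open(path, "r", encoding="utf-8") as f:
    # iter(callable, sentinel) keeps calling f.readline() until it returns ''.
    for line in iter(f.readline, ""):
        lines.append(line)
        positions.append(f.tell())  # no OSError: next(f) is never called

    # Jump back to the position recorded after the first line.
    f.seek(positions[0])
    after_seek = f.readline()

os.remove(path)

print(lines)
print(repr(after_seek))
```

Here the for loop iterates over the callable-iterator object returned by iter(f.readline, ''), not over the file itself, so the file's own next() is never invoked.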

user2357112
  • The OP isn't explicitly calling `next()`...and the rest of your answer sounds mostly like pure speculation to me. – martineau Apr 12 '18 at 00:54
  • 3
    @martineau: There's no explicit `next` call, but `for` implicitly calls `next`. As for the rest of the answer, it is largely speculative, but the old `next`/other methods incompatibility is documented, and the implementation *is* skipping work when telling is disabled. – user2357112 Apr 12 '18 at 01:02
  • @martineau: The answer should now be much less speculative. – user2357112 Apr 12 '18 at 02:32
  • 1
    That's definitely an improvement, but I think you also ought to at least explain to the OP how their `for l in file:` is causing an implicit call to `next()` because that's plainly not obvious and they don't seem to understand how `file`s and iterators work. – martineau Apr 12 '18 at 03:40
  • 1
    @martineau: Explanation expanded. – user2357112 Apr 12 '18 at 03:51
  • This is heavy on the root-cause and the history of why, but offers no solution. The solution mentioned [here](https://stackoverflow.com/a/42150352/202229) is if you want pushback (or arbitrary `seek/tell`ing) you can't use `for line in f:` to iterate, you must do `line = f.readline()`, `while line` – smci Nov 28 '18 at 02:00
  • @smci: Well, yeah. The questioner already found (and linked) a post with workarounds; they asked specifically for what the error means. I suppose I'll add a bit about a workaround anyway. – user2357112 Nov 28 '18 at 04:05
  • @user2357112 Thank you so much for pointing out this workaround. It really saved my day. – Alexander Cska May 30 '19 at 19:13
  • Very nice workaround, with very little impact on code structure. Thanks! – Joël Jun 20 '19 at 12:03
  • We can't do this for `csv.reader`, the only thing we can do is `next(reader)` or iteration, any suggestions? – saeedgnu Oct 31 '20 at 14:38
  • @saeedgnu: `csv.reader` objects don't have a `tell` method. Are you talking about calling `tell` on the underlying file? You can pass `iter(file.readline, '')` to the reader to avoid disabling `tell` on the underlying file. – user2357112 Oct 31 '20 at 21:36
  • No, I needed to pass the file object to `csv.reader`, but still use `fileobj.tell()`, I found a solution myself, I just wrapped the file object to keep track of current position, here: https://github.com/ilius/pyglossary/blob/master/pyglossary/plugins/csv_pyg.py#L58 – saeedgnu Nov 01 '20 at 12:31
  • @saeedgnu: It looks like passing `iter(file.readline, '')` to `csv.reader` works for your use case. Your wrapper is dangerous - it's got a whole bunch of methods inherited from `io.TextIOWrapper` that your code doesn't interact properly with, including things like `__exit__`, `read`, and `seek`. It doesn't look like your wrapper has a reason to inherit from TextIOWrapper. – user2357112 Nov 01 '20 at 12:43
  • Thanks, I fixed that and inherit from object. Not dangerous anymore! Passing `iter(fileobj.readline, '')` to `csv.reader` or calling `readline()` inside `__next__` makes it ~3.3 times slower for me. So why do it? – saeedgnu Nov 01 '20 at 14:11
  • BTW my use case is reading from `.csv.gz` for example, while showing a progress bar to user (`.tell()` is for progress bar) – saeedgnu Nov 01 '20 at 14:13
  • The ratio of running time varies from 3 to 3.5. – saeedgnu Nov 01 '20 at 14:18
  • If you want to `seek` to the first line after ..., be careful: `pos = -1 ; for line in iter( file.readline, '' ): ; prev = pos ; pos = file.tell() ; if ...: break ; topline = line ; file.seek( prev ) # to the start of topline` – denis Oct 17 '22 at 14:01
1

If your text file is too large, there are two solutions according to this answer:

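  1. Using file.readline() instead of next() (that is, instead of iterating with a for loop)
with open(path, mode) as file:
    while True:
        line = file.readline()
        if not line:
            # readline() returns an empty string at end of file
            break
        # tell() works here because next() was never called on the file
        file.tell()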
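  2. Using offset += len(line) instead of file.tell()
offset = 0
with open(path, mode) as file:
    for line in file:
        # In text mode len(line) counts characters, not bytes,
        # so offset may differ from what file.tell() would return
        offset += len(line)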
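One caveat with the second approach, sketched below under the assumption that you want byte offsets: in text mode, len(line) counts decoded characters, so the running total drifts from the byte position as soon as the file contains multi-byte characters. Opening the file in binary mode avoids that, and tell() is not disabled there either. The file content and the names char_offset, byte_offset and matches_tell are made up for the example.

```python
import os
import tempfile

# Two lines, each containing one character that takes two bytes in UTF-8.
data = "h\u00e9llo\nw\u00f6rld\n".encode("utf-8")
with tempfile.NamedTemporaryFile("wb", suffix=".txt", delete=False) as tmp:
    tmp.write(data)
    path = tmp.name

# Text mode: counts characters.
char_offset = 0
with open(path, "r", encoding="utf-8", newline="") as f:
    for line in f:
        char_offset += len(line)

# Binary mode: counts bytes, and tell() stays usable during iteration.
byte_offset = 0
matches_tell = True
with open(path, "rb") as f:
    for raw_line in f:
        byte_offset += len(raw_line)
        matches_tell = matches_tell and (byte_offset == f.tell())

file_size = os.path.getsize(path)
os.remove(path)

print(char_offset, byte_offset, file_size, matches_tell)
```

If you need the decoded text as well, you can call raw_line.decode("utf-8") inside the binary loop.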
Zhou Hongbo