36

I am creating a file editing system and would like to write a line-based tell() function instead of a byte-based one. The function would be used inside a with block around the open(file) call. It is part of a class that has:

self.f = open(self.file, 'a+')
# self.file is a string that has the filename in it

The following is the original function (it also has a char flag, in case you want both the line number and the offset within that line):

def tell(self, char=False):
    t, lc = self.f.tell(), 0
    self.f.seek(0)
    for line in self.f:
        if t >= len(line):
            t -= len(line)
            lc += 1
        else:
            break
    if char:
        return lc, t
    return lc

The problem I'm having is that this raises an OSError. It has to do with how the system iterates over the file, but I don't understand the issue. Thanks to anyone who can help.

Jonas
Brandon H. Gomes
  • Hard to answer without seeing the rest of your class. (I couldn't reproduce it on Linux using only functions.) You might want to read up on [`OSError`'s attributes](https://docs.python.org/3/library/exceptions.html#OSError), which can give you (and us) some additional information. My first question would be, since this is an _OS_ error: What's your operating system? Also (possibly related): Why / how are you [opening the file in append mode](https://docs.python.org/3/library/functions.html#open) and then `seek`ing around inside it? – Kevin J. Chase Apr 14 '15 at 05:03
  • I'm opening it in append mode because it is assumed that the file is nonexistent before the instance is created (as you know, I'm sure, 'a' mode creates the file if it doesn't exist yet). I wanted to save space in the code by not having to check whether the file exists. My operating system is Mac OS X Yosemite, but I don't think it has to do with Apple. – Brandon H. Gomes Apr 15 '15 at 01:33

5 Answers

55

I don't know if this was the original error, but you get the same error if you call f.tell() inside a line-by-line iteration over a file, like so:

with open(path, "r+") as f:
  for line in f:
    f.tell() #OSError

The for loop can easily be replaced with explicit readline() calls, which leave tell() enabled:

with open(path, mode) as f:
  line = f.readline()
  while line:
    f.tell() #returns the location of the next line
    line = f.readline()
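
A self-contained way to reproduce this, as a rough sketch: the helper names and the temporary file are illustrative and not part of the original post, and the behaviour shown assumes CPython 3 with the file opened in text mode.

```python
import os
import tempfile


def tell_fails_in_for_loop(path):
    """Return True if f.tell() raises OSError inside `for line in f`."""
    with open(path, "r") as f:
        for _line in f:
            try:
                f.tell()
            except OSError:
                return True
    return False


def positions_with_readline(path):
    """Collect f.tell() after every explicit readline() call."""
    positions = []
    with open(path, "r") as f:
        line = f.readline()
        while line:
            positions.append(f.tell())  # start of the next line
            line = f.readline()
    return positions


# Illustrative demo file: three ASCII lines with "\n" line endings.
fd, demo_path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "w", newline="\n") as tmp:
    tmp.write("one\ntwo\nthree\n")

raised = tell_fails_in_for_loop(demo_path)
positions = positions_with_readline(demo_path)
os.remove(demo_path)
```

Here raised records whether the for-loop variant hit the error, while positions holds the offsets reported by the readline() variant.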
Héctor
23

I have an older version of Python 3, and I'm on Linux instead of a Mac, but I was able to recreate something very close to your error:

IOError: telling position disabled by next() call

An IOError rather than an OSError, but otherwise the same (since Python 3.3, IOError is just an alias of OSError). Bizarrely enough, I couldn't cause it using your open(..., 'a+'), but only when opening the file in read mode: open(..., 'r+').

Further muddling things is that the error comes from _io.TextIOWrapper, a class that appears to be defined in Python's _pyio.py file... I stress "appears", because:

  1. The TextIOWrapper in that file has attributes like _telling that I can't access on the whatever-it-is object calling itself _io.TextIOWrapper.

  2. The TextIOWrapper class in _pyio.py doesn't make any distinction between readable, writable, or random-access files. Either both should work, or both should raise the same IOError.

Regardless, the TextIOWrapper class as described in the _pyio.py file disables the tell method while the iteration is in progress. This seems to be what you're running into (comments are mine):

def __next__(self):
    # Disable the tell method.
    self._telling = False
    line = self.readline()
    if not line:
        # We've reached the end of the file...
        self._snapshot = None
        # ...so restore _telling to whatever it was.
        self._telling = self._seekable
        raise StopIteration
    return line

In your tell method, you almost always break out of the iteration before it reaches the end of the file, leaving _telling disabled (False).

One other way to reset _telling is the flush method, but it also failed if called while the iteration was in progress:

IOError: can't reconstruct logical file position

The way around this, at least on my system, is to call seek(0) on the TextIOWrapper, which restores everything to a known state (and successfully calls flush in the bargain):

def tell(self, char=False):
    t, lc = self.f.tell(), 0
    self.f.seek(0)
    for line in self.f:
        if t >= len(line):
            t -= len(line)
            lc += 1
        else:
            break
    # Reset the file iterator, or later calls to f.tell will
    # raise an IOError or OSError:
    self.f.seek(0)
    if char:
        return lc, t
    return lc

If that's not the solution for your system, it might at least tell you where to start looking.

PS: You should consider always returning both the line number and the character offset. Functions that can return completely different types are hard to deal with; it's a lot easier for the caller to just throw away the value he or she doesn't need.
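
A minimal sketch of that suggestion, under assumptions that are mine rather than the asker's: the file object is opened in binary mode (so tell() is a plain byte offset), line numbers are zero-based, and the names LinePos and line_tell are hypothetical. It uses iter(f.readline, b"") instead of the built-in file iterator, and it restores the original position instead of leaving the file at offset 0.

```python
import io
from typing import BinaryIO, NamedTuple


class LinePos(NamedTuple):
    line: int    # zero-based line number
    offset: int  # byte offset within that line


def line_tell(f: BinaryIO) -> LinePos:
    """Translate the current byte position of f into (line, offset)."""
    pos = f.tell()
    remaining, line_no = pos, 0
    f.seek(0)
    # Explicit readline() calls keep tell()/seek() usable afterwards.
    for line in iter(f.readline, b""):
        if remaining < len(line):
            break
        remaining -= len(line)
        line_no += 1
    f.seek(pos)  # put the file back where the caller left it
    return LinePos(line_no, remaining)


# Illustrative in-memory file standing in for a real one.
buf = io.BytesIO(b"ab\ncde\nf")
buf.seek(5)
result = line_tell(buf)
```

A caller that only wants the line number can write line_tell(f).line and ignore the offset.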

Kevin J. Chase
  • Thanks so much for your help! What seems to be my problem is that I can't call the (built-in) tell() method during a file iteration (line by line). I found a way around this and your answer really helped. Thanks again! – Brandon H. Gomes Apr 16 '15 at 01:52
  • @BrandonGomes: would you mind sharing your solution with me? – marscher Feb 18 '16 at 19:16
  • Sorry @marscher, I don't have this code anymore. It's from an old computer. I think the answer was to store some metadata about the file iterator. You could always rewrite the __next__ function. – Brandon H. Gomes Mar 03 '16 at 04:31
  • To add some color to this answer, the place where this happens in cpython is [here](https://github.com/python/cpython/blob/80b714835d6f5e1cb8fbc486f9575b5eee9f007e/Lib/_pyio.py#L2328). As to why it happens, it's probably because due to caching the actual location isn't 100% accurate. However, it can still be useful e.g. when loading a large jsonl file. In that case, just use `f.buffer.tell()`. – heiner May 01 '23 at 21:55
8

Just a quick workaround for this issue:

As you are iterating over the file from the beginning anyways, just keep track of where you are with a dedicated variable:

file_pos = 0
with open('file.txt', 'rb') as f:
    for line in f:
        # process line
        file_pos += len(line)

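Now file_pos will always be what f.tell() would tell you. Because the file is opened in binary mode ('rb'), len(line) counts bytes, which is what tell and seek work with. If you open the file in text mode instead, len(line) counts characters, so the sum only matches for single-byte encodings such as ASCII. Working line by line, it is easy to decode each bytes line into a str when you need text.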
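
As an illustration of what the running byte count is good for, here is a small sketch that records the start offset of every line and later jumps straight to one of them. The function names, the temporary file, and the UTF-8 encoding are assumptions made for the example.

```python
import os
import tempfile


def build_line_index(path):
    """Return the byte offset at which each line of the file starts."""
    offsets = []
    pos = 0
    with open(path, "rb") as f:
        for line in f:
            offsets.append(pos)
            pos += len(line)  # bytes, because of 'rb'
    return offsets


def read_line_at(path, offsets, n, encoding="utf-8"):
    """Seek directly to line n (zero-based) and return it as text."""
    with open(path, "rb") as f:
        f.seek(offsets[n])
        return f.readline().decode(encoding)


# Illustrative demo file containing non-ASCII characters.
fd, demo_path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "wb") as tmp:
    tmp.write("héllo\nwörld\nend\n".encode("utf-8"))

index = build_line_index(demo_path)
second = read_line_at(demo_path, index, 1)
os.remove(demo_path)
```

The offsets stay correct with multi-byte characters because both the counting and the seeking happen on bytes; decoding is applied only to the line that was read.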

moritzschaefer
  • In py3 thanks to 'rb' line is what you'd expect (including line terminators as in `\r\n`) - so this works fine for rewinding to start of line - nifty! – Mr_and_Mrs_D Jan 26 '21 at 12:45
4

I had the same error: OSError: telling position disabled by next() call, and solved it by adding the 'rb' mode while opening the file.

Ann Guseva
1

The error message is pretty clear, but missing one detail: calling next on a text file object disables the tell method. A for loop repeatedly calls next on iter(f), which happens to be f itself for a file. I ran into a similar issue trying to call tell inside the loop instead of calling your function twice.

An alternative solution is to iterate over the file without using the built-in file iterator. Instead, you can bake a nearly equally efficient iterator from the arcane two-arg form of the iter function:

for line in iter(f.readline, ''):
    pos = f.tell()  # allowed here: readline() does not disable tell()
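
One possible way to package this, sketched under the assumption of a text-mode file on CPython 3: a hypothetical generator, lines_with_offsets, yields each line together with the tell() value at which it starts, so the caller can later seek() back to it. The demo file is illustrative.

```python
import os
import tempfile


def lines_with_offsets(f):
    """Yield (offset, line) pairs for a text file without disabling tell()."""
    offset = f.tell()
    # Two-argument iter(): call f.readline() until it returns ''.
    for line in iter(f.readline, ""):
        yield offset, line
        offset = f.tell()


# Illustrative demo file with "\n" line endings.
fd, demo_path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "w", newline="\n") as tmp:
    tmp.write("alpha\nbeta\ngamma\n")

with open(demo_path, "r") as f:
    pairs = list(lines_with_offsets(f))
    # Values returned by tell() can be handed back to seek().
    f.seek(pairs[1][0])
    reread = f.readline()
os.remove(demo_path)
```

In text mode the values returned by tell() should be treated as opaque tokens that are only meaningful to seek() on the same file, not as numbers to do arithmetic on.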
Mad Physicist