It seems that the .tell() method is not very reliable when reading text files in Python. I am trying to use it as a stand-in for the EOF condition found in other programming languages.

For various reasons, I do not want to iterate over the text file with a FOR loop; I want to use a WHILE loop instead.

Below is some code that reproduces the problem. It includes code that generates a test.txt file containing some blank lines (the randomized variant is commented out):

import re
from random import randint


def file_len_lines(f_name):
    with open(f_name) as f:
        for i, l in enumerate(f):
            pass
    return i + 1


def file_len_chars(f_name, with_nls):
    char_count = 0
    with open(f_name) as f:
        for line in f:
            char_count += len(line)
            if with_nls:
                char_count += 1
            else:
                pass
    return char_count


def trim(sut):
    return re.sub(' +', ' ', sut).strip()


# Create test file
with open("test.txt", "w") as f:
    word_list = ("Betty Eats Cakes And Uncle Sells Eggs "*20).split()
    word_list[3] = ""
    # for num in range(len(word_list)):
    #     if randint(1, 2) == 1:
    #         word_list[num] = ""
    for word in word_list:
        print(word, file=f)

file_to_read = 'test.txt'
# file_to_read = 'Fibonacci Tree 01.log'


with open(file_to_read, "r") as f:
    count = 0
    file_length = file_len_chars(file_to_read, True)
    file_length_lines = file_len_lines(file_to_read)
    print(f"Lines in file = {file_length_lines}, Characters in file = {file_length}")
    f.seek(0)
    while f.tell() < file_length:
        count += 1
        text_line = f.readline()
        print(f"Line = {count}, ", end="")
        print(f"Tell = {f.tell()}, ", end="")
        print(f"Length {len(text_line)} ", end="")
        if text_line in ['', '\n']:
            print(count)
        elif trim(text_line).upper()[0] in "A E I O U".split():
            print(text_line, end='')
        else:
            print(count)

This code should always output something like:

Lines in file = 140, Characters in file = 897
Line = 1, Tell = 7, Length 6 1
Line = 2, Tell = 13, Length 5 Eats
Line = 3, Tell = 20, Length 6 3
...
Line = 138, Tell = 884, Length 6 Uncle
Line = 139, Tell = 891, Length 6 139
Line = 140, Tell = 897, Length 5 Eggs

Process finished with exit code 0

but instead, it mostly outputs something more like:

Lines in file = 140, Characters in file = 605
Line = 1, Tell = 7, Length 6 1
Line = 2, Tell = 18446744073709551630, Length 5 Eats

Process finished with exit code 0

On the last line of the output above, the value returned by .tell() jumps to a huge number, so the loop condition fails and the loop stops without going through all 140 lines.

I am looking for a way to make .tell() behave, or for another way to detect an EOF condition so that I can break out of a WHILE loop.

Again, most advice found online says "iterate with a FOR loop". I do not want to do this for various reasons that are tedious to explain. (Briefly, it will make the nature of my original code very unwieldy due to the nested flow diagram I intend to follow.)

  • As the docs state, the number returned by `TextIOBase.tell()` is opaque and "does not usually represent a number of bytes in the underlying binary storage". A possible solution is to open the file in binary mode and decode each line afterwards (be aware of the line endings). – Michael Butscher May 19 '19 at 08:36
  • Possible duplicate of [Python file.tell() giving strange numbers?](https://stackoverflow.com/questions/15934950/python-file-tell-giving-strange-numbers) – Paul Cornelius May 19 '19 at 08:37
  • This is a known limitation. I don't know what you want to do, but the tell() function almost certainly isn't going to work for you. – Paul Cornelius May 19 '19 at 08:39
  • Thanks @Paul Cornelius, I realise this and asked anyway. Was hoping that somebody out there had since figured out a workaround to check for an EOF condition. Used a for loop in the meantime but it makes for some cumbersome repetitive code. – Lance Skelly May 20 '19 at 10:49
  • There are various ways to detect an EOF condition. Read streams do it automatically, and their various read functions return an empty string if the file is at EOF. As I said I don't know what you are trying to do, but reading/writing files is one of the most common operations and the Python standard library is designed to make it easy, without having to resort to the kind of low-level messing around you are doing here. – Paul Cornelius May 20 '19 at 18:05
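
The two suggestions in the comments above can be sketched roughly as follows. This is a minimal illustration, not a tested fix for the full program in the question. `lines_until_eof` relies on `readline()` returning an empty string only at end of file (a blank line comes back as `'\n'`), so it works in text mode without `tell()`. `lines_with_binary_tell` follows the binary-mode suggestion, where `tell()` is a plain byte offset that can be compared with `os.path.getsize()`. Both function names, the `encoding` parameter, and its UTF-8 default are assumptions made for this example.

```python
import os
import tempfile


def lines_until_eof(path):
    """Collect lines with a WHILE loop; readline() returns '' only at EOF."""
    lines = []
    with open(path, "r") as f:
        while True:
            text_line = f.readline()
            if text_line == "":      # empty string means end of file
                break
            lines.append(text_line)  # a blank line is '\n', not ''
    return lines


def lines_with_binary_tell(path, encoding="utf-8"):
    """Same idea driven by tell(); in binary mode tell() is a byte offset."""
    size = os.path.getsize(path)     # file size in bytes
    lines = []
    with open(path, "rb") as f:
        while f.tell() < size:
            raw = f.readline()
            # Decode manually; line endings are left exactly as stored.
            lines.append(raw.decode(encoding))
    return lines


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "test.txt")
        # newline="\n" keeps the on-disk bytes identical on every platform
        with open(path, "w", newline="\n") as f:
            for word in ["Betty", "Eats", "Cakes", "", "Uncle"]:
                print(word, file=f)
        print(len(lines_until_eof(path)))
        print(len(lines_with_binary_tell(path)))
```

In binary mode the decoded lines keep whatever line endings are stored on disk (for example `\r\n` in a file written in text mode on Windows), so strip or normalise them as needed. On Python 3.8 or later, the first loop can also be written as `while (text_line := f.readline()):`.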
