
I am trying to write a function that counts the number of characters in a text file and returns the result. I have the following code:

def file_size(filename):
    """Function that counts the number of characters in a file"""
    filename = "data.txt"
    with open(filename, 'r') as file:
        text = file.read()
        len_chars = sum(len(word) for word in text)
        return len_chars

This seemed to work fine in my IDE when I tested it with a text file that I created. However, when I submit the code to a doctest program, I get an error saying it always gives the output of 10. Any help?

Attached is a screenshot of the error message.

Chris_Rands
  • Do you want to count unique characters? – cs95 Aug 31 '17 at 08:15
  • Possible duplicate of [counting characters and lines from a file python 2.7](https://stackoverflow.com/questions/14416522/counting-characters-and-lines-from-a-file-python-2-7) – ziMtyth Aug 31 '17 at 08:16
  • You are counting the same thing every time... `filename = "data.txt"` – juanpa.arrivillaga Aug 31 '17 at 08:17
  • BTW: `len_chars = sum(len(word) for word in text)` is over-engineered, and I don't think it is doing what you think it is doing, going by your names. Iterating over a string with `for word in text:` *iterates over the characters*; it doesn't split on word boundaries just because you named the loop variable `word`. So there is *no need* to call `len` on each `word` and then `sum` the results, since `len` is always `1`. You could have just called `len(text)` and gotten the same answer! – juanpa.arrivillaga Aug 31 '17 at 08:21
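
To illustrate the last comment, here is a minimal sketch showing that iterating over a string yields single characters, so summing their lengths is the same as calling `len` on the whole string. The sample string and the names `pieces` and `summed` are made up for this example.

```python
text = "two words\n"  # hypothetical sample text

# Iterating over a str yields one-character strings, not words.
pieces = [piece for piece in text]

# Every piece has length 1, so the sum is just the number of characters.
summed = sum(len(piece) for piece in text)

print(pieces[:4])
print(summed == len(text))
```

Note that whitespace and the trailing newline are counted as characters in both versions.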

4 Answers


You never use the function's argument; you overwrite `filename` with the constant "data.txt", so every call reads the same file. Remove that assignment:

def file_size(filename):
    """Function that counts the number of characters in a file"""
    with open(filename, 'r') as file:
        return len(file.read())
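
As a quick sanity check, the sketch below calls the corrected function on two temporary files of different lengths. The file names `short.txt` and `long.txt` and their contents are hypothetical, chosen only for this example.

```python
import os
import tempfile

def file_size(filename):
    """Function that counts the number of characters in a file"""
    with open(filename, 'r') as file:
        return len(file.read())

with tempfile.TemporaryDirectory() as tmp:
    # Two throwaway files with different contents (hypothetical names).
    short_path = os.path.join(tmp, "short.txt")
    long_path = os.path.join(tmp, "long.txt")
    with open(short_path, "w") as f:
        f.write("abc")
    with open(long_path, "w") as f:
        f.write("abcdefghij")

    short_count = file_size(short_path)
    long_count = file_size(long_path)

# The result now depends on which file was passed in.
print(short_count, long_count)
```

With the hard-coded `filename = "data.txt"` still in place, both calls would read the same file and return the same number.
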
Daniel

Super-efficient solution for ASCII files (runs in Θ(1)):

import os
print(os.stat(filename).st_size)
Sam Chats
  • Not necessarily what you want, unless you assume the file is in some 8-bit encoding, like ASCII. At least, it won't be equivalent to the OP's code. To be equivalent, the file would need to be opened in `rb` mode. – juanpa.arrivillaga Aug 31 '17 at 08:32
  • What juanpa said. This counts bytes, not chars, so it won't always give the correct number, e.g. if the file is encoded as UTF-8 and it contains non-ASCII chars. – PM 2Ring Aug 31 '17 at 08:38
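
The difference the comments describe can be seen with a small sketch. It assumes a UTF-8 encoded file; the sample string and the file name `sample.txt` are made up for this example.

```python
import os
import tempfile

# Ten characters, two of them non-ASCII (written as escapes to avoid ambiguity).
sample = "na\u00efve caf\u00e9"

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.txt")  # hypothetical file name
    with open(path, "w", encoding="utf-8", newline="") as f:
        f.write(sample)

    # Size on disk in bytes, as reported by the filesystem.
    byte_count = os.stat(path).st_size

    # Number of decoded characters, as the question's approach counts them.
    with open(path, encoding="utf-8", newline="") as f:
        char_count = len(f.read())

print(byte_count, char_count)
```

Each of the two accented characters takes two bytes in UTF-8, so the byte count is larger than the character count. A similar mismatch can appear with files that use `\r\n` line endings, because text mode translates them to `\n` by default when reading.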

If you just want the file size of an ASCII file, you should use `os.stat`:

import os

def file_size(filename):
    st = os.stat(filename)
    return st.st_size

The big advantage of this function is that there's no need to read the file. Python simply asks the filesystem for the file size.

Eric Duminil

You could use sum() with a generator expression around iter(partial(f.read, 1), ''), taking inspiration from this answer:

from functools import partial

def num_chars(filename):
    """Function that counts the number of characters in a file"""
    with open(filename) as f:
        return sum(1 for _ in iter(partial(f.read, 1), ''))

The main advantage of this approach compared to using f.read() is that it is lazy, so you don't read the whole file into memory.
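
Reading one character per call keeps memory use minimal but makes many `read` calls. A possible variation, sketched here as an assumption rather than as part of the original answer, reads larger chunks and sums their lengths. The name `num_chars_chunked` and the default `chunk_size` are hypothetical.

```python
from functools import partial

def num_chars_chunked(filename, chunk_size=64 * 1024):
    """Count characters lazily, reading up to chunk_size characters at a time."""
    with open(filename) as f:
        # iter(callable, sentinel) keeps calling f.read(chunk_size)
        # until it returns the empty string at end of file.
        return sum(len(chunk) for chunk in iter(partial(f.read, chunk_size), ''))
```

In text mode `f.read(n)` returns at most `n` characters, so the total should match `len(f.read())` while only one chunk is held in memory at a time.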

Chris_Rands