Python basic program to count the letters in a file

Question

I'm writing a Python program for an online class that finds the frequency of letters in a file. The problem is that spaces are counted in the final result too. How can I omit them? Here's my code:

import string
name = raw_input('Enter a file name: ')
fhandle = open(name)
counts = dict()
for line in fhandle:
    line = line.strip()
    line = line.translate(None,string.punctuation)
    line = line.lower()
    letters = list(line)
    for letter in letters:
        counts[letter]=counts.get(letter,0)+1
lst = list()
for letter,count in counts.items():
    lst.append((count,letter))
lst.sort(reverse=True)
for count,letter in lst:
    print count,letter

There is a good summary of different methods for removing whitespace, end-of-line characters, tabs, etc. here: http://stackoverflow.com/questions/8270092/python-remove-all-whitespace-in-a-string – Tom, Jul 02 '16 at 10:09

akhilmd · Accepted Answer · 2016-07-02T12:28:17.130

4

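string.punctuation contains !"#$%&'()*+,-./:;<=>?@[\]^_`{|}~ and no whitespace characters, so translate() leaves the spaces inside each line in place (strip() only removes leading and trailing whitespace).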

You should change your call to translate() to the following:

line.translate(None,string.punctuation+string.whitespace+string.digits)

Type help(string) in the Python interpreter for more information.
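
For readers on Python 3, where `str.translate` no longer accepts `None` plus a string of characters to delete, the same idea can be sketched with `str.maketrans`. This is a minimal sketch rather than the answerer's code: the function name `strip_non_letters` and the `keep_digits` flag are hypothetical, added only for illustration.

```python
import string

def strip_non_letters(line, keep_digits=False):
    """Delete punctuation and whitespace (and, by default, digits) from line."""
    unwanted = string.punctuation + string.whitespace
    if not keep_digits:
        unwanted += string.digits
    # The third argument of str.maketrans lists the characters to delete.
    table = str.maketrans('', '', unwanted)
    return line.translate(table).lower()

print(strip_non_letters("Hello, World! 42\n"))                    # helloworld
print(strip_non_letters("Hello, World! 42\n", keep_digits=True))  # helloworld42
```

Making digit removal an explicit option keeps that side effect visible to the caller instead of hiding it inside the delete string.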

edited Jul 02 '16 at 12:28

answered Jul 02 '16 at 10:15

akhilmd


Warning: adding `string.digits` to the list means that digits are also removed – Alastair McCormack Jul 02 '16 at 10:18

score 0 · Answer 2 · answered Jul 02 '16 at 10:16


If you only want to skip the blank space when printing (and you don't want to change anything else in your code), you can add an if statement to the last for loop:

for count,letter in lst:
    if letter != ' ':
        print count,letter

answered Jul 02 '16 at 10:16

Mukherjee


score 0 · Answer 3 · answered Jul 02 '16 at 10:20


An elegant way to do this is to just use isalpha(). See line 11:

import string
name = raw_input('Enter a file name: ')
fhandle = open(name)
counts = dict()
for line in fhandle:
    line = line.strip()
    line = line.translate(None,string.punctuation)
    line = line.lower()
    letters = list(line)
    for letter in letters:
        if letter.isalpha() == True:
            counts[letter]=counts.get(letter,0)+1
lst = list()
for letter,count in counts.items():
    lst.append((count,letter))
lst.sort(reverse=True)
for count,letter in lst:
    print count,letter
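
A compact Python 3 variant of the same idea, offered as a sketch rather than as this answer's code: `collections.Counter` does the tallying and `isalpha()` does the filtering, so no `translate` step is needed. The function name `letter_counts` is hypothetical; it takes any iterable of lines, so it works on an open file or on a plain list.

```python
from collections import Counter

def letter_counts(lines):
    """Count alphabetic characters, case-insensitively, across all lines."""
    counts = Counter()
    for line in lines:
        # isalpha() is False for spaces, tabs, newlines, digits and punctuation.
        counts.update(ch for ch in line.lower() if ch.isalpha())
    return counts

sample = ["Hello, World!\n", "Spaces   and\ttabs 123\n"]
for letter, count in letter_counts(sample).most_common():
    print(count, letter)
```

To run it on a file, pass the open file handle itself, for example `letter_counts(open(name))`.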

answered Jul 02 '16 at 10:20

Jaxian


What about unicode? See [Python isalpha() and scandics](http://stackoverflow.com/questions/4286637/python-isalpha-and-scandics) – Peter Wood Jul 02 '16 at 10:23
OP said he was looking for "the frequency of letters in a file" and this does precisely that with just one additional line of code. – Jaxian Jul 02 '16 at 10:27
If the file is Unicode, `isalpha` can fail to recognise characters. Also, `if letter.isalpha() == True:` can be `if letter.isalpha():`, although it still won't work, as the line needs decoding first. See the linked question. – Peter Wood Jul 02 '16 at 10:33
I've tested that code I submitted multiple times with multiple text files and it works fine. – Jaxian Jul 02 '16 at 10:36
[*'Testing shows the presence, not the absence of bugs'*](https://en.wikiquote.org/wiki/Edsger_W._Dijkstra#1960s). It would fail with a file containing `äöå`. – Peter Wood Jul 02 '16 at 10:37
I understand that, but the OP did not specify looking for characters outside the Latin alphabet, so I put together the best solution for that situation. – Jaxian Jul 02 '16 at 10:41

If they didn't specify it, why assume? You can ask questions in comments on the original question, or describe the limitations and assumptions you've made in your answer. – Peter Wood Jul 02 '16 at 10:42
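
The decoding point raised in these comments can be illustrated in Python 3, where the difference between undecoded bytes and decoded text is explicit. This is a sketch that assumes the data is UTF-8 encoded, which the question does not state.

```python
raw = "äöå".encode("utf-8")  # what reading the file in binary mode would yield

# bytes.isalpha() only recognises ASCII letters, so the undecoded data fails.
print(raw.isalpha())   # False

# After decoding, str.isalpha() uses Unicode character properties.
text = raw.decode("utf-8")
print(text.isalpha())  # True
print([ch for ch in text if ch.isalpha()])
```

In Python 2, the corresponding step would be to call `line.decode('utf-8')` before testing each character with `isalpha()`, again assuming the file is UTF-8.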
