Python: How do I ignore non letters in a string?

Question

The function prints the frequency of each letter in a file, but I can't get it to ignore non-letters. I only want it to count letters when working out the percentage frequency of each letter. This is what I have so far:

from string import ascii_lowercase as lowercase

def calcFrequencies(file):
    """Enter file name in quotations. Shows the frequency of letters in a file"""
    infile = open(file)
    text = infile.read()
    text = text.lower()

    text_length = len(text)
    counts = [0]*26

    for i in range(26):
        char=lowercase[i]
        counts[i] = 100*text.count(char)/text_length
        print("{:.1f}% of the characters are '{}'".format(counts[i],char))
    infile.close()

score 3 · Answer 1 · answered Jan 30 '14 at 02:32


Use `filter` (the session below is from Python 2, where `filter` applied to a string returns a string):

>>> text = "abcd1234efg"
>>> filter(str.isalpha, text)
'abcdefg'
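On Python 3, `filter` returns a lazy iterator instead of a string, so the result needs to be joined back together. A minimal sketch, assuming Python 3 (the variable name `letters_only` is just for illustration):

```python
text = "abcd1234efg"

# filter() yields the characters for which str.isalpha is true;
# "".join() reassembles them into a single string.
letters_only = "".join(filter(str.isalpha, text))
print(letters_only)  # abcdefg
```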


mhlester


`filter` is deprecated and best avoided. Use a list comprehension or a loop instead – Abhijit Jan 30 '14 at 02:45

I wasn't aware. Can you share a link? – mhlester Jan 30 '14 at 02:48
@Abhijit, I haven't found any literature to that effect. Also, in my test filter ran 43% faster than list comprehension in this case. – mhlester Jan 30 '14 at 05:10

jayelm · Accepted Answer · 2014-01-30T02:44:24.620

1

You could use `str.join` with a list comprehension (faster than a generator expression here) to rebuild the string from only its alphabetic characters before counting:

text = ''.join([char for char in text if char.isalpha()])
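To show where the filtering step fits in the original function, here is a rough sketch that assumes Python 3 (for true division). The function name `letter_frequencies` and its dict return value are hypothetical, not from the question. It filters against `ascii_lowercase` rather than `isalpha()`, since `isalpha()` also accepts non-ASCII letters such as 'é', which would be included in the total but never counted among the 26 letters.

```python
from string import ascii_lowercase

def letter_frequencies(text):
    """Return {letter: percentage}, counting only the letters a-z."""
    text = text.lower()
    # Keep only a-z so digits, spaces, and punctuation do not inflate the total.
    letters = [ch for ch in text if ch in ascii_lowercase]
    total = len(letters)
    if total == 0:
        # No letters at all: avoid dividing by zero.
        return {ch: 0.0 for ch in ascii_lowercase}
    return {ch: 100 * letters.count(ch) / total for ch in ascii_lowercase}

freqs = letter_frequencies("ab1 b!")
print("{:.1f}% of the letters are '{}'".format(freqs["b"], "b"))
```

With a file, you would pass in the result of `infile.read()` and loop over `ascii_lowercase` to print each percentage as in the question.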

edited Jan 30 '14 at 02:44

answered Jan 30 '14 at 02:28


@Foflo just be advised, if you care about speed I believe mhlester's solution is faster. – jayelm Jan 30 '14 at 02:35
I think he was downvoted before my answer. That said it works so the downvote is unwarranted – mhlester Jan 30 '14 at 02:38
I was not the downvoter. But remember that passing a generator to `str.join` is slower than passing a list, so @aj8uppal's answer would end up being faster than yours. – Abhijit Jan 30 '14 at 02:41
@Abhijit I actually didn't know that, could you tell me why? – jayelm Jan 30 '14 at 02:42

@jmu303: Check this [SO answer](http://stackoverflow.com/a/9061024/977038) and the Question [List comprehension vs generator expression's weird timeit results?](http://stackoverflow.com/q/11964130/977038) – Abhijit Jan 30 '14 at 02:45
