How to stop 's' from re-occurring twice in my loop?

Question

So this is my code:

with open('cipher.txt') as f:
  f = f.read().replace(' ', '')

new = []
for i in f:
    new = sorted([i + ' ' + str(f.count(i)) for i in f])
for o in new:
  print(o)

This is the text file:

xli uymgo fvsar jsb

It's supposed to print each letter that is used, followed by the number of times it is used, in alphabetical order. The problem is that the letter 's' (or any letter that has a .count() of 2) is printed twice, and I only want it printed once. How can I do this?

This is what I'm getting:

a 1
b 1
f 1
g 1
i 1
j 1
l 1
m 1
o 1
r 1
s 2
s 2
u 1
v 1
x 1
y 1

But this is what I want:

a 1
b 1
f 1
g 1
i 1
j 1
l 1
m 1
o 1
r 1
s 2
u 1
v 1
x 1
y 1

Martijn Pieters · Answer 1 · 2013-09-02T09:14:22.613

1

You are looking for collections.Counter() instead:

from collections import Counter

with open('cipher.txt') as f:
    new = Counter(f.read().replace(' ', ''))

for letter, count in new.most_common():
    print(letter, count)

or, alternatively printing the letters in sorted order:

for letter in sorted(new):
    print(letter, new[letter])

Counter.most_common() sorts the results by counts, descending. sorted(new) on the other hand returns a sorted list of the keys of the Counter dictionary, so that version more closely matches your attempted output.
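To make the difference between the two orderings concrete, here is a minimal sketch that uses the sample text from the question as a string literal instead of reading cipher.txt (that substitution is an assumption made so the example is self-contained):

```python
from collections import Counter

# Sample text taken from the question; spaces are stripped before counting.
counts = Counter('xli uymgo fvsar jsb'.replace(' ', ''))

# most_common() orders by count, highest first, so 's' leads the list.
by_count = counts.most_common()
print(by_count[0])          # ('s', 2)

# sorted() over the Counter yields its keys in alphabetical order.
by_letter = [(letter, counts[letter]) for letter in sorted(counts)]
print(by_letter[:3])        # [('a', 1), ('b', 1), ('f', 1)]
```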

Your code instead used f.count(i) to count each letter every time you encountered it, so a letter that appears twice produces two identical entries. You'd normally use a dictionary to track counts, which also avoids a full scan of the text with str.count() for every character:

counts = {}
for letter in f:
    counts[letter] = counts.get(letter, 0) + 1

for letter in sorted(counts):
    print(letter, counts[letter])
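As a rough illustration of why the original loop printed 's 2' twice, the sketch below compares one entry per character with one entry per distinct letter. The sample text is hard-coded and the variable names are made up for this example:

```python
text = 'xli uymgo fvsar jsb'.replace(' ', '')

# One entry per character: 's' occurs twice, so 's 2' is produced twice.
per_character = sorted(ch + ' ' + str(text.count(ch)) for ch in text)
print(len(per_character))            # 16

# One entry per distinct letter, counted in a single pass with a dict.
counts = {}
for ch in text:
    counts[ch] = counts.get(ch, 0) + 1
per_letter = [ch + ' ' + str(counts[ch]) for ch in sorted(counts)]
print(len(per_letter))               # 15
```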

edited Sep 02 '13 at 09:14

answered Sep 02 '13 at 09:05

Martijn Pieters

Don't think OP wants to import: http://stackoverflow.com/questions/18568309/sorting-results/18568345#18568345 – TerryA Sep 02 '13 at 09:07
Only thing to change would be to do `sorted(new.keys())` and iterate over that in order to print in alpha order. May also want to discard punctuation characters, but the OP's question does not make it clear whether they would exist. – sberry Sep 02 '13 at 09:07
@sberry: Why `new.keys()` when `sorted(new)` suffices? I was already editing that part in, but the biggest problem was that the OP was using a list comprehension where a dictionary was needed. – Martijn Pieters Sep 02 '13 at 09:09
Right you are about not needing to call keys, though I assume the OP might miss the fact that iterating over a dictionary iterates over the keys. – sberry Sep 02 '13 at 09:11

Burhan Khalid · Answer 2 · 2013-09-02T09:11:32.397

The easy way to do this is with collections.Counter:

from collections import Counter

s = "xli uymgo fvsar jsb"

for letter, count in Counter(i for i in s if i != ' ').items():
    print(letter, count)

To solve your problem, you can convert the list to a set, or use a defaultdict. Here is the defaultdict implementation:

from collections import defaultdict

d = defaultdict(int)

# f is the text read from cipher.txt, as in the question
for i in f:
    d[i] += 1

for k in sorted(d):
    print(k, d[k])

The defaultdict implementation is also handy if you are unable to use Counter (it requires Python 2.7+).
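The set route mentioned above could look roughly like this. It is only a sketch, assuming the text is already in memory as a string (hard-coded here from the question). It still calls str.count() once per distinct letter, so the dictionary approaches scale better for long texts:

```python
text = 'xli uymgo fvsar jsb'.replace(' ', '')

# set(text) keeps each letter once; sorted() restores alphabetical order.
lines = [letter + ' ' + str(text.count(letter)) for letter in sorted(set(text))]
for line in lines:
    print(line)
```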

score 1 · Answer 3 · answered Sep 02 '13 at 09:06

In order to get a count of the number of times each character appears in your text file, you should use the following code:

from collections import Counter

def get_char_count_from_file(file_path):
    with open(file_path) as f:
        return Counter(f.read())

Example:

>>> get_char_count_from_file('C:/Python27/README.txt')
Counter({' ': 10634, 'e': 4067, 't': 3269, 'i': 2799, 'o': 2791, 'n': 2438, 's': 2307, 'a': 2283, 'r': 2183, 'l': 1848, 'h': 1469, 'u': 1278, '\n': 1229, 'd': 1225, 'c': 1196, '-': 1116, 'p': 969, 'm': 899, 'f': 846, 'y': 791, '.': 770, 'b': 697, 'g': 672, 'w': 488, ',': 408, '/': 326, 'k': 288, 'v': 286, 'T': 250, 'S': 223, 'P': 212, 'I': 198, 'C': 191, 'x': 177, '"': 176, ')': 176, '(': 162, '=': 125, ':': 119, 'O': 115, 'E': 108, 'D': 102, '2': 95, 'R': 95, 'A': 94, 'M': 94, '_': 89, 'N': 85, 'L': 84, "'": 84, '1': 78, 'X': 71, '0': 69, 'U': 65, 'G': 63, '4': 53, 'H': 53, 'B': 49, '3': 48, '+': 44, 'W': 42, 'F': 40, '5': 39, 'q': 36, 'Y': 35, '6': 31, 'z': 30, ';': 25, 'V': 22, 'j': 22, '8': 21, '9': 18, '$': 17, '@': 16, '7': 15, '<': 13, '>': 13, '\\': 11, '!': 11, '*': 10, '{': 8, '}': 8, 'K': 7, '`': 6, 'J': 6, '#': 5, 'Q': 5, '&': 4, '?': 3, 'Z': 3, '~': 3, '[': 2, '\t': 2, ']': 2})

How you can use that:

>>> for k, v in sorted(Counter('xli uymgo fvsar jsb').items()):
...     print(k, v)

  3
a 1
b 1
f 1
g 1
i 1
j 1
l 1
m 1
o 1
r 1
s 2
u 1
v 1
x 1
y 1
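Note that counting the raw text also counts spaces and newlines, as the first row of the output above shows. A minimal sketch for skipping them, assuming only alphabetic characters are of interest (count_letters is a hypothetical helper name, not part of the answer):

```python
from collections import Counter

def count_letters(text):
    """Count only alphabetic characters, ignoring spaces, newlines and punctuation."""
    return Counter(ch for ch in text if ch.isalpha())

# Sample text from the question, with a trailing newline as a file might have.
counts = count_letters('xli uymgo fvsar jsb\n')
for letter in sorted(counts):
    print(letter, counts[letter])
```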

NPE · Accepted Answer · 2013-09-02T09:43:05.153

1

I'd use collections.Counter for this:

import collections

s = 'xli uymgo fvsar jsb'
cnt = collections.Counter(s.replace(' ', ''))
for letter in sorted(cnt):
    print(letter, cnt[letter])

This prints out

a 1
b 1
f 1
g 1
i 1
j 1
l 1
m 1
o 1
r 1
s 2
u 1
v 1
x 1
y 1

edited Sep 02 '13 at 09:43

answered Sep 02 '13 at 09:07

NPE

Another point: the OP tagged this as Python 3, but your `print` statement won't work on 3. – Martijn Pieters Sep 02 '13 at 09:22

score 1 · Answer 5 · answered Sep 02 '13 at 09:08

with open('cipher.txt') as f:
    f = f.read().replace(' ', '')

# Collect the "letter count" strings in a set to drop duplicates, then sort once.
new = sorted(set(i + ' ' + str(f.count(i)) for i in f))
for o in new:
    print(o)

answered Sep 02 '13 at 09:08

TangbiaoJiujiu
