Possible Duplicate:
how to get the number of occurrences of each character using python

What is the best way to count the occurrences of each character in a string and store the counts? I'm using a dictionary for this; can that choice make a big difference? A couple of ways that I thought of:

1.

for character in string:
    if character in characterCountsDict:
        characterCountsDict[character] += 1
    else:
        characterCountsDict[character] = 1

2.

character = 0
while character < 127:
    characterCountsDict[str(unichr(character))] = string.count(str(unichr(character)))
    character += 1

I think the second method is better... But is either of them good? Is there a much better way to do this?

Jayanth Koushik

2 Answers

>>> from collections import Counter
>>> Counter("asdasdff")
Counter({'a': 2, 's': 2, 'd': 2, 'f': 2})

Note that Counter is a dict subclass, so you can use a Counter object like a dict.
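As a minimal sketch of what "like a dict" means in practice (the variable names `counts`, `top` and `plain` are only illustrative), the usual dict operations work, with one difference: looking up a missing key gives 0 rather than raising KeyError.

```python
from collections import Counter

counts = Counter("asdasdff")

# Indexing works like a dict; a missing key yields 0 instead of KeyError.
print(counts["a"])
print(counts["z"])

# most_common(n) returns the n highest counts as (element, count) pairs.
top = counts.most_common(1)

# A plain dict can be recovered if one is needed.
plain = dict(counts)
```

If only a regular dictionary is wanted in the end, `dict(Counter(s))` is probably the shortest way to get one.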

defuz

If you're interested in the most efficient way, in the timings below it was a defaultdict:

from collections import defaultdict

def count_chars(s):
    res = defaultdict(int)
    for char in s:
        res[char] += 1
    return res

Timings:

from collections import Counter, defaultdict

def test_counter(s):
    return Counter(s)

def test_get(s):
    res = {}
    for char in s:
        res[char] = res.get(char, 0) + 1
    return res

def test_in(s):
    res = {}
    for char in s:
        if char in res:
            res[char] += 1
        else:
            res[char] = 1
    return res

def test_defaultdict(s):
    res = defaultdict(int)
    for char in s:
        res[char] += 1
    return res


s = open('/usr/share/dict/words').read()
#eof

import timeit

test = lambda f: timeit.timeit(f + '(s)', setup, number=10)
setup = open(__file__).read().split("#eof")[0]
results = ['%.4f %s' % (test(f), f) for f in dir() if f.startswith('test_')]
print('\n'.join(sorted(results)))

Results (seconds for 10 runs of each function):

0.8053 test_defaultdict
1.3628 test_in
1.6773 test_get
2.3877 test_counter
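
For anyone who wants to repeat the measurement without the `__file__`/`#eof` trick, here is a rough sketch that passes callables to timeit directly. The function names and the `sample` string are made up for illustration (it does not read /usr/share/dict/words), and the numbers it prints will depend on the machine and Python version, so they may not match the ordering above.

```python
import timeit
from collections import Counter, defaultdict

def count_with_defaultdict(s):
    res = defaultdict(int)
    for char in s:
        res[char] += 1
    return res

def count_with_counter(s):
    return Counter(s)

# Hypothetical sample input, not the word list used in the timings above.
sample = "the quick brown fox jumps over the lazy dog\n" * 2000

# timeit accepts a zero-argument callable as the statement to time.
timings = {}
for func in (count_with_defaultdict, count_with_counter):
    timings[func.__name__] = timeit.timeit(lambda: func(sample), number=10)

# Print fastest first, in the same "seconds name" layout as above.
for name, seconds in sorted(timings.items(), key=lambda item: item[1]):
    print('%.4f %s' % (seconds, name))
```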
georg