I have this string: s = "china construction bank". I want to create a function that returns the 3 most frequent characters, ordered by the number of times they appear. If two characters appear the same number of times, they should be ordered alphabetically. I also want to print each character on a separate line.

This is the code I have so far:

from collections import Counter
def ordered_letters(s, n=3):
    ctr = Counter(c for c in s if c.isalpha())
    print ''.join(sorted(x[0] for x in ctr.most_common(n)))[0], '\n', ''.join(sorted(x[0] for x in ctr.most_common(n)))[1], '\n', ''.join(sorted(x[0] for x in ctr.most_common(n)))[2]

This code applied to the above string will yield:

a 
c 
n

But this is not what I really want. What I would like as output is:

1st most frequent: 'n'. Appearances: 4
2nd most frequent: 'c'. Appearances: 3
3rd most frequent: 'a'. Appearances: 2

I'm stuck on the part where I have to print characters with the same frequency in alphabetical order. How could I do this?

Thank you very much in advance

jpp
Miguel 2488

4 Answers

You can use heapq.nlargest with a custom sort key. We use -ord(x[0]) as a secondary key so that, among letters with equal counts, the alphabetically earlier letter ranks higher. A heap queue can be cheaper than sorted when n is small, since there's no need to fully sort every item in your Counter object.

from collections import Counter
from heapq import nlargest

def ordered_letters(s, n=3):
    ctr = Counter(c.lower() for c in s if c.isalpha())

    def sort_key(x):
        return (x[1], -ord(x[0]))

    for idx, (letter, count) in enumerate(nlargest(n, ctr.items(), key=sort_key), 1):
        print('#', idx, 'Most frequent:', letter, '.', 'Appearances:', count)

ordered_letters("china construction bank")

# 1 Most frequent: n . Appearances: 4
# 2 Most frequent: c . Appearances: 3
# 3 Most frequent: a . Appearances: 2
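
As a quick check of the tie-breaking key, here is a minimal sketch that compares the heap-based approach with a plain full sort. The helper names top_letters and top_letters_sorted are made up for this illustration, and both return the pairs instead of printing them.

```python
from collections import Counter
from heapq import nlargest

def top_letters(s, n=3):
    """Return the n most frequent letters as (letter, count) pairs."""
    counts = Counter(c.lower() for c in s if c.isalpha())
    # Higher count wins; among equal counts the smaller code point wins,
    # because negating ord() reverses the comparison.
    return nlargest(n, counts.items(), key=lambda kv: (kv[1], -ord(kv[0])))

def top_letters_sorted(s, n=3):
    """Same ranking via a full sort, for comparison."""
    counts = Counter(c.lower() for c in s if c.isalpha())
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:n]

text = "china construction bank"
print(top_letters(text, 5))
# [('n', 4), ('c', 3), ('a', 2), ('i', 2), ('o', 2)]
print(top_letters(text, 5) == top_letters_sorted(text, 5))
# True
```

The three letters tied at 2 come out as a, i, o, which is the alphabetical order asked for in the question.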
jpp

You can sort c.most_common() with a custom key that considers descending order of frequency first and then the alphabetical order second (note lambda x: (-x[1], x[0])):

from collections import Counter

def ordered_letters(s, n=3):
    c = Counter(s.replace(' ', ''))
    top_n = sorted(c.most_common(), key=lambda x: (-x[1], x[0]))[:n]
    for i, (letter, freq) in enumerate(top_n):
        # Unpack into new names so the Counter `c` is not shadowed.
        if i == 0: print('1st most frequent', letter + '.', 'Appearances:', freq)
        elif i == 1: print('2nd most frequent', letter + '.', 'Appearances:', freq)
        elif i == 2: print('3rd most frequent', letter + '.', 'Appearances:', freq)
        else: print(str(i + 1) + 'th most frequent', letter + '.', 'Appearances:', freq)

sent = "china construction bank"
ordered_letters(sent, 5)
# 1st most frequent n. Appearances: 4
# 2nd most frequent c. Appearances: 3
# 3rd most frequent a. Appearances: 2
# 4th most frequent i. Appearances: 2
# 5th most frequent o. Appearances: 2
slider

Sort the (letter, count) tuples from Counter the regular way, but with the count negated in the key. This puts the highest counts first, while letters with equal counts stay in alphabetical order. Then take the first n items.

from collections import Counter

# Integer division (//) keeps the suffix lookup correct on both Python 2 and 3.
ordinal = lambda n: "%d%s" % (n,"tsnrhtdd"[(n//10%10!=1)*(n%10<4)*n%10::4])

def ordered_letters(s, n=3):
    ctr = Counter(c for c in s if c.isalpha())
    ctr = sorted(ctr.items(), key=lambda x: (-x[1], x[0]))[:n]
    for index,value in enumerate(ctr):
        print("{:s} most frequent: '{:}'. Appearances: {:}".format(ordinal(index+1),value[0],value[1]))

s = "achina aconstruction banck"
ordered_letters(s, n=3)

Result:

1st most frequent: 'a'. Appearances: 4
2nd most frequent: 'c'. Appearances: 4
3rd most frequent: 'n'. Appearances: 4

(Freaky ordinal lambda courtesy of Ordinal numbers replacement)
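
The one-line ordinal lambda is dense, so here is a more readable sketch of the same idea, spelled out as a regular function. This is a rewrite for illustration, not the code from the linked answer.

```python
def ordinal(n):
    """Return n with its English ordinal suffix, e.g. 1 -> '1st'."""
    # Numbers whose tens digit is 1 (11, 12, 13, 111, ...) always take 'th'.
    if n // 10 % 10 == 1:
        suffix = "th"
    else:
        # Otherwise the last digit decides; anything but 1, 2, 3 takes 'th'.
        suffix = {1: "st", 2: "nd", 3: "rd"}.get(n % 10, "th")
    return "%d%s" % (n, suffix)

print([ordinal(i) for i in (1, 2, 3, 4, 11, 21)])
# ['1st', '2nd', '3rd', '4th', '11th', '21st']
```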

Jongware

You could use defaultdict to create a dictionary whose values default to 0, and increment a letter's count each time it is encountered. Then sort alphabetically first and by occurrences second. Because Python's sort is stable, letters with the same count keep their alphabetical order.

E.g.:

from collections import defaultdict
a = defaultdict(int)  # missing keys start at 0

s = "china construction bank"

for letter in s:
    if letter != ' ':
        a[letter] += 1

top_three = sorted(sorted(a.items(), key=lambda x: x[0]), key=lambda x: x[1], reverse=True)[:3]

for rank, (letter, occurrence) in enumerate(top_three, 1):
    print(str(rank) + " Most frequent: " + letter + " . Appearances: " + str(occurrence))

This gives the ordering you specified:

1 Most frequent: n . Appearances: 4
2 Most frequent: c . Appearances: 3
3 Most frequent: a . Appearances: 2
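
To see why sorting twice works, here is a small sketch comparing the two-pass sort with a single sort on a tuple key. The variable names two_pass and one_pass are only for this illustration.

```python
from collections import Counter

counts = Counter("china construction bank".replace(" ", ""))

# Two passes: alphabetical first, then by count. Python's sort is stable
# (also with reverse=True), so equal counts keep their alphabetical order.
two_pass = sorted(sorted(counts.items()), key=lambda kv: kv[1], reverse=True)

# One pass: a tuple key with the count negated expresses both rules at once.
one_pass = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))

print(two_pass == one_pass)
# True
print(one_pass[:3])
# [('n', 4), ('c', 3), ('a', 2)]
```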
PL200