8

Python's collections.Counter.most_common(n) method returns the top n elements with their counts. However, if the counts for two elements are the same, how can I return the result sorted in alphabetical order?

For example, for a string like BBBAAACCD and n = 2 (the "2 most common" counts), I want the result to be:

[('A', 3), ('B', 3), ('C', 2)]

and NOT:

[('B', 3), ('A', 3), ('C', 2)]

Notice that although A and B have the same frequency, A comes before B in the resultant list since it comes before B in alphabetical order.

How can I achieve that?

glibdud
stfd1123581321
  • 1
    Possible duplicate of [How to sort Counter by value? - python](http://stackoverflow.com/questions/20950650/how-to-sort-counter-by-value-python) – justderb Apr 18 '17 at 05:55
  • 1
    @HarshaW no, it's not a duplicate. I just updated my question to clarify what I am trying to achieve. Please review and let me know if you have some thoughts. – stfd1123581321 Apr 18 '17 at 17:35

8 Answers

4

Although this question is already a bit old, I'd like to suggest a very simple solution: sort the input to Counter() before creating the Counter object itself. If you then call most_common(n), elements with equal counts come out in alphabetical order, because ties keep the order in which the elements were first encountered (guaranteed since Python 3.7).

from collections import Counter

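# Sorting the input first means ties are encountered in alphabetical order.
char_counter = Counter(sorted('ccccbbbbdaef'))
for char in char_counter.most_common(3):
    print(*char)  # unpack the (element, count) tuple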

resulting in the output:

b 4
c 4
a 1
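
To see the effect on the string from the question, here is a minimal sketch comparing unsorted and sorted input. It assumes Python 3.7 or later, where Counter keeps insertion order and most_common() keeps first-encountered order for ties; the variable names are only for illustration.

```python
from collections import Counter

text = 'BBBAAACCD'

# Without sorting, ties follow the order of first appearance in the input.
unsorted_top = Counter(text).most_common(2)

# With sorted input, ties are first encountered in alphabetical order.
sorted_top = Counter(sorted(text)).most_common(2)

print(unsorted_top)
print(sorted_top)
```

Note that this returns exactly n entries, so 'C' is not part of the result for n = 2.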
DJSchaffner
2

There are two issues here:

  1. Take the top n distinct counts, and include every element whose count is among them (ties do not use up extra slots).
  2. For elements with equal counts, order alphabetically.

None of the solutions thus far addresses the first issue. You can use a heap queue with the itertools unique_everseen recipe (also available in 3rd party libraries such as toolz.unique) to calculate the nth largest distinct count.

Then use sorted with a custom key.

from collections import Counter
from heapq import nlargest
from toolz import unique

x = 'BBBAAACCD'

c = Counter(x)
n = 2
nth_largest = nlargest(n, unique(c.values()))[-1]

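def sort_key(item):
    # Count descending, then element ascending.
    return -item[1], item[0]

# Keep every element whose count is at least the nth largest distinct count.
gen = ((k, v) for k, v in c.items() if v >= nth_largest)
res = sorted(gen, key=sort_key)
print(res)

Output:

[('A', 3), ('B', 3), ('C', 2)]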
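
If you would rather avoid the toolz dependency, the same idea can be sketched with the standard library only. This is not the code from the answer above: it swaps heapq and unique for a plain sorted(set(...)), and the helper name top_n_with_ties is made up for illustration.

```python
from collections import Counter


def top_n_with_ties(iterable, n):
    """Return (element, count) pairs whose count is among the n largest distinct counts."""
    counts = Counter(iterable)
    if not counts or n <= 0:
        return []
    # Distinct counts, largest first; the last of the first n is the cutoff.
    distinct = sorted(set(counts.values()), reverse=True)
    cutoff = distinct[:n][-1]
    kept = ((k, v) for k, v in counts.items() if v >= cutoff)
    # Count descending, then element ascending.
    return sorted(kept, key=lambda kv: (-kv[1], kv[0]))


print(top_n_with_ties('BBBAAACCD', 2))
```

Sorting the distinct counts is simpler than a heap and should be fine unless the number of distinct counts is very large.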
jpp
1

I would first sort the output list in alphabetical order and then sort again by number of occurrences. Because Python's sort is stable, the second sort keeps the alphabetical order among equal counts:

from collections import Counter
alphabetic_sorted = sorted(Counter('BBBAAACCD').most_common(), key=lambda tup: tup[0])
final_sorted = sorted(alphabetic_sorted, key=lambda tup: tup[1], reverse=True)
print(final_sorted[:3])

Output:

[('A', 3), ('B', 3), ('C', 2)]
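
The two-pass idea relies only on sort stability, so it can also be written with in-place list.sort() calls and operator.itemgetter. A small sketch, using a deliberately non-alphabetical input string chosen for illustration:

```python
from collections import Counter
from operator import itemgetter

pairs = list(Counter('DCCBBBAAA').items())  # deliberately not alphabetical

# Pass 1: secondary key (element, ascending).
pairs.sort(key=itemgetter(0))
# Pass 2: primary key (count, descending); stability preserves pass 1 among ties.
pairs.sort(key=itemgetter(1), reverse=True)

print(pairs[:3])
```

The general rule is to sort by the least significant key first and the most significant key last.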
Bono
0

I would go for:

from collections import Counter
sorted(Counter('AAABBBCCD').most_common(), key=lambda t: (-t[1], t[0]))

This sorts by count descending (the order most_common() already returns, which should make the sort faster) and then by name ascending within each group of equal counts.
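
If only the first few entries are needed, the same composite key can be passed to heapq.nsmallest instead of sorting the whole list. A minimal sketch (the variable names are assumptions for illustration):

```python
from collections import Counter
from heapq import nsmallest

counts = Counter('AAABBBCCD')

# Smallest (-count, name) means highest count first, then alphabetical.
top3 = nsmallest(3, counts.items(), key=lambda t: (-t[1], t[0]))
print(top3)
```

Like most_common(n), this returns exactly n entries and does not add extra elements for ties at the cutoff.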

Mr_and_Mrs_D
  • But this doesn't include duplicates when user wants "top 2 values", e.g. see [my answer](https://stackoverflow.com/a/53156058/9209546). – jpp Nov 05 '18 at 14:15
  • Oh that wasn't explicitly stated in the question – Mr_and_Mrs_D Nov 05 '18 at 16:19
  • Possibly, but it's implicit in the output. I agree that the question could be more clearly written (I'll try to do that). – jpp Nov 05 '18 at 16:33
0

This is one of the problems I got in an interview exam and failed to solve. I came home, slept for a while, and the solution came to mind.

from collections import Counter


def bags(items):
    cnt = Counter(items)
    print(cnt)
    order = sorted(cnt.most_common(2), key=lambda i: (i[1], i[0]), reverse=True)
    print(order)
    return order[0][0]


print(bags(['a', 'b', 'c', 'a', 'b']))
Joe Root
  • But with more elements in the list, the result isn't sorted in alphabetical order. For example, try `print(bags(['a','b','c','a','b', 'c', 'c', 'd', 'd']))` with `most_common(3)`: it sometimes returns `[('c', 3), ('b', 2), ('a', 2)]`, but I expect `[('c', 3), ('a', 2), ('b', 2)]` – Vadim Mar 26 '19 at 07:55
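
One way to address the ordering issue raised in the comment above is to negate the count in the sort key instead of using reverse=True, and to truncate only after sorting. The sketch below is an assumption about what the answer was aiming for, not the author's code; most_common_alpha is a hypothetical helper name.

```python
from collections import Counter


def most_common_alpha(items, n):
    """Hypothetical helper: top n (element, count) pairs, ties in alphabetical order."""
    counts = Counter(items)
    # Sort everything first, then truncate, so no tied element is dropped early.
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return ordered[:n]


print(most_common_alpha(['a', 'b', 'c', 'a', 'b', 'c', 'c', 'd', 'd'], 3))
```

With reverse=True, both parts of the key are reversed, which is why tied elements came out in reverse alphabetical order.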
0
from collections import Counter


s = 'qqweertyuiopasdfghjklzxcvbnm'

s_list = list(s)

elements = Counter(s_list).most_common()

print(elements)
alphabet_sort = sorted(elements, key=lambda x: x[0])
print(alphabet_sort)
num_sort = sorted(alphabet_sort, key=lambda x: x[1], reverse=True)
print(num_sort)

If you only need the top few entries, take a slice:

print(num_sort[:3])
Vadim
0
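s = "BBBAAACCD"
p = [(i, s.count(i)) for i in sorted(set(s))]

This works if you are okay with not using Counter. Note that p is ordered alphabetically only, so it still needs a (stable) sort by count to match the requested output.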
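
For longer strings, calling s.count() once per distinct character rescans the string each time. For comparison, here is a sketch that builds the same pairs in a single pass with Counter and then orders them by count; the name by_count is made up for illustration.

```python
from collections import Counter

s = "BBBAAACCD"

# Same alphabetical (char, count) pairs, built in a single pass over s.
p = sorted(Counter(s).items())

# Stable sort by count, descending; alphabetical order survives among ties.
by_count = sorted(p, key=lambda pair: pair[1], reverse=True)
print(by_count)
```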

Preetham
-2
from collections import Counter
print(sorted(Counter('AAABBBCCD').most_common(3)))

This question seems to be a duplicate of How to sort Counter by value? - python

Community
voidpro
  • 1
    This doesn't work. If you make 'A' not the most common, it undoes the most_common ordering and returns the result in alphabetical order. – AgentBawls Sep 26 '17 at 20:49