10

I am trying to use Counter to sort letters by occurrence and put any that have the same frequency into alphabetical order, but I can't get access to the values of the dictionary that it produces.

import collections

letter_count = collections.Counter("alphabet")
print(letter_count)

produces:

Counter({'a': 2, 'l': 1, 't': 1, 'p': 1, 'h': 1, 'e': 1, 'b': 1})

How can I get it ordered by frequency, then by alphabetical order, so everything that shows up only once is in alphabetical order?

Arya McCarthy
iFunction

4 Answers

15

It sounds like your question is how to sort the entire list by frequency, then break ties alphabetically. You can sort the entire list like this:

>>> a = sorted(letter_count.items(), key=lambda item: (-item[1], item[0]))
>>> print(a)
# [('a', 2), ('b', 1), ('e', 1), ('h', 1), ('l', 1), ('p', 1), ('t', 1)]

If you still want the output to be a dict, you can convert it into a collections.OrderedDict:

>>> collections.OrderedDict(a)
# OrderedDict([('a', 2),
#              ('b', 1),
#              ('e', 1),
#              ('h', 1),
#              ('l', 1),
#              ('p', 1),
#              ('t', 1)])

This preserves the ordering, as you can see. 'a' is first because it's most frequent. Everything else is sorted alphabetically.
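As a self-contained sketch of the approach above, the sort can be wrapped in a small helper. The name `sort_by_frequency_then_alpha` is hypothetical and chosen only for illustration. The conversion to a plain dict at the end assumes Python 3.7 or later, where dicts preserve insertion order.

```python
import collections


def sort_by_frequency_then_alpha(text):
    """Return (letter, count) pairs, most frequent first, ties alphabetical."""
    counts = collections.Counter(text)
    # Tuples compare element by element: -count first (descending frequency),
    # then the letter itself (ascending alphabetical order) to break ties.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


pairs = sort_by_frequency_then_alpha("alphabet")
print(pairs)
# [('a', 2), ('b', 1), ('e', 1), ('h', 1), ('l', 1), ('p', 1), ('t', 1)]

# Assumption: Python 3.7+, where a plain dict keeps insertion order.
ordered = dict(pairs)
for letter, count in ordered.items():
    print(letter, count)
```

Iterating over `ordered.items()` gives direct access to each count, which is the value the question was trying to reach.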

Arya McCarthy
  • 2
    As I understand it, that was said because the entry that occurs twice had already been put at the front of the list; the OP wanted the ties between the hapaxes broken alphabetically. – Arya McCarthy May 19 '17 at 18:06
  • That's right, that's exactly what I was trying to do. Thank you. – iFunction May 19 '17 at 18:52
  • `lambda item: (-item[1], item[0])` How exactly does this work? – kirti purohit Oct 22 '21 at 08:17
  • @kirtipurohit, the minus sign reverses the sort order for that element (the `reverse` parameter is not used). With the key `(-item[1], item[0])`, sorting compares `-item[1]` first and, when two items are equal on it, falls back to `item[0]`. Here `item[1]` is the frequency and `item[0]` is the character. – taciturno Oct 30 '21 at 13:33
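The negation trick in the comments above only works because the counts are numeric. As a hedged alternative sketch (not the answerer's code), the same ordering can be produced with two passes, relying on the fact that Python's sort is stable, even with `reverse=True`:

```python
import collections

counts = collections.Counter("alphabet")

# Pass 1: alphabetical order by letter (the secondary criterion).
pairs = sorted(counts.items())

# Pass 2: by count, highest first (the primary criterion). The sort is
# stable, so letters with equal counts keep their alphabetical order.
pairs.sort(key=lambda item: item[1], reverse=True)

print(pairs)
# [('a', 2), ('b', 1), ('e', 1), ('h', 1), ('l', 1), ('p', 1), ('t', 1)]
```

Sorting by the secondary key first and the primary key last is the usual pattern when the primary key cannot simply be negated.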
5

You can sort the input before passing it to the counter.

>>> from collections import Counter
>>> Counter(sorted("alphabet")).most_common()
[('a', 2), ('b', 1), ('e', 1), ('h', 1), ('l', 1), ('p', 1), ('t', 1)]
sabacherli
  • 1
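    Note that this is only guaranteed to work in Python 3.7 (released June 2018) or higher, where dictionaries are guaranteed to preserve insertion order and most_common() keeps elements with equal counts in the order first encountered. – Arya McCarthy Feb 21 '21 at 17:43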
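To illustrate the caveat, here is a minimal sketch comparing the sorted-input approach with an explicit sort that does not depend on dictionary ordering. The variable names `implicit` and `explicit` are illustrative only, and the comparison assumes Python 3.7 or later.

```python
from collections import Counter

word = "alphabet"

# Relies on insertion order: sorted input means letters are first seen in
# alphabetical order, and most_common() keeps ties in first-seen order.
implicit = Counter(sorted(word)).most_common()

# Spells out both criteria, so it does not depend on dict ordering.
explicit = sorted(Counter(word).items(), key=lambda item: (-item[1], item[0]))

print(implicit == explicit)
# True on Python 3.7+
```

The explicit key is longer, but it states the intended order directly rather than inheriting it from how the Counter was filled.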
0

You can try this:

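import collections

letter_count = collections.Counter("alphabet")

# Collect the letters that occur exactly once, then sort them alphabetically.
the_letters = [a for a, b in letter_count.items() if b == 1]
the_letters.sort()
print("letters that occur only once:")

for i in the_letters:
    print(i)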

This code uses a list comprehension to build a list of all letters that occur only once, sorts it, and then prints each letter. items() returns key-value pairs, so each value can be checked to see whether it equals one.

Ajax1234
0

For the sake of completeness, to get the single-occurrence letters in alphabetical order:

import collections

letter_count = collections.Counter("alphabet")

single_occurrences = sorted(letter for letter, occurrence in letter_count.items() if occurrence == 1)
print(single_occurrences)
# prints: ['b', 'e', 'h', 'l', 'p', 't']
zwer