From the source code of the reference implementation, in the case where an integer is passed to Counter.most_common, the result is calculated like so:
return heapq.nlargest(n, self.items(), key=_itemgetter(1))
using the standard library heapq, and where _itemgetter is
from operator import itemgetter as _itemgetter
The .items of a Counter are, of course, the key-value pairs as 2-tuples, exposed as a dict_items view (since Counter subclasses the built-in dict). The key function passed to heapq.nlargest tells the algorithm how to order the elements: according to the value (i.e., element count). (A heap-based selection is used because, when only a few items are requested, it is faster than sorting all the items.)
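As a quick check of the description above, the following sketch compares Counter.most_common(n) with a hand-written call to heapq.nlargest using the same key. The sample string is arbitrary, and the expected equality assumes the reference implementation behaves as in the quoted source line.

```python
import heapq
from collections import Counter
from operator import itemgetter

counts = Counter('abracadabra')  # arbitrary sample data

# The same call the quoted source line makes, written out by hand.
by_hand = heapq.nlargest(2, counts.items(), key=itemgetter(1))

# On the reference implementation this is expected to match most_common(2).
built_in = counts.most_common(2)
print(by_hand == built_in)
```

Note that the key only looks at the count, so nothing in it controls which of several equally common elements is chosen.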
So, we can simply emulate this logic, passing our own key. The key should sort Counter items by value (count) "forwards", then by key (element) "backwards".
Since the elements in the original list are numeric, we can easily represent that:
import heapq
from collections import Counter
def smallest_most_common(seq):
    return heapq.nlargest(1, Counter(seq).items(), key=lambda i: (i[1], -i[0]))
Testing it:
>>> smallest_most_common([1, 3, 2, 2, 3])
[(2, 2)]
>>> smallest_most_common([1, 2, 3, 2, 3])
[(2, 2)]
However, this breaks for non-numeric keys, because they can't be negated:
>>> # two t's, two c's; the t shows up first but c is "smaller"
>>> smallest_most_common('witchcraft')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 2, in smallest_most_common
  File "/usr/lib/python3.8/heapq.py", line 531, in nlargest
    result = max(it, default=sentinel, key=key)
  File "<stdin>", line 2, in <lambda>
TypeError: bad operand type for unary -: 'str'
The element counts, on the other hand, will always be numeric. So, a simple trick is to switch to using heapq.nsmallest, and negate the count rather than the elements:
import heapq
from collections import Counter
def smallest_most_common(seq):
    return heapq.nsmallest(1, Counter(seq).items(), key=lambda i: (-i[1], i[0]))
(Negating a numeric part of the key is a common trick for sorting in descending order by that part while keeping the other parts ascending.)
Now everything works:
>>> smallest_most_common([1, 3, 2, 2, 3])
[(2, 2)]
>>> smallest_most_common([1, 2, 3, 2, 3])
[(2, 2)]
>>> smallest_most_common('witchcraft')
[('c', 2)]
>>> smallest_most_common('craftwitch')
[('c', 2)]
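The same key also works when more than one result is wanted. The sketch below is one possible generalization, not part of the original answer: the function name smallest_most_common_n and its parameter n are hypothetical, introduced only for illustration.

```python
import heapq
from collections import Counter

def smallest_most_common_n(seq, n):
    """Return up to n (element, count) pairs: highest count first,
    ties broken by the smallest element. Hypothetical helper name."""
    # Negating the count makes "most common" sort first under nsmallest;
    # the element itself is left as-is so it only needs to support "<".
    return heapq.nsmallest(n, Counter(seq).items(), key=lambda i: (-i[1], i[0]))

print(smallest_most_common_n('witchcraft', 3))
# [('c', 2), ('t', 2), ('a', 1)]
```

As with the single-result version, this assumes the elements are mutually comparable; a sequence mixing, say, strings and integers would still raise a TypeError when a tie has to be broken.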