Get most repeated name in a list, or first alphabetically if there is a tie

Question

I'm trying to get the most repeated name in a list; if there is a tie, return the one that occurs first alphabetically.

I have the following list:

names = ['sam','sam','leo','leo','john','jane','jane']

For this list it should return jane: it is tied with two other names, but it's the first one alphabetically.

I have the following code in Python.

def get_count(lst):
    lst.sort()
    d = {}
    for item in lst:
        if item not in d:
            d[item] = [1]
        else:
            d[item].append(1)
    def get_count_child(d):
        fd = {}
        for key, value in d.items():
            fd[key] = sum(value)
        return fd
    return get_count_child(d)

It outputs

{'jane': 2, 'john': 1, 'leo': 2, 'sam': 2}

Is there a way to extract jane from this dictionary under the constraints that I mentioned above?

Convert the dictionary to a sequence of `(key, value)` tuples and sort it with an appropriate key function. — Michael Butscher, Mar 14 '20 at 05:36

score 4 · Answer 1 · answered Mar 14 '20 at 05:34

Say, d is your dictionary. You want to sort its items in the order of decreasing values (counts) but increasing keys (names). The first sorted item in the list is the one that you want:

wanted = sorted(d.items(), key=lambda x: (-x[1], x[0]))[0]
# ('jane', 2)
wanted[0]
# 'jane'

Note the negation in the lambda function: it ensures that the smaller counts look "bigger" and are placed closer to the end.
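To make the effect of the key function visible, here is a minimal sketch that prints the full ranking for the counts from the question. The helper name `sort_key` is made up for illustration, and using `min` instead of a full sort is an optional variation, not part of the answer above.

```python
# Counts from the question; sort_key is a hypothetical helper name.
d = {'jane': 2, 'john': 1, 'leo': 2, 'sam': 2}

def sort_key(item):
    name, count = item
    # Negated count: higher counts sort first; the name breaks ties alphabetically.
    return (-count, name)

ranked = sorted(d.items(), key=sort_key)
print(ranked)
# [('jane', 2), ('leo', 2), ('sam', 2), ('john', 1)]

# min() with the same key finds the winner without sorting everything.
best_name, best_count = min(d.items(), key=sort_key)
print(best_name)  # jane
```

Tuples compare element by element, so the name is only consulted when two counts are equal.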

DYZ


Additionally, [`Counter`](https://docs.python.org/3/library/collections.html#counter-objects) can be used to build this dictionary `d` from `names`. – AKS Mar 14 '20 at 05:42
@AKS Sure. But the proposed solution uses only standard library functions. – DYZ Mar 14 '20 at 05:47
@DYZ `Counter` is in the standard library... or do you mean built-ins? – wjandrea Mar 14 '20 at 05:51
@wjandrea Sorry, I mean built-ins. – DYZ Mar 14 '20 at 05:53

zhanymkanov · Answer 2 · 2020-03-14T05:51:28.937

If you are using Python 3.7+, you can just sort the names first: dict preserves insertion order, and Counter.most_common orders equal counts by first occurrence.

from collections import Counter
names = sorted(['sam','sam','leo','leo','john','jane','jane'])
names_count = Counter(names)
names_count.most_common(1)
# [('jane', 2)]

Otherwise, to guarantee the order regardless of the Python version, you can do the following:

def get_names_count(lst):
    names_count = {}
    for item in sorted(lst):
        names_count[item] = names_count.get(item, 0) + 1

    return names_count

def get_most_common_name(names_count):
    # Sort (name, count) pairs: highest count first, then name alphabetically
    most_common = sorted(names_count.items(), key=lambda x: (-x[1], x[0]))
    return most_common[0][0]

Note that I replaced lst.sort() with sorted(lst), since it is bad practice to modify the caller's objects (Python passes a reference to the list, not a copy, so lst.sort() would reorder the original list).
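A small sketch of the difference, in case it helps. The function names `first_in_place` and `first_with_copy` are made up for this illustration and are not part of the code above.

```python
def first_in_place(lst):
    lst.sort()             # reorders the caller's list as a side effect
    return lst[0]

def first_with_copy(lst):
    ordered = sorted(lst)  # new sorted list; the argument is untouched
    return ordered[0]

names = ['sam', 'leo', 'jane']

first_with_copy(names)
after_sorted = list(names)
print(after_sorted)  # ['sam', 'leo', 'jane']

first_in_place(names)
after_sort = list(names)
print(after_sort)    # ['jane', 'leo', 'sam']
```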

Also, there is no need to store the counts in lists: you can count the names directly, using a default value of 0 for each new name.

P.S. By the time I posted this, DYZ had already answered the question, so my code is just a refactoring of your get_count.

score 1 · Answer 3 · 2020-03-15T18:35:13.053

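Change the return value of your "get_count_child" function to next(iter(sorted(fd, key=lambda k: (-fd[k], k)))). Sorting by negated count and then by name puts the most repeated name first, with ties broken alphabetically.

So it should look like the following:

def get_count(lst):
    lst.sort()
    d = {}
    for item in lst:
        if item not in d:
            d[item] = [1]
        else:
            d[item].append(1)
    def get_count_child(d):
        fd = {}
        for key, value in d.items():
            fd[key] = sum(value)
        # Highest count first, then alphabetical order; return the name
        return next(iter(sorted(fd, key=lambda k: (-fd[k], k))))
    return get_count_child(d)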

wjandrea · Answer 4 · 2020-03-14T06:01:07.997

Python 3.7+ or CPython 3.6: `Counter.most_common`

Use collections.Counter to do the counting on the sorted list, then use its most_common method to get the top item. Ties are broken by first occurrence, which is why the list needs to be sorted.

from collections import Counter

c = Counter(sorted(names))
print(c.most_common(1))  # -> [('jane', 2)]
print(c.most_common(1)[0][0])  # -> jane

This is version-dependent because it relies on the underlying dict to preserve insertion order. See Are dictionaries ordered in Python 3.6+?

If you're using an earlier version, you can still use Counter, but use DYZ's solution to do the sorting.
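A minimal sketch of that combination, assuming the same `names` list as in the question. The function name `most_repeated_name` is hypothetical. Because the tie-break happens in the sort key, the result does not depend on dict ordering or on sorting the input first.

```python
from collections import Counter

def most_repeated_name(names):
    """Hypothetical helper: most frequent name, ties broken alphabetically."""
    counts = Counter(names)  # input order does not matter here
    # Highest count first (negated), then name in ascending order.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return ranked[0][0]

names = ['sam', 'sam', 'leo', 'leo', 'john', 'jane', 'jane']
print(most_repeated_name(names))  # jane
```

Note that this sketch raises IndexError on an empty list; handle that case separately if it can occur.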

score 0 · Answer 5 · answered Mar 21 '20 at 21:45

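Here is another way to get the same result using the statistics module. This relies on Python 3.8+, where mode returns the first mode encountered, so sorting first breaks ties alphabetically; earlier versions raise StatisticsError when there is a tie.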

from statistics import mode

def get_count(lst):
    lst.sort()
    return mode(lst)
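A possible variation, offered only as a sketch: `statistics.multimode` (available in Python 3.8+) returns every tied value, so taking the `min` of that list picks the alphabetically first one without sorting or mutating the input. The function name `first_mode_alphabetically` is made up for this example.

```python
from statistics import multimode

def first_mode_alphabetically(names):
    # multimode returns all most-frequent values; min picks the
    # alphabetically first among them. Requires Python 3.8+.
    return min(multimode(names))

names = ['sam', 'sam', 'leo', 'leo', 'john', 'jane', 'jane']
print(first_mode_alphabetically(names))  # jane
```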

Joshua Hall
