3

I'm trying to get the most repeated name in a list; if there is a tie, return the one that occurs first alphabetically.

I have the following list:

names = ['sam','sam','leo','leo','john','jane','jane']

For this list it should return jane: it is tied with two other names (sam and leo), but it is the first one alphabetically.

I have the following code in Python:

def get_count(lst):
    lst.sort()
    d = {}
    for item in lst:
        if item not in d:
            d[item] = [1]
        else:
            d[item].append(1)
    def get_count_child(d):
        fd = {}
        for key, value in d.items():
            fd[key] = sum(value)
        return fd
    return get_count_child(d)

It outputs

{'jane': 2, 'john': 1, 'leo': 2, 'sam': 2}

Is there a way to extract the value from jane with the constraints that I mentioned above?

wjandrea
gmwill934

5 Answers

4

Say, d is your dictionary. You want to sort its items in the order of decreasing values (counts) but increasing keys (names). The first sorted item in the list is the one that you want:

wanted = sorted(d.items(), key=lambda x: (-x[1], x[0]))[0]
# ('jane', 2)
wanted[0]
# 'jane'

Note the negation in the lambda function: negating the counts makes larger counts sort first, while names with equal counts are still ordered alphabetically.
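To make the effect of that key visible, here is a small sketch that prints the full ordering for the counts from the question. The variable names `ranked` and `winner` are only illustrative, and the `min()` variant is an alternative I am assuming is acceptable when only the top item is needed.

```python
# Counts taken from the question's output.
d = {'jane': 2, 'john': 1, 'leo': 2, 'sam': 2}

# Each item is compared by the tuple (-count, name): higher counts come first,
# and names with equal counts fall back to alphabetical order.
ranked = sorted(d.items(), key=lambda x: (-x[1], x[0]))
print(ranked)
# [('jane', 2), ('leo', 2), ('sam', 2), ('john', 1)]

# If only the winner is needed, min() with the same key avoids a full sort.
winner = min(d.items(), key=lambda x: (-x[1], x[0]))[0]
print(winner)
# jane
```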

DYZ
3

If you are using Python 3.7+, you can just sort the names first: dict preserves insertion order, and Counter.most_common returns elements with equal counts in the order they were first encountered.

from collections import Counter
names = sorted(['sam','sam','leo','leo','john','jane','jane'])
names_count = Counter(names)
names_count.most_common(1)

Otherwise, to guarantee the order regardless of the Python version, you can do the following:

def get_names_count(lst):
    names_count = {}
    for item in sorted(lst):
        names_count[item] = names_count.get(item, 0) + 1

    return names_count

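def get_most_common_name(names_count):
    # Sort (name, count) pairs: highest count first, then alphabetically by name.
    most_common = sorted(names_count.items(), key=lambda x: (-x[1], x[0]))
    # Return only the name of the top pair.
    return most_common[0][0]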

Note that I replaced lst.sort() with sorted(lst), since it is bad practice to mutate an argument in place: Python passes a reference to the list, not a copy, so lst.sort() would reorder the caller's list.
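A quick way to see the difference is the sketch below. The two helper functions are hypothetical and exist only to contrast the two calls; they are not part of the question's code.

```python
def first_sorted_in_place(lst):
    lst.sort()               # reorders the caller's list object itself
    return lst[0]

def first_sorted_copy(lst):
    ordered = sorted(lst)    # builds a new list; the argument is untouched
    return ordered[0]

names = ['sam', 'leo', 'jane']

first_sorted_copy(names)
after_copy = list(names)
print(after_copy)        # ['sam', 'leo', 'jane']

first_sorted_in_place(names)
after_in_place = list(names)
print(after_in_place)    # ['jane', 'leo', 'sam']
```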

Also, there is no need to store the counts in lists; you can count the names directly, using dict.get with a default value of 0 for each name.

P.S. By the time I posted this, DYZ had already answered the question, so my code is just a refactoring of your get_count.

zhanymkanov
1

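Change the return value of your "get_count_child" function to min(fd, key=lambda name: (-fd[name], name)), which picks the name with the highest count and breaks ties alphabetically.

So it should look like the following:

def get_count(lst):
    lst.sort()
    d = {}
    for item in lst:
        if item not in d:
            d[item] = [1]
        else:
            d[item].append(1)
    def get_count_child(d):
        fd = {}
        for key, value in d.items():
            fd[key] = sum(value)
        # Highest count wins; equal counts fall back to alphabetical order.
        return min(fd, key=lambda name: (-fd[name], name))
    return get_count_child(d)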
0

Python 3.7+ or CPython 3.6: Counter.most_common

Use collections.Counter to do the counting on the sorted list, then use its most_common method to get the top item. Ties are broken by first occurrence, which is why the list needs to be sorted.

from collections import Counter

c = Counter(sorted(names))
print(c.most_common(1))  # -> [('jane', 2)]
print(c.most_common(1)[0][0])  # -> jane

This is version-dependent because it relies on the underlying dict preserving insertion order. See Are dictionaries ordered in Python 3.6+?

If you're using an earlier version, you can still use Counter, but use DYZ's solution to do the sorting.
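A minimal sketch of that combination, assuming the explicit (-count, name) key from DYZ's answer, could look like this. The function name `most_common_name` is made up for illustration, and I use min() instead of a full sort since only the top item is needed.

```python
from collections import Counter

def most_common_name(names):
    """Return the most frequent name, breaking ties alphabetically."""
    counts = Counter(names)  # the input does not need to be sorted here
    # Explicit key: highest count first, then alphabetical name.
    # This does not rely on dict insertion order.
    name, _count = min(counts.items(), key=lambda item: (-item[1], item[0]))
    return name

print(most_common_name(['sam', 'sam', 'leo', 'leo', 'john', 'jane', 'jane']))
# jane
```

Note that min() raises ValueError on an empty list, so callers may want to handle that case separately.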

wjandrea
0

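Here is another way to get the same result using the statistics module. It requires Python 3.8+, where mode returns the first mode encountered when several values are tied (hence the sort); earlier versions raise StatisticsError in that case.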

from statistics import mode

def get_count(lst):
    lst.sort()
    return mode(lst)
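If sorting the caller's list in place is undesirable, a variant along the same lines is sketched below. It assumes Python 3.8+ for statistics.multimode, and the function name `first_mode_alphabetically` is only illustrative.

```python
from statistics import multimode

def first_mode_alphabetically(names):
    # multimode() returns every value tied for the highest count;
    # min() then picks the alphabetically first of those.
    return min(multimode(names))

print(first_mode_alphabetically(['sam', 'sam', 'leo', 'leo', 'john', 'jane', 'jane']))
# jane
```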
Joshua Hall