I saw this code in a YouTube video about statistics. I understand everything except the `k for k` portion.

def mode(*args):
    dict_values = {i: args.count(i) for i in args}
    max_list = [k for k, v in dict_values.items() if v == max(dict_values.values())]
    return max_list
  • http://stackoverflow.com/questions/6475314/python-for-in-loop-preceded-by-a-variable – jonrsharpe Aug 09 '21 at 20:20
  • `k for k, v in dict_values.items()` means get `k, v` values out of `dict_values.items()` and build a list out of the `k`s. – Samwise Aug 09 '21 at 20:20
  • The term you need to search for is `list comprehension`. – quamrana Aug 09 '21 at 20:21
  • Convert the list comprehension to a `for` loop and I think you'll get it. – Barmar Aug 09 '21 at 20:23
  • `for k, v in dict_values.items(): ...` – Barmar Aug 09 '21 at 20:24
  • The key to reading this is to realize `k for k` is not a clause. To help you parse this in your brain, here are the clauses: `[ (k) ... (for k, v in dict_values.items()) ... (if v == max(dict_values.values())) ]`. So, "for all the keys and values in dict", "if the value is equal to the maximum value", "return the key". – Tim Roberts Aug 09 '21 at 20:27
  • Just so you know, this code is *terrible*. Using unnecessarily inefficient algorithms through sheer laziness – juanpa.arrivillaga Aug 09 '21 at 20:35

1 Answer

It's not really `k for k`. Writing the equivalent code without the list comprehension may help:

max_list = []

for k, v in dict_values.items():
    if v == max(dict_values.values()):
        max_list.append(k)

So, we create a list, iterate over the key/value pairs in the dict, and store the key if its value equals the maximum value in the dict.
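To see that the two forms really do the same thing, here is a small sketch that runs both on a made-up dict. The sample data and the names `from_comprehension` and `from_loop` are only for illustration, they are not from the video.

```python
# Hypothetical sample counts, chosen only for illustration.
dict_values = {"a": 2, "b": 3, "c": 3}

# Comprehension form: [output expression | for clause | if clause]
from_comprehension = [k for k, v in dict_values.items() if v == max(dict_values.values())]

# Loop form: same for clause, same if clause, append instead of an output expression.
from_loop = []
for k, v in dict_values.items():
    if v == max(dict_values.values()):
        from_loop.append(k)

print(from_comprehension)  # ['b', 'c']
print(from_loop)           # ['b', 'c']
```

The first `k` is the value that ends up in the list; the `k, v` after `for` are the loop variables that each `(key, value)` pair is unpacked into.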

Note that this code is very inefficient, since it recomputes the max in every iteration. You should compute it only once. The counting is also very inefficient, since it calls `args.count` once for every element, duplicates included, instead of once per unique value.

It is the worst implementation of this I can imagine.
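Applying just those two fixes to the posted function could look like the sketch below. The name `mode_fixed` and the empty-input guard are my own additions, not part of the original code.

```python
def mode_fixed(*args):
    # dict.fromkeys(args) keeps each distinct value once, in first-seen order,
    # so args.count runs once per unique value instead of once per element.
    dict_values = {i: args.count(i) for i in dict.fromkeys(args)}
    if not dict_values:
        # Assumption: return an empty list when called without arguments.
        return []
    # Compute the maximum a single time, outside the comprehension.
    highest = max(dict_values.values())
    return [k for k, v in dict_values.items() if v == highest]


print(mode_fixed(1, 2, 2, 3, 3))  # [2, 3]
```

This still calls `args.count` once per unique value, so it is quadratic when most values are distinct.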

How about this:

from collections import Counter

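def mode(*args):
    # Count every distinct value in a single pass.
    counter = Counter(args)
    # most_common(1) returns [(element, count)] for the top entry.
    # It is an empty list if no arguments were passed, so [0] raises IndexError then.
    _, max_count = counter.most_common(1)[0]
    max_list = []
    # most_common() yields entries sorted by descending count,
    # so we can stop at the first count below the maximum.
    for elem, count in counter.most_common():
        if count < max_count:
            break
        max_list.append(elem)
    return max_list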
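As an aside, if you are on Python 3.8 or newer, the standard library's `statistics.multimode` returns all most-frequent values as a list, so you may not need to write this yourself. A minimal sketch, where the wrapper name `mode_stdlib` is hypothetical:

```python
from statistics import multimode


def mode_stdlib(*args):
    # multimode returns every value tied for the highest count,
    # in the order first encountered; an empty input gives [].
    return multimode(args)


print(mode_stdlib(1, 2, 2, 3, 3))  # [2, 3]
```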

Drawing 2000 random ASCII letters and comparing the `Counter`-based `mode` above with the code you posted (renamed to `mode_horrible`):

In [5]: elements = random.choices(string.ascii_letters, k=2000)

In [6]: %timeit mode(*elements)
67.8 µs ± 103 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

In [7]: %timeit mode_horrible(*elements)
43.4 ms ± 96.6 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

So the code you posted is roughly 640x slower here, and the gap only widens with more elements.

MaxNoe