Python: Get most frequent item in list

Question

Given a list of tuples, I'm looking to get the most frequently occurring tuple BUT if there are "joint winners" it should pick between them at random.

tups = [ (1,2), (3,4), (5,6), (1,2), (3,4) ]

The above list should return either (1,2) or (3,4) at random for the above list

simonzack · Answer 1 · 2013-09-16T12:52:36.973

14

Use collections.Counter:

>>> collections.Counter([ (1,2), (3,4), (5,6), (1,2), (3,4) ]).most_common()[0]
((1, 2), 2)

This is O(n log(n)).

edited Sep 16 '13 at 12:52

answered Sep 16 '13 at 12:25

simonzack

19,729
13
73
118

actually if you need most frequent item, it's possible to do in O(n). You just count frequencies, get max and then get all items with highest frequencies. at least 4 people don't know that :) – Roman Pekar Sep 16 '13 at 12:47
It's not possible to do this in O(n), counting all frequencies by itself requires some kind of hash table. – simonzack Sep 16 '13 at 12:53
AFAIK Counter is implemented as a dictionary and setting element of dictionary is `O(1)` in average. In your answer `most_common()` method will do the sort of all elements so it would be most heavy part - O(n log(n)) – Roman Pekar Sep 16 '13 at 13:40

score 2 · Accepted Answer · edited Sep 16 '13 at 12:48

2

You can first use Counter to find the most repeated tuple. Then find the required tuples and finally randomize and get the first value.

from collections import Counter
import random

tups = [ (1,2), (3,4), (5,6), (1,2), (3,4) ]
lst = Counter(tups).most_common()
highest_count = max([i[1] for i in lst])
values = [i[0] for i in lst if i[1] == highest_count]
random.shuffle(values)
print values[0]

edited Sep 16 '13 at 12:48

Roman Pekar

107,110
28
195
197

answered Sep 16 '13 at 12:40

Sudip Kafle

4,286
5
36
49

@Juddling actually, you don't need most_common() - see my answer – Roman Pekar Sep 16 '13 at 12:50
@RomanPekar I see, why would I want to avoid using most_common()? – Juddling Sep 16 '13 at 12:52
If I'm not mistaken, most_common() will sort the whole list and you just need the maximum frequency – Roman Pekar Sep 16 '13 at 12:54

score 1 · Answer 3 · answered Sep 16 '13 at 12:24

You can first sort the list to get the tuples sorted by frequency. After that a linear scan can get you the most frequent tuple from the list. Overall time O(nlogn)

>>> tups = [ (1,2), (3,4), (5,6), (1,2), (3,4) ]
>>> 
>>> sorted(tups)
[(1, 2), (1, 2), (3, 4), (3, 4), (5, 6)]

score 1 · Answer 4 · answered Sep 16 '13 at 12:45

1

This one should do your task in o(n) time:

>>> from random import shuffle
>>> from collections import Counter
>>>
>>> tups = [(1,2), (3,4), (5,6), (1,2), (3,4)]
>>> c = Counter(tups)                            # count frequencies
>>> m = max(v for _, v in c.iteritems())         # get max frq
>>> r = [k for k, v in c.iteritems() if v == m]  # all items with highest frq
>>> shuffle(r)                                   # if you really need random - shuffle
>>> print r[0]
(3, 4)

answered Sep 16 '13 at 12:45

Roman Pekar

107,110
28
195
197

1

This is not `O(n)`, think of the case where all elements are unqiue. – simonzack Sep 16 '13 at 12:57
https://wiki.python.org/moin/TimeComplexity setting of element in dictionary is O(1) average – Roman Pekar Sep 16 '13 at 13:37

score 0 · Answer 5 · answered Sep 16 '13 at 12:49

Counting with collections.Counter and then random choice of the most common:

import collections
import random

lis = [ (1,2), (3,4), (5,6), (1,2), (3,4) ]  # Test data
cmn = collections.Counter(lis).most_common()  # Numbering based on occurrence
most = [e for e in cmn if (e[1] == cmn[0][1])]  # List of those most common
print(random.choice(most)[0])  # Print one of the most common at random

score 0 · Answer 6 · edited Oct 03 '20 at 15:30

Here is another example with no imports:

listAlphaLtrs = ['b','a','a','b','a','c','a','a','b','c','c','b','a','a','a']
dictFoundLtrs = {i:listAlphaLtrs.count(i) for i in listAlphaLtrs}
maxcnt = 0
theltr = 0
for ltr in dictFoundLtrs:
    ltrfound = ltr
    foundcnt = dictFoundLtrs[ltr]
    if foundcnt > maxcnt:
        maxcnt = foundcnt
        theltr = ltrfound
print('most: ' + theltr)

Sources:

https://stackoverflow.com/a/23240989/1447509

Python: Get most frequent item in list

6 Answers6

Related