4

Given a list of tuples, I'm looking to get the most frequently occurring tuple BUT if there are "joint winners" it should pick between them at random.

tups = [ (1,2), (3,4), (5,6), (1,2), (3,4) ]

The above list should return either (1,2) or (3,4) at random for the above list

Juddling
  • 4,594
  • 8
  • 34
  • 40

6 Answers6

14

Use collections.Counter:

>>> collections.Counter([ (1,2), (3,4), (5,6), (1,2), (3,4) ]).most_common()[0]
((1, 2), 2)

This is O(n log(n)).

simonzack
  • 19,729
  • 13
  • 73
  • 118
  • actually if you need most frequent item, it's possible to do in O(n). You just count frequencies, get max and then get all items with highest frequencies. at least 4 people don't know that :) – Roman Pekar Sep 16 '13 at 12:47
  • It's not possible to do this in O(n), counting all frequencies by itself requires some kind of hash table. – simonzack Sep 16 '13 at 12:53
  • AFAIK Counter is implemented as a dictionary and setting element of dictionary is `O(1)` in average. In your answer `most_common()` method will do the sort of all elements so it would be most heavy part - O(n log(n)) – Roman Pekar Sep 16 '13 at 13:40
2

You can first use Counter to find the most repeated tuple. Then find the required tuples and finally randomize and get the first value.

from collections import Counter
import random

tups = [ (1,2), (3,4), (5,6), (1,2), (3,4) ]
lst = Counter(tups).most_common()
highest_count = max([i[1] for i in lst])
values = [i[0] for i in lst if i[1] == highest_count]
random.shuffle(values)
print values[0]
Roman Pekar
  • 107,110
  • 28
  • 195
  • 197
Sudip Kafle
  • 4,286
  • 5
  • 36
  • 49
1

You can first sort the list to get the tuples sorted by frequency. After that a linear scan can get you the most frequent tuple from the list. Overall time O(nlogn)

>>> tups = [ (1,2), (3,4), (5,6), (1,2), (3,4) ]
>>> 
>>> sorted(tups)
[(1, 2), (1, 2), (3, 4), (3, 4), (5, 6)]
Srikar Appalaraju
  • 71,928
  • 54
  • 216
  • 264
1

This one should do your task in o(n) time:

>>> from random import shuffle
>>> from collections import Counter
>>>
>>> tups = [(1,2), (3,4), (5,6), (1,2), (3,4)]
>>> c = Counter(tups)                            # count frequencies
>>> m = max(v for _, v in c.iteritems())         # get max frq
>>> r = [k for k, v in c.iteritems() if v == m]  # all items with highest frq
>>> shuffle(r)                                   # if you really need random - shuffle
>>> print r[0]
(3, 4)
Roman Pekar
  • 107,110
  • 28
  • 195
  • 197
0

Counting with collections.Counter and then random choice of the most common:

import collections
import random

lis = [ (1,2), (3,4), (5,6), (1,2), (3,4) ]  # Test data
cmn = collections.Counter(lis).most_common()  # Numbering based on occurrence
most = [e for e in cmn if (e[1] == cmn[0][1])]  # List of those most common
print(random.choice(most)[0])  # Print one of the most common at random
Morten Zilmer
  • 15,586
  • 3
  • 30
  • 49
0

Here is another example with no imports:

listAlphaLtrs = ['b','a','a','b','a','c','a','a','b','c','c','b','a','a','a']
dictFoundLtrs = {i:listAlphaLtrs.count(i) for i in listAlphaLtrs}
maxcnt = 0
theltr = 0
for ltr in dictFoundLtrs:
    ltrfound = ltr
    foundcnt = dictFoundLtrs[ltr]
    if foundcnt > maxcnt:
        maxcnt = foundcnt
        theltr = ltrfound
print('most: ' + theltr)

Sources:

https://stackoverflow.com/a/23240989/1447509

halfer
  • 19,824
  • 17
  • 99
  • 186
cssyphus
  • 37,875
  • 18
  • 96
  • 111