I have a 2D list:
arr = [['Mohit', 'shini', 'Manoj', 'Mot'],
       ['Mohit', 'shini', 'Manoj'],
       ['Mohit', 'Vis', 'Nusrath']]
I want to find the most frequent element in the 2D list. In the example above, the most common string is 'Mohit'.
I know I can do this by brute force with two for loops and a dictionary (see the sketch below), but is there a more efficient way using numpy or any other library?
The nested lists could be of different lengths
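For context, the brute-force version I have in mind looks roughly like this (a minimal sketch; the variable names are my own):

counts = {}
for sublist in arr:
    for item in sublist:
        # Count every occurrence in a plain dict
        counts[item] = counts.get(item, 0) + 1

# Key with the highest count
most_frequent = max(counts, key=counts.get)
print(most_frequent)  # 'Mohit'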
Could you also include the timing of your method, so I can find the fastest one? Please also mention any caveats where it might not be very efficient.
Edit
These are the timings of different methods on my system:
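(The snippets below assume these imports, since they use collections.Counter and chain.from_iterable:)

import collections
from itertools import chain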
#timegb
%%timeit
collections.Counter(chain.from_iterable(arr)).most_common(1)[0][0]
5.91 µs ± 115 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
#Kevin Fang and Curious Mind
%%timeit
flat_list = [item for sublist in arr for item in sublist]
collections.Counter(flat_list).most_common(1)[0]
6.42 µs ± 501 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
%%timeit
c = collections.Counter(item for sublist in arr for item in sublist).most_common(1)
c[0][0]
6.79 µs ± 449 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
#Mayank Porwal
def most_common(lst):
    return max(set(lst), key=lst.count)
%%timeit
ls = list(chain.from_iterable(arr))
most_common(ls)
2.33 µs ± 42.8 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
#U9-Forward
%%timeit
l = [x for i in arr for x in i]
max(l, key=l.count)
2.6 µs ± 68.8 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
Mayank Porwal's method runs the fastest on my system.
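Note that these timings are for a very small input, so I'm not sure the ranking holds as the lists grow. Here is a minimal sketch of how I would check that on a larger ragged list (the data generation, sizes, and function names are arbitrary; results will vary by machine):

import random
import string
import timeit
import collections
from itertools import chain

# Build a much larger ragged 2D list of random short strings (sizes chosen arbitrarily).
big_arr = [[''.join(random.choices(string.ascii_lowercase, k=3))
            for _ in range(random.randint(1, 50))]
           for _ in range(2000)]

def with_counter(a):
    # Counter-based approach: single pass over all items
    return collections.Counter(chain.from_iterable(a)).most_common(1)[0][0]

def with_list_count(a):
    # max(..., key=lst.count) rescans the list once per unique element
    ls = list(chain.from_iterable(a))
    return max(set(ls), key=ls.count)

print(timeit.timeit(lambda: with_counter(big_arr), number=10))
print(timeit.timeit(lambda: with_list_count(big_arr), number=10))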