Is there a "freq" function in numpy/python?

Question

Suppose you have:

arr = np.array([1,2,1,3,3,4])

Is there a built-in function that returns the most frequent element?

Use `np.bincount` if all elements are integers. – nye17, Mar 31 '13 at 21:40

Raymond Hettinger · Accepted Answer · 2013-04-01T04:12:18.890

13

Yes, Python's collections.Counter has direct support for finding the most frequent elements:

>>> from collections import Counter

>>> Counter('abracadbra').most_common(2)
[('a', 4), ('b', 2)]

>>> Counter([1,2,1,3,3,4]).most_common(2)
[(1, 2), (3, 2)]
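
If only the single most frequent element is wanted rather than a list of pairs, the result of most_common can be unpacked. The sketch below uses only the standard library; statistics.multimode is not mentioned in the answer and assumes Python 3.8 or later. The names data, element, count and tied are illustrative.

```python
from collections import Counter
from statistics import multimode  # assumption: Python 3.8+

data = [1, 2, 1, 3, 3, 4]

# most_common(1) returns a list holding one (element, count) pair.
element, count = Counter(data).most_common(1)[0]

# multimode returns every value that shares the highest count,
# in the order each value was first seen.
tied = multimode(data)

print(element, count, tied)
```

Here 1 and 3 are tied, so most_common(1) reports only the first one encountered while multimode reports both.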

With numpy, you might want to start with the histogram() function or the bincount() function.
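
A minimal sketch of the bincount() route, assuming the array holds only non-negative integers (bincount rejects negative values and non-integer dtypes). The variable names are illustrative.

```python
import numpy as np

arr = np.array([1, 2, 1, 3, 3, 4])

# counts[k] is the number of times the integer k occurs in arr.
counts = np.bincount(arr)

# argmax returns the first index holding the largest count,
# so a tie resolves to the smallest tied value.
most_frequent = counts.argmax()

print(counts, most_frequent)
```

Because counts has one slot per integer up to arr.max(), this approach can use a lot of memory when the values are large and sparse.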

With scipy, you can search for the modal element with mstats.mode.

edited Apr 01 '13 at 04:12

answered Mar 31 '13 at 21:32


score 2 · Answer 2 · answered Mar 31 '13 at 21:48

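The pandas module might also help here. pandas is a data analysis package for Python that supports this directly.

import numpy as np
import pandas as pd

arr = np.array([1,2,1,3,3,4])
arr_df = pd.Series(arr)
value_counts = arr_df.value_counts()   # occurrences of each distinct value
highest_count = value_counts.max()     # the largest count, not the element
most_frequent = value_counts.idxmax()  # the element that has that count

This returns

> highest_count
2

Here 1 and 3 both occur twice, so most_frequent is one of those two values.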

score 0 · Answer 3 · answered Mar 31 '13 at 22:12

This will work for any type, integer or not, and the return is always a numpy array:

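import numpy as np

def most_common(a, n=1):
    if a.dtype.kind not in 'bui':
        # Other dtypes: map each element to its index in the sorted unique items.
        items, inverse = np.unique(a, return_inverse=True)
    else:
        # Boolean and integer dtypes: the values themselves index the bins.
        items, inverse = None, a
    counts = np.bincount(inverse)
    # Indices of the n largest counts, highest count first.
    idx = np.argsort(counts)[::-1][:n]
    return idx.astype(a.dtype) if items is None else items[idx]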

>>> a = np.fromiter('abracadabra', dtype='S1')
>>> most_common(a, 2)
array(['a', 'r'], 
      dtype='|S1')
>>> a = np.random.randint(10, size=100)
>>> a
array([0, 0, 0, 9, 3, 9, 1, 2, 6, 3, 0, 4, 3, 2, 4, 7, 2, 8, 8, 2, 9, 7, 0,
       3, 5, 2, 5, 0, 4, 2, 4, 7, 8, 5, 4, 0, 1, 6, 1, 0, 2, 0, 5, 1, 3, 8,
       8, 6, 3, 5, 4, 3, 3, 5, 0, 7, 3, 0, 2, 5, 4, 2, 4, 2, 8, 1, 4, 4, 7,
       4, 4, 3, 7, 4, 0, 1, 0, 8, 8, 1, 1, 2, 1, 4, 2, 5, 1, 0, 7, 2, 0, 0,
       0, 8, 9, 9, 8, 1, 3, 8])
>>> most_common(a, 5)
array([0, 4, 2, 8, 3])
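
For comparison, a shorter sketch of the same idea using np.unique with return_counts=True, which is not what this answer uses and assumes NumPy 1.9 or later. The helper name top_n is hypothetical, and the stable sort is a choice made here so that ties keep the sorted order of the unique values.

```python
import numpy as np

def top_n(values, n=1):
    """Return the n most frequent distinct values (hypothetical helper)."""
    uniques, counts = np.unique(values, return_counts=True)
    # Stable sort on negated counts: highest count first, and tied
    # values keep the sorted order that np.unique produced.
    order = np.argsort(-counts, kind="stable")
    return uniques[order[:n]]

words = np.array(list("abracadabra"))
print(top_n(words, 2))
print(top_n(np.array([1, 2, 1, 3, 3, 4]), 2))
```

Unlike the bincount branch above, this also handles negative integers, at the cost of sorting the input inside np.unique.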
