What is the fastest way to compute the number of occurrences of elements within a Pandas series?

My fastest solution so far is .groupby(columnname).size(). Is there anything faster within Pandas? For example, I want a result like the following:

In [42]: df = DataFrame(['a', 'b', 'a'])

In [43]: df.groupby(0).size()
Out[43]: 
0
a    2
b    1
dtype: int64
MRocklin
  • Worrying about optimizations on this level seems like a waste of time, but you could try `value_counts`: it should have less overhead. – DSM Apr 27 '14 at 00:26
  • Possible duplicate of [what is the most efficient way of counting occurrences in pandas?](http://stackoverflow.com/questions/20076195/what-is-the-most-efficient-way-of-counting-occurrences-in-pandas) – Noah Apr 27 '14 at 22:12

1 Answer

The value_counts() method on a pandas Series does exactly this.

Call it on the column you want, e.g.:

df['column_i_want'].value_counts()
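
As a rough sketch of how the two approaches compare, the snippet below reuses the toy frame from the question and checks that `value_counts()` and `groupby(...).size()` agree. The names `by_group`, `by_counts`, and `big`, and the size of the timing data, are illustrative assumptions and not part of the original post; the timings will vary by machine and pandas version, so treat them as indicative only.

```python
import timeit

import pandas as pd

# Same toy data as in the question; the single column gets the default label 0.
df = pd.DataFrame(['a', 'b', 'a'])

by_group = df.groupby(0).size()      # index sorted by group key
by_counts = df[0].value_counts()     # sorted by count, largest first

# Both approaches yield the same value -> count mapping.
assert by_counts.to_dict() == by_group.to_dict()

# Hypothetical larger input for a rough timing comparison.
big = pd.DataFrame(['a', 'b', 'c'] * 100_000)
t_group = timeit.timeit(lambda: big.groupby(0).size(), number=20)
t_counts = timeit.timeit(lambda: big[0].value_counts(), number=20)
print(f"groupby.size: {t_group:.3f}s  value_counts: {t_counts:.3f}s")
```

Note that the two results differ in ordering: `groupby(...).size()` sorts by the group key, while `value_counts()` sorts by count in descending order, so call `.sort_index()` on the latter if key order matters.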
cwharland