I have a simple dataframe which I would like to bin by taking the mean of every 3 rows.

It looks like this:

    col1
0      2
1      1
2      3
3      1
4      0

and I would like to turn it into this:

    col1
0      2
1    0.5

I have already posted a similar question here, but I have no idea how to port the solution to my current use case.

Can you help me out?

Many thanks!

Community
TheChymera

3 Answers


In Python 2 use:

>>> df.groupby(df.index / 3).mean()
   col1
0   2.0
1   0.5
TankorSmash
Roman Pekar
  • such a simple and elegant solution! – Constantino Oct 26 '15 at 14:43
  • I get 0.000000 2, 0.333333 1, 0.666667 3, 1.000000 1, 1.333333 0 with the latest Python and Pandas version. Probably has to do with integer division. *Edit*: Yes, Python 3 users, use `df.index // 3` – sougonde Feb 24 '16 at 19:49
  • Is there an equivalent way to do this if your dataframe has a datetime index, and you were insisting on doing every `n` rows? – Seth May 15 '20 at 16:56
  • @Seth: You could reset the index. Not sure if you want to use every nth row. If so, use modulo (%) instead. – Anne Nov 08 '22 at 19:18
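
As a rough sketch of the datetime-index case raised in the comments: instead of resetting the index, one option is to group on row positions built with `np.arange`. The datetime index and the name `positions` below are made up for illustration; only the `col1` values come from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical data: the question's values, but with a datetime index
# (the dates are an assumption, not taken from the original post).
df = pd.DataFrame(
    {"col1": [2, 1, 3, 1, 0]},
    index=pd.date_range("2020-01-01", periods=5, freq="D"),
)

n = 3
positions = np.arange(len(df))  # 0, 1, 2, 3, 4 regardless of the index type

# Consecutive blocks of n rows: positions // n gives keys 0, 0, 0, 1, 1
binned = df.groupby(positions // n).mean()

# Every n-th row pooled together: positions % n gives keys 0, 1, 2, 0, 1
strided = df.groupby(positions % n).mean()
```

Here `binned` averages rows in consecutive blocks of three, while `strided` (the modulo variant) averages rows 0 and 3, rows 1 and 4, and row 2 on its own.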

The answer from Roman Pekar did not work for me, presumably because of differences between Python 2 and Python 3 (in Python 3, `/` is always true division). This worked for me in Python 3:

>>> df.groupby(df.index // 3).mean()
   col1
0   2.0
1   0.5
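
For anyone who wants to reproduce this end to end, here is a minimal, self-contained sketch. It assumes the frame has the default integer index (0, 1, 2, ...), as in the question; with any other index the keys produced by `df.index // 3` would not line up with row positions.

```python
import pandas as pd

# The frame from the question, with the default 0..4 integer index
df = pd.DataFrame({"col1": [2, 1, 3, 1, 0]})

# Floor division maps index labels 0..4 to bin keys 0, 0, 0, 1, 1,
# so rows 0-2 form the first bin and rows 3-4 the (shorter) second bin.
result = df.groupby(df.index // 3).mean()
print(result)
```

The last bin simply averages whatever rows are left, which is why the second value is (1 + 0) / 2 = 0.5.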
ShadowUC
ojunk

For Python 2 (2.2+) users who have "true division" enabled (e.g. by using `from __future__ import division`), you need to use the `//` operator for "floor division":

df.groupby(df.index // 3).mean()
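
To see why the operator matters, the following sketch compares the group keys each one produces when true division is in effect (always the case in Python 3). The variable names are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({"col1": [2, 1, 3, 1, 0]})

true_div = df.index / 3    # float keys: 0.0, 0.333..., 0.666..., 1.0, 1.333...
floor_div = df.index // 3  # integer keys: 0, 0, 0, 1, 1

# With true division every row gets a distinct key, so nothing is averaged;
# with floor division the five rows collapse into two bins.
n_groups_true = df.groupby(true_div).ngroups
n_groups_floor = df.groupby(floor_div).ngroups
```

With true division the result has one group per row, which matches the unbinned output reported in the comments on the first answer.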
mohaseeb