0

I have a dataframe as below. I want to combine 4 columns in order and have new df with average of it's values. Please see for detail.

a = np.random.randint(5, size=(2, 24))
df = pd.DataFrame(a,index=['alpha','bravo'])    

df:
        0   1   2   3   4   5   6   7   8   9   10  11  12  13  14  15  16  17  18  19  20  21  22  23
alpha   3   0   0   4   4   0   2   2   0   4   2   0   3   4   3   4   3   2   2   3   4   0   0   0
bravo   4   4   4   4   5   2   2   1   4   2   4   0   1   4   3   2   2   3   1   0   3   1   0   0   

ideal output is below

df_new:
        0   4   8   12  16  20 
alpha   3.5 2   3   3.5 2.5 1   
bravo   4   2.5 2.5 2.5 1.5 1   

So bottom line is that I am combining 4 columns each (in order) and giving average of its values. Is there a pythonic way to do this instead of writing '+' signs of every columns? My actual df is much larger, so I wish there is some other way. Thanks!

EJ Kang
  • 455
  • 2
  • 5
  • 17
  • By the way, column names is going to be renamed, so it does not have to be in order of 0,4,8,12,16,20 as long as they are combined in order. Thanks! – EJ Kang Jan 17 '18 at 08:50

1 Answers1

3

You need floor divide array created by np.arange with groupby and aggregate mean:

a = np.arange(0, len(df.columns)) // 4 * 4
print (a)
[ 0  0  0  0  4  4  4  4  8  8  8  8 12 12 12 12 16 16 16 16 20 20 20 20]

df = df.groupby(a, axis=1).mean()
print (df)
         0    4    8    12   16   20
alpha  1.75  2.0  1.5  3.5  2.5  1.0
bravo  4.00  2.5  2.5  2.5  1.5  1.0
jezrael
  • 822,522
  • 95
  • 1,334
  • 1,252