6

I have this data frame as an example:

Col1       Col2       Col3       Col4
   1          2          3        2.2

I would like to add a new column called 'Gmean' that calculates the geometric mean of the first 3 columns in each row.

How can I get this done?

Thanks!

piRSquared
datascana

2 Answers

7

One way is to use SciPy's geometric mean function:

from scipy.stats.mstats import gmean

df['Gmean'] = gmean(df.iloc[:,:3],axis=1)

Another way is to apply the geometric mean formula directly:

import numpy as np

df['Gmean'] = np.power(df.iloc[:,:3].prod(axis=1),1.0/3)

If the data frame has exactly 3 columns, just use df instead of df.iloc[:,:3]. Also, if performance matters, you might want to work on the underlying array data with df.values or df.iloc[:,:3].values.
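As a rough sketch of the array-based route, the snippet below rebuilds the example frame from the question and computes the row-wise geometric mean on the underlying NumPy array. It uses `to_numpy()`, the newer equivalent of `.values`. The names `arr`, `gmean_prod`, and `gmean_log` are illustrative only. The log-based variant is not from the answer above; it is an equivalent formulation for strictly positive values, shown here only as a cross-check.

```python
import numpy as np
import pandas as pd

# Example frame from the question.
df = pd.DataFrame({"Col1": [1], "Col2": [2], "Col3": [3], "Col4": [2.2]})

# Underlying NumPy array of the first three columns.
arr = df.iloc[:, :3].to_numpy(dtype=float)

# Direct formula: n-th root of the row-wise product.
gmean_prod = arr.prod(axis=1) ** (1.0 / arr.shape[1])

# Log-based variant (assumes all values are > 0):
# exp of the row-wise mean of the logs.
gmean_log = np.exp(np.log(arr).mean(axis=1))

# Store the result as a new column.
df["Gmean"] = gmean_prod
```

For frames with many columns or large values, the row-wise product can overflow, which is one reason the log-based form may be preferable.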

Divakar
4
df.assign(Gmean=df.iloc[:, :3].prod(1) ** (1. / 3))

   Col1  Col2  Col3  Col4     Gmean
0     1     2     3   2.2  1.817121
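Note that `assign` returns a new DataFrame rather than modifying `df` in place. A minimal sketch, assuming the example frame from the question (the name `result` is illustrative):

```python
import pandas as pd

# Example frame from the question.
df = pd.DataFrame({"Col1": [1], "Col2": [2], "Col3": [3], "Col4": [2.2]})

# assign returns a new DataFrame; df itself is left unchanged.
result = df.assign(Gmean=df.iloc[:, :3].prod(axis=1) ** (1.0 / 3))

has_gmean_before = "Gmean" in df.columns   # original frame has no 'Gmean'
has_gmean_after = "Gmean" in result.columns  # new frame includes 'Gmean'

# Rebind the name to keep the new column.
df = result
```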
piRSquared