I am new to pandas and NumPy. I want to calculate a weighted mean within each group. The following code, which I found on the internet, works well for me:
import pandas as pd
import numpy as np
df = pd.DataFrame({'id':[0]*3+[1]*3,'se':[1]*2+[2]*2+[3]*2,'y':np.random.randn(6),'x':np.random.randn(6)})
def wavg(group, avg_name, weight_name):
    # Column of values to average
    d = group[avg_name]
    # Column of weights
    w = group[weight_name]
    try:
        return (d * w).sum() / w.sum()
    except ZeroDivisionError:
        return np.nan
cc=df.groupby(['id','se']).apply(wavg, 'x','y').reset_index().rename(columns={0: 'retx'})
However, I am confused about how to write a function like 'wavg' for use with apply. What is 'group' inside the wavg function? Can anyone explain it in detail?
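To make the question concrete, here is a minimal sketch of what I think is happening. It assumes that iterating over the groupby object yields (key, sub-DataFrame) pairs and that apply passes each of those sub-DataFrames to the function as 'group'. The fixed numbers, the simplified wavg without the try/except, and the names 'pieces', 'manual' and 'via_apply' are only for illustration and are not part of the code above.

```python
import pandas as pd

# Fixed values instead of random ones, so results can be checked by hand.
df = pd.DataFrame({
    'id': [0, 0, 0, 1, 1, 1],
    'se': [1, 1, 2, 2, 3, 3],
    'x': [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    'y': [1.0, 3.0, 1.0, 1.0, 2.0, 2.0],
})

def wavg(group, avg_name, weight_name):
    # 'group' is assumed to be a DataFrame holding the rows of one (id, se) pair.
    d = group[avg_name]
    w = group[weight_name]
    return (d * w).sum() / w.sum()

# Iterating over the groupby object gives (key, sub-DataFrame) pairs.
pieces = {key: group for key, group in df.groupby(['id', 'se'])}

# Calling wavg by hand on each sub-DataFrame.
manual = {key: wavg(group, 'x', 'y') for key, group in pieces.items()}

# The same computation through apply; extra positional arguments are
# forwarded to wavg after the group itself.
via_apply = df.groupby(['id', 'se'])[['x', 'y']].apply(wavg, 'x', 'y')

print(pieces[(0, 1)])   # the two rows with id == 0 and se == 1
print(manual[(0, 1)])   # (1*1 + 2*3) / (1 + 3) = 1.75
print(via_apply)
```

If this picture is right, the manual loop and the apply call should give the same number for every (id, se) pair.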
Thanks in advance.