I want to apply a custom formula while aggregating with groupby. This is my attempt:
import math as mt
# convert calc cols to float from object
cols = dfg_dom_smry.columns
cols = cols[2:]
for col in cols:
    df[col] = df[col].astype(float)
# groupby fields
index = ['PA']
#aggregate
df = dfdom.groupby(index).agg({'newcol1': (mt.sqrt(sum('savings'*'rp')**2))/sum('savings')})
I got an error: TypeError: can't multiply sequence by non-int of type 'str'
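The message appears to come from `'savings'*'rp'`: inside the dict these are plain string literals rather than columns, so Python tries to multiply two strings. Judging by the expected output further down, the intended calculation per PA is sqrt(sum((savings*rp)**2)) / sum(savings). A minimal sketch of that calculation, assuming the columns are already numeric; the helper name `combined_rp` and the three-row frame are made up for illustration and are not the real data:

```python
import numpy as np
import pandas as pd

# Hypothetical toy data (not the real extract), already numeric
toy = pd.DataFrame({
    "PA": ["X", "X", "Y"],
    "savings": [100.0, 300.0, 50.0],
    "rp": [0.3, 0.1, 0.2],
})

def combined_rp(group):
    # sqrt of the sum of squared (savings * rp), divided by total savings
    weighted = group["savings"] * group["rp"]
    return np.sqrt((weighted ** 2).sum()) / group["savings"].sum()

# Select the two value columns first so the helper only sees those columns
result = (
    toy.groupby("PA")[["savings", "rp"]]
       .apply(combined_rp)
       .rename("newcol1")
       .reset_index()
)
print(result)
```

`agg` with a dict works one column at a time, so a formula that needs two columns at once is usually easier to express with `apply` on the grouped frame, as sketched here.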
This is an extract of my data. The full data has many sets of savings and rp columns, so ideally I want to loop over each pair of savings and rp columns.
PA  domain        savings        rp
M   M-RET-COM     383,895.36     0.14
P   P-RET-AG      14,302,804.19  0.16
P   P-RET-COM     56,074,119.28  0.33
P   P-RET-IND     46,677,610.00  0.27
P   P-SBD/NC-AG   1,411,905.00   -
P   P-SBD/NC-COM  4,255,891.25   0.36
P   P-SBD/NC-IND  295,365.00     -
S   S-RET-AG      2,391,504.33   0.72
S   S-RET-COM     19,195,073.84  0.18
S   S-RET-IND     17,677,708.38  0.13
S   S-SBD/NC-COM  6,116,407.07   0.05
D   D-RET-COM     11,944,490.39  0.15
D   D-RET-IND     1,213,117.63   -
D   D-SBD/NC-COM  2,708,153.57   0.69
C   C-RET-AG
C   C-RET-COM
C   C-RET-IND
For the above data this would be the final result:
PA newcol1
M 0.143027374757981
P 0.18601700701305
S 0.0979541706738756
D 0.166192684106493
C
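Because the full data has several savings/rp pairs and the columns start out as text, one possible approach is to clean the strings first and then loop over the pairs. This is only a sketch under assumptions: the column names `savings2`/`rp2`, the helper names `to_number` and `combined_rp`, and the sample frame are all hypothetical. It also assumes that a "-" or blank rp adds nothing to the numerator while the row's savings still count in the denominator, which appears consistent with the expected values above but is not confirmed.

```python
import numpy as np
import pandas as pd

# Hypothetical raw frame: values arrive as text with thousands separators,
# "-" for a missing rp, and blanks for empty rows
raw = pd.DataFrame({
    "PA": ["X", "X", "Y", "Z"],
    "savings": ["1,000.00", "3,000.00", "500.00", ""],
    "rp": ["0.30", "-", "0.20", ""],
    "savings2": ["200.00", "200.00", "50.00", ""],
    "rp2": ["0.50", "0.50", "-", ""],
})

def to_number(col):
    # Strip thousands separators; "-" and blanks become NaN
    return pd.to_numeric(col.str.replace(",", "", regex=False), errors="coerce")

# Assumed naming scheme for the savings/rp pairs
pairs = [("savings", "rp"), ("savings2", "rp2")]

clean = raw.copy()
for sav, rp in pairs:
    clean[sav] = to_number(clean[sav])
    clean[rp] = to_number(clean[rp])

def combined_rp(group, sav, rp):
    # NaN products are skipped by sum(), so a missing rp contributes nothing
    # to the numerator while its savings still count in the denominator
    weighted = group[sav] * group[rp]
    total = group[sav].sum(min_count=1)  # NaN when the group has no savings
    return np.sqrt((weighted ** 2).sum()) / total

# One output column per pair, indexed by PA
out = pd.DataFrame({
    f"newcol{i}": clean.groupby("PA")[[sav, rp]].apply(combined_rp, sav=sav, rp=rp)
    for i, (sav, rp) in enumerate(pairs, start=1)
})
print(out)
```

Groups with no savings at all (like C in the extract) come out as NaN here, which corresponds to the blank in the expected result.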
Thanks for your help.