Is it possible to do a groupby transform with custom functions?
import pandas as pd

data = {
    'a': ['a1', 'a2', 'a3', 'a4', 'a5'],
    'b': ['b1', 'b1', 'b2', 'b2', 'b1'],
    'c': [55, 44.2, 33.3, -66.5, 0],
    'd': [10, 100, 1000, 10000, 100000],
}
df = pd.DataFrame.from_dict(data)
df['e'] = df.groupby(['b'])['c'].transform(sum)  # this works as expected
print(df)
#     a   b     c       d     e
# 0  a1  b1  55.0      10  99.2
# 1  a2  b1  44.2     100  99.2
# 2  a3  b2  33.3    1000 -33.2
# 3  a4  b2 -66.5   10000 -33.2
# 4  a5  b1   0.0  100000  99.2
def custom_calc(x, y):
    return x * y

# obviously wrong code here: custom_calc is called immediately, so
# .transform() receives its result instead of a function
df['e'] = df.groupby(['b'])['c'].transform(custom_calc(df['c'], df['d']))
As the example above shows, I want to know whether it is possible to pass a custom function into .transform().
I am aware that .apply() exists, but I want to find out whether it is possible to use .transform() exclusively.
More importantly, I want to understand how to write a function that can be passed into .transform() so that it is applied correctly.
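To make the shape of such a function concrete, here is a minimal sketch of my understanding: .transform() calls the function once per group, passing that group's 'c' values as a Series, and expects back something of the same length. The names demean and times_d are made up for illustration, and looking up 'd' through the group's index is only one possible workaround for a two-column calculation, not necessarily the recommended one.

```python
import pandas as pd

df = pd.DataFrame({
    'a': ['a1', 'a2', 'a3', 'a4', 'a5'],
    'b': ['b1', 'b1', 'b2', 'b2', 'b1'],
    'c': [55, 44.2, 33.3, -66.5, 0],
    'd': [10, 100, 1000, 10000, 100000],
})

def demean(group):
    # group is the Series of 'c' values for a single value of 'b';
    # the result has the same length and index as the group
    return group - group.mean()

# Pass the function object itself, without calling it
df['c_demeaned'] = df.groupby('b')['c'].transform(demean)

def times_d(group):
    # Hypothetical helper: only 'c' is passed in, so the matching
    # rows of 'd' are fetched from df through the group's index
    return group * df.loc[group.index, 'd']

df['e'] = df.groupby('b')['c'].transform(times_d)
print(df)
```

The key difference from my wrong attempt is that the function is handed over uncalled, and it takes a single argument (the group) rather than two columns.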
P.S. I already know that built-in functions such as 'count', sum, and 'sum' work.
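For comparison, this is a small sketch of a built-in referenced by its string name next to a custom callable that returns a single value per group. It assumes that a scalar result is repeated across every row of its group; value_range is a hypothetical name used only for this example.

```python
import pandas as pd

df = pd.DataFrame({
    'b': ['b1', 'b1', 'b2', 'b2', 'b1'],
    'c': [55, 44.2, 33.3, -66.5, 0],
})

# Built-in aggregation referenced by its string name
df['total'] = df.groupby('b')['c'].transform('sum')

def value_range(group):
    # Returns one scalar per group; transform repeats it
    # for each row belonging to that group
    return group.max() - group.min()

df['range'] = df.groupby('b')['c'].transform(value_range)
print(df)
```

As far as I can tell, the string form 'sum' is the safer spelling, since some newer pandas versions appear to warn when the Python built-in sum is passed directly.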