From what I've read in various Stack Overflow answers and other resources, when you pass a UDF to .transform(),
each column is passed to it one at a time, for each group.
But when I tried it myself, I saw a DataFrame being passed to the UDF as well:
import pandas as pd

df = pd.DataFrame({'State': ['Texas', 'Texas', 'Florida', 'Florida'],
                   'a': [4, 5, 1, 3], 'b': [6, 10, 3, 11]})

def inspect(x):
    # Print the type of whatever transform() hands to the UDF
    print(type(x))

df.groupby('State').transform(inspect)
# Output
# <class 'pandas.core.series.Series'>
# <class 'pandas.core.series.Series'>
# <class 'pandas.core.frame.DataFrame'>
# <class 'pandas.core.series.Series'>
# <class 'pandas.core.series.Series'>
The DataFrame passed to inspect happens to be the DataFrame of the first group (State = Florida). But nobody mentions a DataFrame being passed to a UDF when using .transform().
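To make it easier to see which object arrives on each call, here is a slightly more detailed probe. It is only a sketch: the helper name `probe` and the `calls` list are my own, and returning `x` unchanged is an assumption made so that .transform() gets a valid result back. The exact sequence of calls may depend on what the UDF returns and on the pandas version, so I show no expected output.

```python
import pandas as pd

df = pd.DataFrame({
    "State": ["Texas", "Texas", "Florida", "Florida"],
    "a": [4, 5, 1, 3],
    "b": [6, 10, 3, 11],
})

calls = []  # hypothetical log: one (type name, label) tuple per UDF call


def probe(x):
    # A Series carries its column name in .name; a DataFrame may carry the
    # group key there, so fall back to None when no such attribute is set.
    label = getattr(x, "name", None)
    calls.append((type(x).__name__, label))
    # Assumption: return the input unchanged so transform() has a valid result.
    return x


result = df.groupby("State").transform(probe)

# Show what was passed on each call, in order.
for kind, label in calls:
    print(kind, label)
```

Each printed line gives the type of the object passed in, followed by the column name (for a Series) or the group key, if pandas set one (for a DataFrame).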
My questions are:
- Why is a DataFrame passed to the inspect function when everyone says a Series (each column) is passed to the UDF?
- Why was the DataFrame of the first group passed to inspect? Why wasn't the second group's DataFrame passed to inspect as well?