Consider the following DataFrame:
candy = pd.DataFrame({'Name':['Bob','Bob','Bob','Annie','Annie','Annie','Daniel','Daniel','Daniel'], 'Candy': ['Chocolate', 'Chocolate', 'Lollies','Chocolate', 'Chocolate', 'Lollies','Chocolate', 'Chocolate', 'Lollies'], 'Value':[15,15,10,25,30,12,40,40,16]})
After reading the following post, I understand that apply operates on the whole DataFrame of each group, while transform operates on one Series (column) at a time.
Apply vs transform on a group object
So if I want to append the total $ spend on candy per person, I can simply use the following:
candy['Total Spend'] = candy.groupby(['Name'])['Value'].transform('sum')
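To make the behaviour of that line concrete, here is a minimal sketch contrasting a plain grouped sum with transform on the same data. The variable names `per_person`, `total_spend` and `same_index` are only illustrative and not part of the original code.

```python
import pandas as pd

candy = pd.DataFrame({
    'Name': ['Bob', 'Bob', 'Bob', 'Annie', 'Annie', 'Annie', 'Daniel', 'Daniel', 'Daniel'],
    'Candy': ['Chocolate', 'Chocolate', 'Lollies'] * 3,
    'Value': [15, 15, 10, 25, 30, 12, 40, 40, 16],
})

# A plain grouped sum returns one row per group, indexed by Name.
per_person = candy.groupby('Name')['Value'].sum()

# transform returns one value per original row, on the original row index,
# so the result can be assigned straight back as a new column.
total_spend = candy.groupby('Name')['Value'].transform('sum')

# Illustrative check: the transformed Series shares the DataFrame's index.
same_index = total_spend.index.equals(candy.index)

candy['Total Spend'] = total_spend
```

Here `per_person` has three rows (one per name), while `total_spend` has nine, one for each row of `candy`.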
But if I need to append the total $ chocolate spend per person, it feels like I have no choice but to build a separate DataFrame with apply and then merge it back, since transform only works on a Series.
chocolate = candy.groupby(['Name']).apply(lambda x: x[x['Candy'] == 'Chocolate']['Value'].sum()).reset_index(name = 'Total_Chocolate_Spend')
candy = pd.merge(candy, chocolate, how = 'left',left_on=['Name'], right_on=['Name'])
While I don't mind writing the above code to solve this problem, is it possible to 'transform' the applied results back onto the DataFrame without creating a separate DataFrame and merging it?
What is actually happening when transform is used? Is a separate Series being stored in memory and then merged back on the index, similar to my apply-then-merge approach?
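For illustration, one possible merge-free route is to zero out the non-chocolate rows first and then run transform on the masked Series. This is only a sketch under the assumption that a row which is not chocolate should contribute 0 to the total. The names `chocolate_value` and `total_chocolate` are hypothetical helpers, not anything from the linked post.

```python
import pandas as pd

candy = pd.DataFrame({
    'Name': ['Bob', 'Bob', 'Bob', 'Annie', 'Annie', 'Annie', 'Daniel', 'Daniel', 'Daniel'],
    'Candy': ['Chocolate', 'Chocolate', 'Lollies'] * 3,
    'Value': [15, 15, 10, 25, 30, 12, 40, 40, 16],
})

# Keep Value where the row is chocolate; replace every other row with 0.
chocolate_value = candy['Value'].where(candy['Candy'] == 'Chocolate', 0)

# Group the masked Series by Name and broadcast each group's sum
# back to every row of that group.
total_chocolate = chocolate_value.groupby(candy['Name']).transform('sum')

# The result carries the same index as candy, so assignment aligns row by row.
candy['Total_Chocolate_Spend'] = total_chocolate
```

On this data the new column should agree with the apply-then-merge version, since both sum only the chocolate rows for each name. As far as the visible behaviour goes, transform hands back a new Series on the original index, and column assignment aligns on that index rather than performing a merge.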