So i have a dataframe from which I'm trying to compute the mean of some items before a certain date (depicted by '0'), I manage to groupby and get the mean and even add it back in to the original dataframe. However, it'll show 'NaN' for those rows where the date is '1'. Is there anyway to replace them according to the id?
I'll reproduce a simplified version of my code below:
date(before 2009 ==0; after is 1)
id date profit
1 1 10
1 0 15
1 0 20
2 0 25
2 0 30
2 1 35
#so first i filter (let's call this filtered dataframe - df_filter) to only show those where date ==0 then i get the mean of it and add a new column called 'affect' to original dataframe (df) by:
df['affect'] = df_filter.groupby('id')['profit'].transform('mean')
which gives me
id date profit affect
1 1 10 NaN
1 0 15 17.5
1 0 20 17.5
2 0 25 27.5
2 0 30 27.5
2 1 35 NaN
the NaN i presume is due to my filtering (since i need to get the average of the profit only before those dates) hence when it's added back it turns up NaN.
Is there a way to match so the average appears even if the (date==1)? i.e. ideally it should show the average (of date ==0) for each id regardless of date.
id date profit affect
1 1 10 17.5
1 0 15 17.5
1 0 20 17.5
2 0 25 27.5
2 0 30 27.5
2 1 35 27.5