pandas group by mean and add back into dataframe on another index

Question

So i have a dataframe from which I'm trying to compute the mean of some items before a certain date (depicted by '0'), I manage to groupby and get the mean and even add it back in to the original dataframe. However, it'll show 'NaN' for those rows where the date is '1'. Is there anyway to replace them according to the id?

I'll reproduce a simplified version of my code below:

date(before 2009 ==0; after is 1)
id date profit 
1   1    10
1   0    15
1   0    20
2   0    25
2   0    30
2   1    35

#so first i filter (let's call this filtered dataframe - df_filter) to only show those where date ==0 then i get the mean of it and add a new column called 'affect' to original dataframe (df) by:

df['affect'] = df_filter.groupby('id')['profit'].transform('mean')

which gives me 

id date profit affect
1   1    10      NaN
1   0    15      17.5
1   0    20      17.5
2   0    25      27.5
2   0    30      27.5
2   1    35      NaN

the NaN i presume is due to my filtering (since i need to get the average of the profit only before those dates) hence when it's added back it turns up NaN.

Is there a way to match so the average appears even if the (date==1)? i.e. ideally it should show the average (of date ==0) for each id regardless of date.

id date profit affect
1   1    10      17.5
1   0    15      17.5
1   0    20      17.5
2   0    25      27.5
2   0    30      27.5
2   1    35      27.5

score 0 · Accepted Answer · answered Oct 15 '21 at 09:59

0

You can use:

df['affect'] = df['affect'].bfill().ffill()

answered Oct 15 '21 at 09:59

Muhammad Hassan

4,079
1
13
27

pandas group by mean and add back into dataframe on another index

1 Answers1