I have the following dataframe:
import pandas as pd
df = pd.DataFrame({'Value': [0, 1, 2,3, 4,5,6,7,8,9],'Name': ['John', 'John', 'John','John', 'John','John','John','John','John','John']
,'City': ['A', 'B', 'A','B', 'A','B','B','A','B','A'],'City2': ['C', 'D', 'C','D', 'C','D','D','C','D','C']})
df
Value Name City City2
0 0 John A C
1 1 John B D
2 2 John A C
3 3 John B D
4 4 John A C
5 5 John B D
6 6 John B D
7 7 John A C
8 8 John B D
9 9 John A C
I am trying to take the average of values when City2
equald 'C' but apply it to the whole new column:
I have tried:
df['C_Average'] = df[df['City2'] == 'C'].groupby(['Name','City'])['Value'].transform(lambda v: v.nsmallest(11).mean())
df
Value Name City City2 C_Average
0 0 John A C 4.4
1 1 John B D NaN
2 2 John A C 4.4
3 3 John B D NaN
4 4 John A C 4.4
5 5 John B D NaN
6 6 John B D NaN
7 7 John A C 4.4
8 8 John B D NaN
9 9 John A C 4.4
As you can see the new column is added, but I would like to apply it to the whole column and not just the rows where City2
equals C. IE the whole column to show 4.4. Any ideas?
Thanks!