I have the following dataframe
I want to reformat it in the following way:
- Group by name/account/monthly periods
- Average (mean) balance to two decimal places
- Average (mean) for transactions to no decimal places
- Count of days where balance < 0
- Count of days where Balance > Max credit
So the I apply the following function to make a Series of all the aggregations, and use the Series index as labels for the new columns:
def f(x):
d = {}
d['Avg_Transactions'] = x['Transaction'].mean().round()
d['Avg_Balance'] = x['Balance'].mean().round(2)
d['Zero_Balance_days'] = (x['Balance'] < 0).count()
d['Over_Credit_days'] = (x['Balance'] > x['Max Credit']).count()
return pd.Series(d, index=['Avg_Transactions', 'Avg_Balance', 'Zero_Balance_days', 'Over_Credit_days'])
month = df.groupby(['Account','Name', 'Month']).apply(f)
Which gives the following output:
But I am getting the conditional counts wrong and instead counting all days. What would be the proper way to write these?