I have a dataframe that I am trying to group, which looks like this:
Cust_ID  Store_ID  month  lst_buy_dt1  purchase_amt
1        20        10     2015-10-07   100
1        20        10     2015-10-09   200
1        20        10     2015-10-20   100
I need the maximum of lst_buy_dt1 and the total purchase amount for each Cust_ID, Store_ID combination for each month, in a different dataframe. Sample output:
Cust_ID  Store_ID  month  max_lst_buy_dt  tot_purchase_amt
1        20        10     2015-10-20      400
My code is below.
aggregations = {
    'lst_buy_dt1': {  # Get the max purchase date across all purchases in a month
        'max_lst_buy_dt': 'max',
    },
    'purchase_amt': {  # Sum the purchases, call the result "tot_purchase"
        'tot_purchase': 'sum',
    }
}
grouped_at_Cust = metro_sales.groupby(['cust_id', 'store_id', 'month']).agg(aggregations).reset_index()
I am able to get the right aggregations. However, the resulting data frame has an additional level in its column index which I am not able to get rid of. I am unable to show it, but here is the result from
list(grouped_at_Cust.columns.values)
[('cust_id', ''),
('store_id', ''),
('month', ''),
('lst_buy_dt1', 'max_lst_buy_dt'),
('purchase_amt', 'tot_purchase')]
Notice the hierarchy in the last 2 columns. How do I get rid of it? I just need the columns max_lst_buy_dt and tot_purchase.
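For reference, here is a minimal sketch of two ways the flat column names might be obtained. It assumes pandas 0.25 or later (for named aggregation) and rebuilds the sample rows above with lower-cased column names to match the groupby call; flatten_columns is a hypothetical helper, not part of pandas.

```python
import pandas as pd

# Sample rows from the question (column names lower-cased: an assumption
# made so they match the groupby keys used in the code above).
metro_sales = pd.DataFrame({
    "cust_id": [1, 1, 1],
    "store_id": [20, 20, 20],
    "month": [10, 10, 10],
    "lst_buy_dt1": pd.to_datetime(["2015-10-07", "2015-10-09", "2015-10-20"]),
    "purchase_amt": [100, 200, 100],
})

# Option 1: named aggregation produces single-level column names directly.
grouped_at_cust = (
    metro_sales
    .groupby(["cust_id", "store_id", "month"])
    .agg(
        max_lst_buy_dt=("lst_buy_dt1", "max"),
        tot_purchase=("purchase_amt", "sum"),
    )
    .reset_index()
)


# Option 2: flatten an existing two-level column index.
# Hypothetical helper: keeps the inner label when it is non-empty,
# otherwise falls back to the outer label.
def flatten_columns(df):
    out = df.copy()
    out.columns = [inner if inner else outer for outer, inner in out.columns]
    return out


# Recreate the hierarchical columns listed in the question to try the helper.
hierarchical = grouped_at_cust.copy()
hierarchical.columns = pd.MultiIndex.from_tuples([
    ("cust_id", ""),
    ("store_id", ""),
    ("month", ""),
    ("lst_buy_dt1", "max_lst_buy_dt"),
    ("purchase_amt", "tot_purchase"),
])
flat = flatten_columns(hierarchical)
```

Option 1 avoids creating the hierarchy in the first place, while Option 2 may be useful when the grouped frame already exists with two-level columns.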