I have this dataframe in pandas:
   day customer  amount
0    1    cust1     500
1    2    cust2     100
2    1    cust1      50
3    2    cust1     100
4    2    cust2     250
5    6    cust1      20
For convenience:
import pandas as pd

df = pd.DataFrame({'day': [1, 2, 1, 2, 2, 6],
                   'customer': ['cust1', 'cust2', 'cust1', 'cust1', 'cust2', 'cust1'],
                   'amount': [500, 100, 50, 100, 250, 20]})
I would like to create a new column 'amount2days' that aggregates amounts per customer over the last two days, to get the following dataframe:
   day customer  amount  amount2days
--------------------------------------
0    1    cust1     500          500  (no past transactions)
1    2    cust2     100          100  (no past transactions)
2    1    cust1      50          550  (500 + 50, rows 0,2)
3    2    cust1     100          650  (500 + 50 + 100, rows 0,2,3)
4    2    cust2     250          350  (100 + 250, rows 1,4)
5    6    cust1      20           20  (notice day is 6, and no day=5 for cust1)
In other words, I would like to perform the following (pseudo)code:
df['amount2days'] = df_of_past_2_days['amount'].sum()
for each row. What is the most convenient way to do so?
The sum I wish to perform is over the day, but the day does not necessarily increase with each new row, as shown in the example. I still want to sum amounts over the past 2 days.
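To pin down the semantics I am after, here is a brute-force sketch that reproduces the expected column. It rests on two assumptions read off the example: "the past 2 days" means the current day and the day before it, and only rows at or above the current row are counted. The helper name amount_last_two_days is made up for illustration. It loops row by row, so it is only a reference for the expected result, not the convenient solution I am looking for:

```python
import pandas as pd

df = pd.DataFrame({'day': [1, 2, 1, 2, 2, 6],
                   'customer': ['cust1', 'cust2', 'cust1', 'cust1', 'cust2', 'cust1'],
                   'amount': [500, 100, 50, 100, 250, 20]})


def amount_last_two_days(frame, pos):
    """Hypothetical helper: sum 'amount' for the same customer over the
    current and previous day, counting only rows up to position `pos`."""
    row = frame.iloc[pos]
    seen = frame.iloc[:pos + 1]  # rows at or above the current one
    mask = (
        (seen['customer'] == row['customer'])
        & (seen['day'] >= row['day'] - 1)  # assumed window: day-1 .. day
        & (seen['day'] <= row['day'])
    )
    return seen.loc[mask, 'amount'].sum()


# Row-by-row loop: quadratic in the number of rows, fine only as a reference.
df['amount2days'] = [amount_last_two_days(df, i) for i in range(len(df))]
print(df)
```

On the sample data this gives 500, 100, 550, 650, 350, 20 for rows 0 through 5, matching the table above.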