How do I summarizes a column based on a dynamic value in the existing row? Using the example below, I'd like to iterate over each row (x
), calculate the sum of all Clicks
where Date == x.Date_Yesterday
, and add the total as a new column.
Input data:
df = pd.DataFrame({
'Date': ['2021-09-14','2021-09-14','2021-09-14','2021-09-13','2021-09-12','2021-09-12','2021-09-11'],
'Date_Yesterday': ['2021-09-13','2021-09-13','2021-09-13','2021-09-12','2021-09-11','2021-09-11','2021-09-10'],
'Clicks': [100,100,100,50,10,10,1]
})
Date Date_Yesterday Clicks
0 2021-09-14 2021-09-13 100
1 2021-09-14 2021-09-13 100
2 2021-09-14 2021-09-13 100
3 2021-09-13 2021-09-12 50
4 2021-09-12 2021-09-11 10
5 2021-09-12 2021-09-11 10
6 2021-09-11 2021-09-10 1
Desired Output data:
Date Date_Yesterday Clicks Total_Clicks_Yesterday
2021-09-14 2021-09-13 100 50
2021-09-14 2021-09-13 100 50
2021-09-14 2021-09-13 100 50
2021-09-13 2021-09-12 50 20
2021-09-12 2021-09-11 10 1
2021-09-12 2021-09-11 10 1
2021-09-11 2021-09-10 1 N/A
Calculating the Total_Clicks_Yesterday
is simple with a static value:
clicks_yesterday = df['Total_Clicks_Yesterday'] = df.loc[df['Date'] == '2021-09-13', 'Clicks'].sum()
print(clicks_yesterday)
Date Date_Yesterday Clicks Total_Clicks_Yesterday
0 2021-09-14 2021-09-13 100 50
1 2021-09-14 2021-09-13 100 50
2 2021-09-14 2021-09-13 100 50
3 2021-09-13 2021-09-12 50 50
4 2021-09-12 2021-09-11 10 50
5 2021-09-12 2021-09-11 10 50
6 2021-09-11 2021-09-10 1 50
but I'm not sure how to make it dynamic for each line item?