Here is the input dataframe:
   id  val
0   A    1
1   B    2
2   A   -3
3   C    1
4   D    5
5   B    6
6   C   -2
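For reproducibility, the frame above can be built as follows. This is a minimal sketch that assumes pandas and a default integer index; the variable name `df` is only illustrative.

```python
import pandas as pd

# Input frame from the question: one id and one value per row.
df = pd.DataFrame({
    "id": ["A", "B", "A", "C", "D", "B", "C"],
    "val": [1, 2, -3, 1, 5, 6, -2],
})
print(df)
```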
I would like to group the entries by id and then compute a running sum over the most recent member of each group seen so far. Here is what the desired output looks like, with an explanation of how each value is obtained:
   id  val  out
0   A    1    1
1   B    2    3  (2 + 1)
2   A   -3   -1  (-3 + 2)
3   C    1    0  (1 + -3 + 2)
4   D    5    5  (5 + 1 + -3 + 2)
5   B    6    9  (6 + 5 + 1 + -3)
6   C   -2    6  (-2 + 6 + 5 + -3)
Here are some more detailed explanations:
1) The row with index 1 has 3 = 2 + 1 because at that point there are two groups, A and B, each with one row, so you take that single row from each group.
2) The row with index 2 has -1 = -3 + 2 because at that point there are two groups, A and B. The most recent row from A is row 2 (A, -3), and the single (and thus most recent) row from B is row 1 (B, 2), so you add these two values.
3) In the row with index 6, you add up the following rows:
2 A -3
4 D 5
5 B 6
6 C -2
You take one row from each group: the row that is the most recent for that group at that point.
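The rule described above can be sketched as follows. This is only one possible approach, assuming pandas: each row changes the total by the difference between its value and the previous value in the same group (or by its full value if it is the first row of that group), so a cumulative sum of those changes gives the desired column. The names `delta` and `out` are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "id": ["A", "B", "A", "C", "D", "B", "C"],
    "val": [1, 2, -3, 1, 5, 6, -2],
})

# Change contributed by each row: the difference from the previous value in
# the same group, or the value itself for the first row of a group
# (where diff() yields NaN).
delta = df.groupby("id")["val"].diff().fillna(df["val"])

# Accumulating the changes yields the sum of the latest value per group.
df["out"] = delta.cumsum().astype(int)
print(df)
```

For the sample data this gives 1, 3, -1, 0, 5, 9, 6 in the out column. A wide-format alternative, `df.pivot(columns="id", values="val").ffill().sum(axis=1)`, should give the same numbers but creates one column per distinct id, which may be costly when there are many groups.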