Pandas: DataFrame cumsum, reset if other column is False

Question

I have a DataFrame with two columns. The objective is simple: reset the cumulative sum (df.cumsum()) of value whenever condition is False (0) in a row.

df

      value      condition
0       1            1
1       2            1
2       3            1
3       4            0
4       5            1

The wanted result is as follows:

df

      value      condition
0       1            1
1       3            1
2       6            1
3       4            0
4       9            1

If I loop over the DataFrame as described in the post "Python pandas cumsum() reset after hitting max", I can achieve the wanted result, but I am looking for a more vectorized way using standard pandas functions.

score 6 · Answer 1 · answered Jul 10 '18 at 19:52

How about:

df['cSum'] = df.groupby((df.condition == 0).cumsum()).value.cumsum()

Output:

   value  condition  cSum
0      1          1     1
1      2          1     3
2      3          1     6
3      4          0     4
4      5          1     9

(df.condition == 0).cumsum() increases by one at every row where condition is 0, so each 0 starts a new group that runs until the next 0. The cumsum is then applied within each group separately.
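To make the grouping step easier to see, here is a minimal self-contained sketch that rebuilds the question's DataFrame and prints the intermediate key. The variable name group_key is only illustrative, and the bracket-style column access is a stylistic choice, not something the answer requires.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({"value": [1, 2, 3, 4, 5], "condition": [1, 1, 1, 0, 1]})

# The key increments on every row where condition is 0,
# so that row becomes the first row of a new group.
group_key = (df["condition"] == 0).cumsum()
print(group_key.tolist())  # [0, 0, 0, 1, 1]

# Cumulative sum of `value`, restarted at the start of each group.
df["cSum"] = df.groupby(group_key)["value"].cumsum()
print(df["cSum"].tolist())  # [1, 3, 6, 4, 9]
```

Note that the row where condition is 0 is counted as the start of the next run, which is why the last row is 4 + 5 = 9.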

ALollz


This is the exact problem I am trying to solve; however, the accepted solution raises an error in the current version of pandas. Do you know how to fix that? – UGuntupalli Jun 08 '21 at 01:55
