0

df['total'] = (df.DR - df.CR).cumsum()

is giving:

...
    Name     DR       CR     total 
303   B3  46.80     0.00  46682.07                                                                                                         
304   B3  45.20     0.00  46727.27                                                                                                                                                                                                                
395  BS1   0.00    10.37  47905.31                                                                                                         
396  BS2   0.00    87.00  47818.31                                                                                                         
397    C   0.00   482.10  47336.21                                                                                                                                                                                                               
399    C  20.00     0.00  47356.21

However I would like the cumsum to "restart" whenever the "Name" column (B3,BS1,C) changes to a different value.

So the desired result is:

    Name     DR       CR     total 
303   B3  46.80     0.00     46.80                                                                                                         
304   B3  45.20     0.00      1.60                                                                                                                                                                                                                
395  BS1   0.00    10.37    -10.37                                                                                                         
396  BS2   0.00    87.00    -97.37                                                                                                         
397    C   0.00   482.10   -482.10                                                                                                                                                                                                               
399    C  20.00     0.00   -462.10 

I am new to pandas. Thanks for your help

I have tried things like, but not working: df['total'] = df.groupby('GL')[(df.DR - df.CR)].cumsum()

tedioustortoise
  • 259
  • 3
  • 20
  • Does this answer your question? [Cumsum Reset based on a condition in Pandas](https://stackoverflow.com/questions/53079313/cumsum-reset-based-on-a-condition-in-pandas) – rpanai Jul 08 '20 at 19:17
  • 2
    `df.Name.ne(df.Name.shift()).cumsum()` would give you consecutive blocks on `Name` which you can groupby on. But I can't quite relate your `total` with your data. – Quang Hoang Jul 08 '20 at 19:22
  • This isn't working df['total'] = df.groupby('Name')['DR'-'CR'].cumsum() – tedioustortoise Jul 08 '20 at 19:31
  • or df['total'] = df.groupby('Name')[(df.DR - df.CR)].cumsum() is not working either – tedioustortoise Jul 08 '20 at 19:32
  • The result for the second row doesn't really make sense. 1.6 is not the commutative sum of 46.8 and 45.2. Could you please elaborate on that? – Roy2012 Jul 08 '20 at 19:40

1 Answers1

1

First variant - if you want to "join" all rows for each Name into a single group:

df['total'] = df.groupby('Name').apply(lambda grp:
    (grp.DR - grp.CR).cumsum()).reset_index(level=0, drop=True)

For your source data the result is:

    Name    DR      CR   total
303   B3  46.8    0.00   46.80
304   B3  45.2    0.00   92.00
395  BS1   0.0   10.37  -10.37
396  BS1   0.0   87.00  -97.37
397    C   0.0  482.10 -482.10
399    C  20.0    0.00 -462.10

Second variant - if any change in Name is to start a new group.

Assume that your DataFrame contains:

    Name    DR      CR
303   B3  46.8    0.00
304   B3  45.2    0.00
395  BS1   0.0   10.37
396  BS1   0.0   87.00
397    C   0.0  482.10
399    C  20.0    0.00
400   B3  53.0    8.00
401   B3  40.8    6.15

and the second B3 group is to be summed separately from the first group:

df['total'] = df.groupby((df.Name != df.Name.shift()).cumsum())\
    .apply(lambda grp: (grp.DR - grp.CR).cumsum()).reset_index(level=0, drop=True)

getting:

    Name    DR      CR   total
303   B3  46.8    0.00   46.80
304   B3  45.2    0.00   92.00
395  BS1   0.0   10.37  -10.37
396  BS1   0.0   87.00  -97.37
397    C   0.0  482.10 -482.10
399    C  20.0    0.00 -462.10
400   B3  53.0    8.00   45.00
401   B3  40.8    6.15   79.65

As you can see, the second B3 group is summed separately.

Valdi_Bo
  • 30,023
  • 4
  • 23
  • 41