Running sum in pandas (without loop)

Question

I'd like to build a running sum over a pandas dataframe. I have something like:

10/10/2012:  50,  0
10/11/2012: -10, 90
10/12/2012: 100, -5

And I would like to get:

10/10/2012:  50,  0
10/11/2012:  40, 90
10/12/2012: 140, 85

So every cell should be the sum of itself and all previous cells in its column. How should I do this without using a loop?

Hint - the normal name for "running sum" is "cumulative sum" - commonly shortened to `cumsum` - a quick search in the docs and you should be good to go :) – Jon Clements Dec 14 '12 at 12:54
Thanks @JonClements, that was what I was looking for. I just couldn't find the right term to search for. – leo Dec 14 '12 at 12:55
(Somewhat) related: http://stackoverflow.com/questions/12370349/reasoning-about-consecutive-data-points-without-using-iteration – codeape Dec 14 '12 at 13:24

score 30 · Accepted Answer · answered Dec 14 '12 at 13:25

As @JonClements mentions, you can do this using the DataFrame `cumsum` method:

from pandas import DataFrame
df = DataFrame({0: {'10/10/2012': 50, '10/11/2012': -10, '10/12/2012': 100},
                1: {'10/10/2012': 0, '10/11/2012': 90, '10/12/2012': -5}})

In [3]: df
Out[3]: 
              0   1
10/10/2012   50   0
10/11/2012  -10  90
10/12/2012  100  -5

In [4]: df.cumsum()
Out[4]: 
              0   1
10/10/2012   50   0
10/11/2012   40  90
10/12/2012  140  85
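
For anyone who wants to paste and run this outside an IPython session, here is a minimal self-contained sketch of the same idea. It assumes a reasonably recent pandas and rebuilds the question's data with an explicit index; the variable name `running` is only illustrative.

```python
import pandas as pd

# Same data as in the question, with the dates kept as plain string labels.
df = pd.DataFrame(
    {0: [50, -10, 100], 1: [0, 90, -5]},
    index=["10/10/2012", "10/11/2012", "10/12/2012"],
)

# axis=0 accumulates down the rows, separately for each column.
running = df.cumsum(axis=0)
print(running)
```

No Python-level loop is needed, since each column is accumulated by the library call itself.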

Andy Hayden

For some reason, this did not work for my case. I had to do: df['XYX'] = df['XYZ'].cumsum() – Lokesh A. R. Sep 29 '13 at 15:27
@user1815357 very strange! Do you mind posting an example as an issue on github (perhaps it's a bug) https://github.com/pydata/pandas/issues?direction=desc&sort=updated&state=open – Andy Hayden Sep 29 '13 at 17:15
Sure. Will do in a few hours. – Lokesh A. R. Sep 30 '13 at 12:18
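
One possible reading of the comment above (an assumption, since no example was posted) is that `cumsum` returns a new object rather than modifying the frame in place, so the result has to be assigned somewhere. A small sketch, where the column name `XYZ` comes from the comment and `XYZ_running` is a hypothetical name chosen for illustration:

```python
import pandas as pd

df = pd.DataFrame({"XYZ": [50, -10, 100]})

# Calling cumsum without keeping the result leaves df untouched.
df.cumsum()
unchanged = df["XYZ"].tolist()

# Assigning the result stores the running sum, here in a new column.
df["XYZ_running"] = df["XYZ"].cumsum()
print(df)
```

Writing `df = df.cumsum()` would instead replace every column with its running sum.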
