
I'd like to build a running sum over a pandas dataframe. I have something like:

10/10/2012:  50,  0
10/11/2012: -10, 90
10/12/2012: 100, -5

And I would like to get:

10/10/2012:  50,  0
10/11/2012:  40, 90
10/12/2012: 140, 85

So every cell should be the sum of itself and all previous cells in its column. How can I do this without using a loop?

Andy Hayden
leo
  • Hint - the normal name for "running sum" is "cumulative sum" - commonly shortened to `cumsum` - a quick search in the docs and you should be good to go :) – Jon Clements Dec 14 '12 at 12:54
  • Thanks @JonClements, that was what I did search for. I just couldn't find the term I was searching for. – leo Dec 14 '12 at 12:55
  • (Somewhat) related: http://stackoverflow.com/questions/12370349/reasoning-about-consecutive-data-points-without-using-iteration – codeape Dec 14 '12 at 13:24

1 Answer


As @JonClements mentions, you can do this with the DataFrame `cumsum` method, which by default accumulates down each column:

from pandas import DataFrame
df = DataFrame({0: {'10/10/2012': 50, '10/11/2012': -10, '10/12/2012': 100}, 1: {'10/10/2012': 0, '10/11/2012': 90, '10/12/2012': -5}})

In [3]: df
Out[3]: 
              0   1
10/10/2012   50   0
10/11/2012  -10  90
10/12/2012  100  -5

In [4]: df.cumsum()
Out[4]: 
              0   1
10/10/2012   50   0
10/11/2012   40  90
10/12/2012  140  85
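
For anyone who wants to check the result outside an interactive session, here is a minimal sketch of the same idea as a plain script. The variable names `running` and `across` are just illustrative, and the `axis=1` call is an extra that the question did not ask for; it is included only to show the direction in which the sum accumulates.

```python
import pandas as pd

# Same data as in the question: rows are dates, columns are 0 and 1.
df = pd.DataFrame(
    {0: [50, -10, 100], 1: [0, 90, -5]},
    index=["10/10/2012", "10/11/2012", "10/12/2012"],
)

# Default (axis=0): accumulate down the rows, independently for each column.
running = df.cumsum()
print(running)

# axis=1: accumulate across the columns within each row instead.
across = df.cumsum(axis=1)
print(across)
```

Note that `cumsum` follows the existing row order, so if the dates might be out of order, sort the frame first.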
Andy Hayden