2

I have the following dataset :

date     sales
201201   5
201202   5
201203   5
201204   5
201205   5
201206   5
201207   5
201208   5
201209   5
201210   5
201211   5
201212   5
201301   100
201302   100

And I want to compute the cumulative sum of sales, from the beginning to the actual date + 12 months

So here :

date     sales   expected
201201   5       60
201202   5       160
201203   5       260
201204   5       260 
201205   5       260
201206   5       260
201207   5       260
201208   5       260
201209   5       260
201210   5       260
201211   5       260
201212   5       260
201301   100     260
201302   100     260

According to this question How to compute cumulative sum of previous N rows in pandas? I tried :

df['sales'].rolling(window=12).sum()

However I am looking for something more like this :

df['sales'].rolling(window=['unlimited preceding, 11 following']).sum()
Vincent
  • 1,534
  • 3
  • 20
  • 42
  • Sorry, to ask this - why second value is 160 and third value becomes 260? What's the logic behind it? – cph_sto Jan 21 '19 at 09:53
  • First expected value is sum(sales from beginning until 201301) = 60. Second expected value is sum(sales from beginning until 201302) = 160 as sales are 100 in 201301. Then third value is 260 because I add the 100 sales from 201302, etc – Vincent Jan 21 '19 at 09:56

1 Answers1

3

Use cumsum directly thanks shift by 11, than use ffill to fill NaNs with previous value:

df['expected'] = df['sales'].cumsum().shift(-11).ffill()

And now:

print(df)

Is:

      date  sales  expected
0   201201      5      60.0
1   201202      5     160.0
2   201203      5     260.0
3   201204      5     260.0
4   201205      5     260.0
5   201206      5     260.0
6   201207      5     260.0
7   201208      5     260.0
8   201209      5     260.0
9   201210      5     260.0
10  201211      5     260.0
11  201212      5     260.0
12  201301    100     260.0
13  201302    100     260.0
U13-Forward
  • 69,221
  • 14
  • 89
  • 114