0

A similar question has been asked for cumsum and grouping but it didn't solve my case.

I have a financial balance sheet of a lot of years and need to sum all previous values by year.

This is my reproducible set:

df=pd.DataFrame(
{"Amount": [265.95,2250.00,-260.00,-2255.95,120],
"Year": [2018,2018,2018,2019,2019]})

The result I want is the following:

Year  Amount
2017  0
2018  2255.95
2019  120.00
2020  120.00

So actually in a loop going from the lowest year in my whole set to the highest year in my set.

...
df[df.Year<=2017].Amount.sum()
df[df.Year<=2018].Amount.sum()
df[df.Year<=2019].Amount.sum()
df[df.Year<=2020].Amount.sum()
...
Konrad Rudolph
  • 530,221
  • 131
  • 937
  • 1,214
Philippe
  • 707
  • 1
  • 5
  • 9
  • Does this answer your question? [Pandas group-by and sum](https://stackoverflow.com/questions/39922986/pandas-group-by-and-sum) – Sociopath Jan 24 '20 at 09:49

1 Answers1

1

First step is aggregate sum, then use Series.cumsum and Series.reindex with forward filling missing values by all possible years, last replace first missing values to 0:

years = range(2017, 2021)
df1 = (df.groupby('Year')['Amount']
         .sum()
         .cumsum()
         .reindex(years, method='ffill')
         .fillna(0)
         .reset_index())
print (df1)
   Year   Amount
0  2017     0.00
1  2018  2255.95
2  2019   120.00
3  2020   120.00
jezrael
  • 822,522
  • 95
  • 1,334
  • 1,252
  • 1
    Great answer @jezrael ! Thanks a lot, you shared valuable insights in the use of sequences in functions as well ;-) Grazie Mille ! – Philippe Jan 27 '20 at 08:48