
I am importing COVID data from a website into a pandas DataFrame. The data includes cases and deaths. I would like to add columns for cum_cases and cum_deaths. This is simple in a spreadsheet, and I would do it there before importing the CSV file, but there are too many rows to make that easy, and besides, I would like to know how to do it in pandas. In a spreadsheet I insert a 0 into the first cell of each of the cumulative cases (cum_cases) and cumulative deaths (cum_deaths) columns and then use formulas. For example, in cell D5 the formula would be =D4+C5, assuming the daily and cumulative columns for cases (or deaths) are adjacent.

But I don't have a clue how to do the same thing in a DataFrame, and I have spent several days searching DuckDuckGo and Google for solutions, to no avail.

What I have

|Date        |cases  |deaths  |
|:----------:|:-----:|:------:|
|2019-12-31  |  0    |  0     |
|2020-01-01  |  1    |  2     |
|2020-01-02  |  3    |  5     |

What I want


|Date        |cases  |deaths  |cum_cases| cum_deaths|
|:----------:|:-----:|:------:|:-------:|:---------:|
|2019-12-31  |  0    |  0     |0        | 0         |
|2020-01-01  |  1    |  2     |1        | 2         |
|2020-01-02  |  3    |  5     |4        | 7         |
|2020-01-03  |  1    |  2     |5        | 9         |

  • As stated above, it's a duplicate question. Adding this for completeness: pandas has a cumulative sum API called **cumsum**: `df["cum_cases"] = df["cases"].cumsum()`. If there are NaNs, they will be skipped. – sam May 31 '20 at 10:25
  • Thanks. I think I won't ask any more questions. It's just too difficult for a newbie. – John Doucette May 31 '20 at 12:21
  • No problem, John. It's totally okay to ask questions. That's why we have Stack Overflow!! – sam May 31 '20 at 12:24
  • Thanks, sam. You gave me the information I needed. It's guys like you, who tack on a short comment with a short answer after a question is closed, that will bring me back. – John Doucette May 31 '20 at 13:44
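
For completeness, the `cumsum` suggestion from the comments can be sketched as a small runnable example. This is only an illustration under a few assumptions: the DataFrame is built by hand from the rows in the "What I want" table rather than loaded from the real CSV, the column names match that table, and sorting by `Date` first is an added precaution that the question does not mention.

```python
import pandas as pd

# Sample rows copied from the "What I want" table in the question
# (the real data would come from the imported CSV instead).
df = pd.DataFrame(
    {
        "Date": pd.to_datetime(
            ["2019-12-31", "2020-01-01", "2020-01-02", "2020-01-03"]
        ),
        "cases": [0, 1, 3, 1],
        "deaths": [0, 2, 5, 2],
    }
)

# Assumption: running totals should follow chronological order,
# so sort by date before accumulating.
df = df.sort_values("Date").reset_index(drop=True)

# cumsum() adds each value to the total of all rows above it,
# which is what the spreadsheet formula =D4+C5 does row by row.
df["cum_cases"] = df["cases"].cumsum()
df["cum_deaths"] = df["deaths"].cumsum()

print(df)
```

With pandas' default `skipna=True`, a NaN in `cases` or `deaths` does not break the running total; the cumulative column is NaN on that row and the sum continues on the next one. If missing values should count as zero instead, calling `fillna(0)` on the column before `cumsum()` is one option.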
