Calculating Non-Cumulative into a new column from cumulative values (Python)

Question

My goal is to convert below - Current Data Output

By adding a new column called Cases which is non-cumulative for further analysis.

I have tried this solution but I am not sure what is my mistake -

Data["Cases"] = Data.groupby(level=0).diff().fillna(Data).reset_index()

Please advise

@EdChum answered a similar question but i couldn't adapt - https://stackoverflow.com/questions/36452024/python-pandas-unroll-remove-cumulative-sum — CHATSHD, Apr 06 '20 at 19:19
Also now trying Data['Cases'] = Data.groupby(level=0).diff().fillna('Cumulative_Cases').reset_index() — CHATSHD, Apr 06 '20 at 19:20

score 2 · Accepted Answer · answered Apr 06 '20 at 19:26

A simple approach will be to take the lag of values in cumulative column and then calculating the difference between cumulative and lag.

import pandas as pd

date = ['2020-03-01','2020-03-02','2020-03-03','2020-03-04','2020-03-05'] 
cases = [1, 2, 4, 7, 10]
data = list(zip(date, cases))

df = pd.DataFrame(data, columns=['Date','Cases'])
df['Lag'] = df.Cases.shift(1).fillna(0)
df['Daily Cases'] = df.Cases - df.Lag

print(df)
         Date  Cases  Lag  Daily Cases
0  2020-03-01      1  0.0          1.0
1  2020-03-02      2  1.0          1.0
2  2020-03-03      4  2.0          2.0
3  2020-03-04      7  4.0          3.0
4  2020-03-05     10  7.0          3.0

Just remember that order is important here. So sort your cumulative column in ascending order.

Don't forget to mark this as accepted answer if it helps you

Calculating Non-Cumulative into a new column from cumulative values (Python)

1 Answers1