I'm looking at the Johns Hopkins dataset.
I have transformed it into the following form (dummy data):
Country/Region | Province/State | Date | Type | Cases
US | Arizona | 2020/03/14 | Confirmed | 100
US | Arizona | 2020/03/15 | Confirmed | 120
What I want is to calculate the difference in Cases between date n and date n-1 for each country, region, and case type.
Something like
df['Difference'] = df.groupby(['Country/Region','Province/State','Type']).apply(...)
But I am not sure how to write the apply function.
I want the output table to look like this.
Country/Region | Province/State | Date | Type | Cases | Difference
US | Arizona | 2020/03/14 | Confirmed | 100 | ...
US | Arizona | 2020/03/15 | Confirmed | 120 | 20
How is this achieved?
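
One possible approach, as a minimal sketch rather than a definitive answer: pandas has a built-in `diff()` on grouped columns, so a custom `apply` function may not be needed. The sketch assumes the column names from the tables above, that `Date` can be parsed with the format `%Y/%m/%d`, and that there is one row per date within each group. The two "Deaths" rows are hypothetical extra dummy data, added only to show that each group is differenced separately.

```python
import pandas as pd

# Dummy data in the shape described above, deliberately out of date order.
# The "Deaths" rows are hypothetical and only illustrate a second group.
df = pd.DataFrame({
    "Country/Region": ["US", "US", "US", "US"],
    "Province/State": ["Arizona", "Arizona", "Arizona", "Arizona"],
    "Date": ["2020/03/15", "2020/03/14", "2020/03/14", "2020/03/15"],
    "Type": ["Confirmed", "Deaths", "Confirmed", "Deaths"],
    "Cases": [120, 1, 100, 3],
})

# Parse the dates so that sorting is chronological rather than lexical.
df["Date"] = pd.to_datetime(df["Date"], format="%Y/%m/%d")

keys = ["Country/Region", "Province/State", "Type"]

# diff() compares each row with the previous row in its group,
# so the rows must be in date order within each group first.
df = df.sort_values(keys + ["Date"]).reset_index(drop=True)

# The first date of every group has no predecessor and becomes NaN.
df["Difference"] = df.groupby(keys)["Cases"].diff()

print(df)
```

The first row of each group comes out as NaN; appending `.fillna(0)` to the last assignment is one option if a number is preferred there. If `Province/State` is empty for some countries, note that `groupby` excludes missing keys by default, so those rows would get NaN; passing `dropna=False` to `groupby` (pandas 1.1 or later) or filling the blanks beforehand should avoid that.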