Assuming you have only two columns, one for dates and one for cumulative cases, you can get the daily number of cases by subtracting the previous day's cumulative value from each day's cumulative value.
In dplyr, you can use the lag function for that:
Here is a fake, reproducible dataset (I intentionally kept the original cases column so that you can check that the calculation is correct):
library(lubridate)  # provides ymd()
df <- data.frame(date = seq(ymd("2020-01-01"), ymd("2020-01-10"), by = "day"),
                 cases = sample(10:100, 10))  # random draw, so your values will differ
df$cumCase <- cumsum(df$cases)
library(dplyr)
df %>% mutate(Orig_cases = ifelse(row_number()==1, cumCase, cumCase - lag(cumCase)))
         date cases cumCase Orig_cases
1  2020-01-01    88      88         88
2  2020-01-02    49     137         49
3  2020-01-03    14     151         14
4  2020-01-04    35     186         35
5  2020-01-05    67     253         67
6  2020-01-06    23     276         23
7  2020-01-07    95     371         95
8  2020-01-08    63     434         63
9  2020-01-09    17     451         17
10 2020-01-10    90     541         90
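The ifelse(row_number()==1, ...) guard is there because lag returns NA for the first row. As a possible shortcut, and assuming the cumulative count starts from zero before the first date, you could set the default argument of lag to 0, or use base R's diff. This is only a sketch: df2, daily_lag and daily_diff are made-up names, and the values are the first five cumulative counts from the output above.

```r
library(dplyr)

# Hypothetical data: first five cumulative counts from the output above
df2 <- data.frame(cumCase = c(88, 137, 151, 186, 253))

df2 <- df2 %>%
  mutate(
    # lag() fills the first row with `default` instead of NA
    daily_lag  = cumCase - lag(cumCase, default = 0),
    # base R alternative: prepend 0, then take successive differences
    daily_diff = diff(c(0, cumCase))
  )

df2$daily_lag
# 88 49 14 35 67
```

Both columns give the same daily counts, and running cumsum on either one gives back cumCase, which is a quick way to check the result.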
Now that you have the correct calculation, you can pass it to ggplot by doing:
library(dplyr)
library(ggplot2)
df %>% mutate(Orig_cases = ifelse(row_number()==1, cumCase, cumCase - lag(cumCase))) %>%
  ggplot(aes(x = date, y = Orig_cases)) +
  geom_col() +                               # daily cases as bars
  geom_line(aes(y = cumCase, group = 1))     # cumulative cases as a line
