Conditional counters in R

Question

I am working with a COVID dataset, and I need a counter that starts on the first day the virus appeared in each country.

This is an example of my data

And this is my desired result

I have been trying with this code:

data1 <- data1 %>%
  arrange(country, Date) %>%
  group_by(country) %>%
  mutate(Counter = Date - first(Date) + 1)

But this only gives me a counter starting from the first date in the data. How can I make day 1 the first day on which confirmed reaches 1?

Here is the example data:

structure(list(Date = structure(c(1577836800, 1577923200, 1578009600, 
1578096000, 1578182400, 1578268800, 1578355200, 1578441600, 1577836800, 
1577923200, 1578009600, 1578096000, 1578182400, 1578268800, 1578355200, 
1578441600, 1577836800, 1577923200, 1578009600, 1578096000, 1578182400, 
1578268800, 1578355200, 1578441600), class = c("POSIXct", "POSIXt"
), tzone = "UTC"), country = c("Afganistan", "Afganistan", "Afganistan", 
"Afganistan", "Afganistan", "Afganistan", "Afganistan", "Afganistan", 
"Colombia", "Colombia", "Colombia", "Colombia", "Colombia", "Colombia", 
"Colombia", "Colombia", "France", "France", "France", "France", 
"France", "France", "France", "France"), confirmed = c(0, 0, 
0, 0, 0, 1, 1, 2, 0, 0, 1, 1, 2, 3, 3, 3, 0, 0, 0, 0, 0, 1, 1, 
1)), row.names = c(NA, -24L), class = c("tbl_df", "tbl", "data.frame"
))

`ave(d$confirmed != 0, d$country, FUN = cumsum)` – Henrik, Jul 07 '21 at 18:27
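
The one-liner in the comment above can be tried as a small base R sketch. The data frame below is an illustrative subset of the question's example (two countries only), and the column name counter is an assumption. Note that this approach counts rows with a non-zero confirmed value rather than subtracting dates, so it only matches "days since the first case" if there is exactly one row per country per day, with no gaps, and confirmed never drops back to 0.

```r
# Illustrative subset of the question's example data; `d` is the name
# used in the comment, `counter` is a column name chosen for this sketch.
d <- data.frame(
  Date      = rep(as.Date("2020-01-01") + 0:7, times = 2),
  country   = rep(c("Afganistan", "Colombia"), each = 8),
  confirmed = c(0, 0, 0, 0, 0, 1, 1, 2,
                0, 0, 1, 1, 2, 3, 3, 3)
)

# Sort so the running count follows calendar order within each country.
d <- d[order(d$country, d$Date), ]

# Within each country, count the rows seen so far whose `confirmed`
# value is non-zero. Rows before the first case stay at 0.
d$counter <- ave(d$confirmed != 0, d$country, FUN = cumsum)

print(d)
```

No packages are needed here, which can be convenient when dplyr is not available.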

score 1 · Accepted Answer · answered Jul 08 '21 at 01:20

To get the first Date within a country group where the number of confirmed cases is greater than 0, you can use Date[which(confirmed > 0)][1]. For dates on or after that first confirmed date, the counter is the difference from it plus 1, similar to what you had tried; earlier dates get 0.

library(dplyr)

df %>%
  arrange(country, Date) %>%
  group_by(country) %>%
  mutate(first_confirmed = Date[which(confirmed > 0)][1],
         counter = ifelse(Date >= first_confirmed, Date - first_confirmed + 1, 0))

Output

   Date       country    confirmed first_confirmed counter
   <date>     <chr>          <dbl> <date>            <dbl>
 1 2020-01-01 Afganistan         0 2020-01-06            0
 2 2020-01-02 Afganistan         0 2020-01-06            0
 3 2020-01-03 Afganistan         0 2020-01-06            0
 4 2020-01-04 Afganistan         0 2020-01-06            0
 5 2020-01-05 Afganistan         0 2020-01-06            0
 6 2020-01-06 Afganistan         1 2020-01-06            1
 7 2020-01-07 Afganistan         1 2020-01-06            2
 8 2020-01-08 Afganistan         2 2020-01-06            3
 9 2020-01-01 Colombia           0 2020-01-03            0
10 2020-01-02 Colombia           0 2020-01-03            0
11 2020-01-03 Colombia           1 2020-01-03            1
12 2020-01-04 Colombia           1 2020-01-03            2
13 2020-01-05 Colombia           2 2020-01-03            3
14 2020-01-06 Colombia           3 2020-01-03            4
15 2020-01-07 Colombia           3 2020-01-03            5
16 2020-01-08 Colombia           3 2020-01-03            6
17 2020-01-01 France             0 2020-01-06            0
18 2020-01-02 France             0 2020-01-06            0
19 2020-01-03 France             0 2020-01-06            0
20 2020-01-04 France             0 2020-01-06            0
21 2020-01-05 France             0 2020-01-06            0
22 2020-01-06 France             1 2020-01-06            1
23 2020-01-07 France             1 2020-01-06            2
24 2020-01-08 France             1 2020-01-06            3

Data

df <- structure(list(Date = structure(c(18262, 18263, 18264, 18265, 
18266, 18267, 18268, 18269, 18262, 18263, 18264, 18265, 18266, 
18267, 18268, 18269, 18262, 18263, 18264, 18265, 18266, 18267, 
18268, 18269), class = "Date"), country = c("Afganistan", "Afganistan", 
"Afganistan", "Afganistan", "Afganistan", "Afganistan", "Afganistan", 
"Afganistan", "Colombia", "Colombia", "Colombia", "Colombia", 
"Colombia", "Colombia", "Colombia", "Colombia", "France", "France", 
"France", "France", "France", "France", "France", "France"), 
    confirmed = c(0, 0, 0, 0, 0, 1, 1, 2, 0, 0, 1, 1, 2, 3, 3, 
    3, 0, 0, 0, 0, 0, 1, 1, 1)), class = "data.frame", row.names = c(NA, 
-24L))
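
One thing to watch: the question's data stores Date as POSIXct, while the data above uses the Date class. Subtracting POSIXct values returns a difftime whose units are chosen automatically, so the result may come out in seconds rather than days. The base R sketch below converts to Date first and also returns 0 for a country with no confirmed cases. The helper name days_since_first_case, the data frame covid, and the country "Nowhere" are all hypothetical, and the timestamps are assumed to be midnight UTC as in the question.

```r
# Hypothetical helper: 1 on the first day with confirmed > 0, counting up
# by calendar days afterwards, 0 before that day, and 0 throughout when
# the group has no confirmed cases at all.
days_since_first_case <- function(date, confirmed) {
  day <- as.Date(date)                        # drop the time-of-day part
  first_day <- day[which(confirmed > 0)][1]   # NA if there are no cases
  if (is.na(first_day)) {
    return(rep(0, length(day)))
  }
  # Date - Date is always measured in days; clamp earlier days to 0.
  pmax(as.numeric(day - first_day) + 1, 0)
}

# Illustrative POSIXct data; "Nowhere" is a made-up group with no cases.
covid <- data.frame(
  Date      = rep(as.POSIXct("2020-01-01", tz = "UTC") + 86400 * (0:3),
                  times = 2),
  country   = rep(c("Colombia", "Nowhere"), each = 4),
  confirmed = c(0, 1, 1, 2, 0, 0, 0, 0)
)

# Apply the helper per country and put the results back in row order.
covid$counter <- unsplit(
  lapply(split(covid, covid$country),
         function(g) days_since_first_case(g$Date, g$confirmed)),
  covid$country
)

print(covid)
```

In the dplyr pipeline, a similar effect can probably be had by converting the column first, for example with mutate(Date = as.Date(Date)) before the grouped mutate.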
