I am using R to analyse a time series. My goal is to count from "response" the consecutive sequences. I want to add a column which classifies my data according to consecutive sequences in column response. Example: row 1 is group 1 for id "A", row 3 is group 2 for id "A", row 6 to 9 is group 3 for id "A". The result what I want is shown in "want_group". The data has the following structure:
"row" "date" "id" "response" "want_group"
1 2021-10-06 "A" 1 1
2 2021-10-07 "A" 0 0
3 2021-10-08 "A" 1 2
4 2021-10-09 "A" 0 0
5 2021-10-10 "A" 0 0
6 2021-10-11 "A" 1 3
7 2021-10-12 "A" 1 3
8 2021-10-13 "A" 1 3
9 2021-10-14 "A" 1 3
10 2021-10-15 "A" 0 0
11 2021-10-16 "A" 1 4
12 2021-10-17 "A" 0 0
13 2021-10-18 "A" 0 0
14 2021-10-06 "B" 0 0
15 2021-10-07 "B" 0 0
16 2021-10-08 "B" 0 0
17 2021-10-09 "B" 1 1
18 2021-10-10 "B" 1 1
19 2021-10-11 "B" 0 0
20 2021-10-12 "B" 0 0
21 2021-10-13 "B" 0 0
22 2021-10-14 "B" 0 0
23 2021-10-15 "B" 0 0
24 2021-10-16 "B" 1 2
25 2021-10-17 "B" 1 2
26 2021-10-18 "B" 1 2
My Idea was to group the dataframe and calculate the cumsum of variable response to have a similiar structure like in length of longest consecutive elements of sequence, so that I have for row 3 in cs_res=1 and for row 6 to 9 in cs_res=1,2,3,4. But the cumsum was calculated for the hole id. I hope you have a hint for me to find a function in R or how I can find a solution.
df1 <- data.frame(row = c(1:13),
date = seq.Date(as.Date("2021-10-06"), as.Date("2021-10-18"), "day"),
id = rep("A", times = 13),
response = c(1, 0, 1, 0, 0, 1, 1, 1, 1, 0, 1, 0, 0),
want_group = c(1, 0, 2, 0, 0, 3, 3, 3, 3, 0, 4, 0, 0) )
df2 <- data.frame(row = c(14:26),
date = seq.Date(as.Date("2021-10-06"), as.Date("2021-10-18"), "day"),
id = rep("B", times = 13),
response = c(0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 1, 1, 1),
want_group = c(0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 2, 2, 2) )
df <- rbind(df1, df2)
df %>%
group_by(id, response) %>%
mutate(
cs_res = if_else(response == 1L, sequence(rle(response)$lengths), 0L)
)
"row" "id" "response" "cs_res"
1 "A" 1 1
2 "A" 0 0
3 "A" 1 2
4 "A" 0 0
5 "A" 0 0
6 "A" 1 3
7 "A" 1 4
8 "A" 1 5
9 "A" 1 6
10 "A" 0 0
11 "A" 1 7
12 "A" 0 0
13 "A" 0 0
14 "B" 0 0
15 "B" 0 0
.
.
.