Consider the following data:
df <- structure(list(date = structure(c(10904, 10613, 10801, 10849,
10740, 10680, 10780, 10909, 10750, 10814), class = "Date"), group = c(1L,
2L, 1L, 2L, 1L, 1L, 1L, 2L, 2L, 2L)), class = "data.frame", row.names = c(NA,
-10L))
which gives:
date group
1 1999-11-09 1
2 1999-01-22 2
3 1999-07-29 1
4 1999-09-15 2
5 1999-05-29 1
6 1999-03-30 1
7 1999-07-08 1
8 1999-11-14 2
9 1999-06-08 2
10 1999-08-11 2
I now want to calculate
- a) the months that have been passed between two neighbouring dates per group (I know how to do this)
- b) flag rows when a certain time period (3 months) has passed and if it has passed, i would kind of reset and again look when 3 months have been passed from that date.
So for a) I'm doing this:
library(tidyverse)
library(lubridate)
df %>%
group_by(group) %>%
arrange(group, date) %>%
mutate(months_passed = time_length(interval(lag(date), date), "months"))
which gives:
# A tibble: 10 x 3
# Groups: group [2]
date group months_passed
<date> <int> <dbl>
1 1999-03-30 1 NA
2 1999-05-29 1 1.97
3 1999-07-08 1 1.3
4 1999-07-29 1 0.677
5 1999-11-09 1 3.35
6 1999-01-22 2 NA
7 1999-06-08 2 4.55
8 1999-08-11 2 2.10
9 1999-09-15 2 1.13
10 1999-11-14 2 1.97
But for b) I'm lost. What I want to do is:
- Look at each group separately.
- Calculate the months_passed between row 1 and row 2 (here: 1.97 months for group 1)
- If it is < 3 months, continue with the next row and calculate the time difference between the already passed months and the current time difference (here: 1.97 + 1.3 months).
- Now that the difference is bigger >= 3 months, I want to flag row 3.
- Now I reset the cumulative time difference and again start calculating the difference with the next row (here: 0.67 months) and so on.
Expected outcome would be:
# A tibble: 10 x 4
# Groups: group [2]
date group months_passed time_flag
<date> <int> <dbl> <int>
1 1999-03-30 1 NA 0
2 1999-05-29 1 1.97 0
3 1999-07-08 1 1.3 1
4 1999-07-29 1 0.677 0
5 1999-11-09 1 3.35 1
6 1999-01-22 2 NA 0
7 1999-06-08 2 4.55 1
8 1999-08-11 2 2.10 0
9 1999-09-15 2 1.13 1
10 1999-11-14 2 1.97 0
Any ideas?