I have a question about filtering on dates in R. I found, for example, the question "dplyr filter on Date", which shows how to filter on a specific date range with dplyr. I would like to select a dynamic range instead, e.g. count the number of critical jobs in a given window, such as the last seven days counted back from the current date in the dataset. The code I have in mind looks something like this:
library(dplyr)
library(lubridate)  # for days()

my.data %>%
  group_by(category) %>%
  filter(date > date - days(7) & date <= date) %>%
  mutate(ncrit = sum(critical == 'yes'))
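This does not work as intended: as far as I can tell, the condition compares each row's date with itself, so it is always TRUE and no rows are dropped. Is there a way to get this running with dplyr?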
Edit:
Apologies for the unclear post. To complete it, first the idea: imagine computers running jobs. If a computer has failed to compute jobs in the past x days, it is more likely to fail on the current job as well. A dummy dataset contains the computer category (e.g. A/B), the date, and whether the job failed (yes/no).
Using the dataset from Rui Barradas, I would like to use dplyr to add the following column, 'number of critical jobs in past 3 days' (in this case x = 3):
head(my.data, 7)
  category       date critical number of critical jobs in past 3 days
1        A 2018-08-14      yes                                     NA
2        A 2018-08-15       no                                     NA
3        A 2018-08-16      yes                                     NA
4        A 2018-08-17       no                                      2
5        A 2018-08-18      yes                                      1
6        A 2018-08-19       no                                      2
7        A 2018-08-20      yes                                      1
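To make the expected column precise, here is a brute-force sketch of the calculation I have in mind. It is only an illustration, not necessarily the idiomatic dplyr way I am looking for. It assumes one row per day per category, counts the x days strictly before each row's date, and returns NA when the window reaches back before the first date in the group. The names jobs, x and ncrit are made up for this example, and the data are the seven rows shown above with fixed dates, so the result does not depend on Sys.Date().

```r
library(dplyr)

# Fixed-date copy of the seven rows shown above (category A only).
jobs <- data.frame(
  category = "A",
  date     = seq(as.Date("2018-08-14"), by = "day", length.out = 7),
  critical = c("yes", "no", "yes", "no", "yes", "no", "yes")
)

x <- 3  # window length in days (illustrative name)

result <- jobs %>%
  group_by(category) %>%
  arrange(date, .by_group = TRUE) %>%
  mutate(
    # For each row, count 'yes' in the x days strictly before that row's date.
    ncrit = vapply(seq_along(date), function(i) {
      window_start <- date[i] - x
      if (window_start < min(date)) {
        NA_integer_  # incomplete window: not enough history in this group
      } else {
        sum(critical[date >= window_start & date < date[i]] == "yes")
      }
    }, integer(1))
  ) %>%
  ungroup()

print(result)
```

This loops over every row within each group, so it is quadratic in the group size. That is fine for the dummy data, but I would prefer a cleaner way to express the rolling window.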
Data (Rui Barradas):
set.seed(3635)
my.data <- data.frame(category = rep(c('A', 'B'), each = 10),
                      date = rep(seq(Sys.Date() - 9, Sys.Date(), by = 'days')),
                      critical = sample(c('no', 'yes'), 20, TRUE))