For each row, I want to count the earlier rows with the same id that fall within the preceding 1-year window.
Here's my data:
df <- structure(list(id = c("1", "1", "1", "1",
"2", "2", "2", "2", "2", "2", "2",
"2", "2"), flag = c(1, 1, 0, 1, 0, 0, 1, 1,
1, 1, 1, 1, 1), date = structure(c(15425, 15456, 16613,
16959, 15513, 15513, 15625, 15635, 15649, 15663, 15670, 16051,
16052), class = "Date")), sorted = "id", class = c("data.table",
"data.frame"), row.names = c(NA, -13L))
roll_sum <- c(0, 1, 0, 1, 0, 1, 2, 3, 4, 5, 6, 0, 1)
flag_sum <- c(0, 1, 0, 0, 0, 0, 0, 1, 2, 3, 4, 0, 1)
df_desired <- cbind(df, roll_sum) # roll_sum: number of earlier rows for the same id within the preceding 1 year (current row excluded)
df_desired <- cbind(df_desired, flag_sum) # flag_sum: same count, restricted to rows where flag was 1
Data:
    id flag       date
 1:  1    1 2012-03-26
 2:  1    1 2012-04-26
 3:  1    0 2015-06-27
 4:  1    1 2016-06-07
 5:  2    0 2012-06-22
 6:  2    0 2012-06-22
 7:  2    1 2012-10-12
 8:  2    1 2012-10-22
 9:  2    1 2012-11-05
10:  2    1 2012-11-19
11:  2    1 2012-11-26
12:  2    1 2013-12-12
13:  2    1 2013-12-13
Output:
df_desired
    id flag       date roll_sum flag_sum
 1:  1    1 2012-03-26        0        0
 2:  1    1 2012-04-26        1        1
 3:  1    0 2015-06-27        0        0
 4:  1    1 2016-06-07        1        0
 5:  2    0 2012-06-22        0        0
 6:  2    0 2012-06-22        1        0
 7:  2    1 2012-10-12        2        0
 8:  2    1 2012-10-22        3        1
 9:  2    1 2012-11-05        4        2
10:  2    1 2012-11-19        5        3
11:  2    1 2012-11-26        6        4
12:  2    1 2013-12-12        0        0
13:  2    1 2013-12-13        1        1
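To pin down the definition of the two columns, here is a brute-force sketch in base R that reproduces roll_sum and flag_sum for the sample data. It rests on a few assumptions that are not stated above: "1 year" is taken as 365 days, rows are already sorted by date within id, and for duplicate dates only rows that appear earlier in the table are counted. The names d and window_days are just illustrative.

```r
# Sample data rebuilt as a plain data.frame (same values as df above)
d <- data.frame(
  id   = c("1", "1", "1", "1", "2", "2", "2", "2", "2", "2", "2", "2", "2"),
  flag = c(1, 1, 0, 1, 0, 0, 1, 1, 1, 1, 1, 1, 1),
  date = as.Date(c(15425, 15456, 16613, 16959, 15513, 15513, 15625, 15635,
                   15649, 15663, 15670, 16051, 16052), origin = "1970-01-01"),
  stringsAsFactors = FALSE
)

window_days <- 365  # assumption: "1 year" means 365 days

n <- nrow(d)
roll_sum <- integer(n)
flag_sum <- integer(n)

for (i in seq_len(n)) {
  j <- seq_len(i - 1)  # rows strictly before row i (empty for i = 1)
  # earlier rows with the same id whose date is within the last 365 days
  in_window <- d$id[j] == d$id[i] & d$date[j] >= d$date[i] - window_days
  roll_sum[i] <- sum(in_window)
  flag_sum[i] <- sum(d$flag[j][in_window] == 1)  # only rows where flag was 1
}

cbind(d, roll_sum, flag_sum)
```

This loop is quadratic in the number of rows, so it only serves as a reference for the expected result, not as the optimized solution I am looking for.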
I tried the zoo-based solution given by G. Grothendieck in "Compute rolling sum by id variables, with missing timepoints", but it gives me an error:
Error in merge.zoo(z, g) : series cannot be merged with non-unique index entries in a series
In addition: Warning message:
In zoo(count, date) :
I made the date column unique using make.index.unique and make.time.unique.
Any help with an optimized solution would be appreciated. Thanks.