I am trying to find a robust way to count, for each ID in a table, how many consecutive historical rows a condition has been true.
Here is the example data:
library(data.table)

DT <- data.table(update_date = rep(c("2022-01-01", "2022-01-02",
                                     "2022-01-03", "2022-01-04",
                                     "2022-01-05", "2022-01-06"), times = 2),
                 ID = c(rep("aapl", times = 6), rep("ibm", times = 6)),
                 b = c("U1", "U1", "U1", "U2", "U2", "U2", "D1", "D2", "D1", "D3", "D2", "D3"))
# Convert the character dates to Date
DT[, update_date := as.Date(update_date)]
    update_date   ID  b
 1:  2022-01-01 aapl U1
 2:  2022-01-02 aapl U1
 3:  2022-01-03 aapl U1
 4:  2022-01-04 aapl U2
 5:  2022-01-05 aapl U2
 6:  2022-01-06 aapl U2
 7:  2022-01-01  ibm D1
 8:  2022-01-02  ibm D2
 9:  2022-01-03  ibm D1
10:  2022-01-04  ibm D3
11:  2022-01-05  ibm D2
12:  2022-01-06  ibm D3
What I need to calculate, for each row, is how long the current value of b has persisted within its ID, i.e. the number of consecutive rows, up to and including that row, that share the same value of b.
So for ID == 'aapl', the value for row 6 would be 3, as the value "U2" has existed for 3 days (or rows). The value for row 5 would be 2. The value for row 3 would be 3 again, since "U1" has existed for 3 rows at that point.
For ID == 'ibm', row 12 would have 1. Row 11 would have 1 as well, as "D2" has only been true for 1 day (or row) since it last changed.
I can loop through each ID and day and look backward, but I'm wondering whether there is a more concise way to do this than going row by row.
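To make the expected result concrete, here is a minimal sketch of the row-by-row approach described above. It is only an illustration under assumptions: `run_len` is a hypothetical name for the result column, and the rows are assumed to be sorted by `update_date` within each `ID`.

```r
library(data.table)

DT <- data.table(
  update_date = as.Date(rep(c("2022-01-01", "2022-01-02", "2022-01-03",
                              "2022-01-04", "2022-01-05", "2022-01-06"), times = 2)),
  ID = rep(c("aapl", "ibm"), each = 6),
  b  = c("U1", "U1", "U1", "U2", "U2", "U2", "D1", "D2", "D1", "D3", "D2", "D3")
)

# Assumption: rows must be ordered by date within each ID.
setorder(DT, ID, update_date)

# run_len is a hypothetical column name for the desired count.
DT[, run_len := NA_integer_]

for (i in seq_len(nrow(DT))) {
  n <- 1L
  k <- i - 1L
  # Walk backward while the earlier row has the same ID and the same b.
  while (k >= 1L && DT$ID[k] == DT$ID[i] && DT$b[k] == DT$b[i]) {
    n <- n + 1L
    k <- k - 1L
  }
  set(DT, i = i, j = "run_len", value = n)
}

print(DT)
```

On the sample data this should give 1, 2, 3, 1, 2, 3 for aapl and 1 for every ibm row, which matches the values described above. It is this kind of backward scan that I would like to replace with something more concise.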