I'm trying to estimate some parameters across n factors in a data.table. While I'm familiar with using the by functionality to perform an operation by a factor, doing this for multiple sequential factors is causing some problems.
As an example, take the simplified data set:
df <- data.table(Group = c(rep("A", 2), rep("B", 3), rep("C", 2), rep("D", 4), "E", rep("F", 4)), Variable = round(rnorm(16), 2))
Group Variable
1: A 0.13
2: A 0.26
3: B -1.36
4: B -0.78
5: B -0.92
6: C 0.00
7: C -2.49
8: D -1.85
9: D 0.37
10: D -0.57
11: D 1.42
12: E -0.72
13: F -1.04
14: F 1.86
15: F 0.49
16: F 1.61
Using df[, mean(Variable), by = Group] would give the mean for each Group. However, I'd like to calculate the mean for the previous n Groups.
I've tried using df[, zoo::rollapply(Variable, n, mean), by = Group]; however, because the Groups are of different sizes, a fixed n will not work.
What I would like is functionality akin to df[, mean(Variable), by = "This Group and previous n Groups"].
The output I'm trying to achieve (for the case of n = 3) would look like:
Group Variable
1: A NA
2: A NA
3: B NA
4: B NA
5: B NA
6: C 0.13
7: C 0.13
8: D -1.36
9: D -1.36
10: D -1.36
11: D -1.36
12: E 0.00
13: F -1.85
14: F -1.85
15: F -1.85
16: F -1.85
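
To make the windowing concrete, here is a rough sketch of one possible reading of "this Group and the previous Groups": collapse to one row per Group, roll over n consecutive Groups, and join the result back. It rests on several assumptions that are not settled above: the window is n Groups wide including the current one, each Group's rows are contiguous and already in order, and the mean is pooled over all rows in the window (not a mean of Group means). The names per_group, total, count and window_mean are made up for illustration, and the numbers it yields are pooled means under these assumptions, so they need not match the table above.

```r
library(data.table)

# Same layout as the example; values copied from the printed table so the
# result does not depend on rnorm().
df <- data.table(
  Group    = c(rep("A", 2), rep("B", 3), rep("C", 2), rep("D", 4), "E", rep("F", 4)),
  Variable = c(0.13, 0.26, -1.36, -0.78, -0.92, 0.00, -2.49, -1.85,
               0.37, -0.57, 1.42, -0.72, -1.04, 1.86, 0.49, 1.61)
)
n <- 3  # assumed window width in Groups: the current Group plus the previous n - 1

# One row per Group (in order of first appearance): sum and number of rows.
per_group <- df[, .(total = sum(Variable), count = as.numeric(.N)), by = Group]

# Rolling sums over n consecutive Groups; the first n - 1 Groups come out as NA.
# Dividing the rolled sum by the rolled count gives the pooled mean of the window.
per_group[, window_mean := frollsum(total, n) / frollsum(count, n)]

# Update join: copy each Group's windowed mean onto every one of its rows.
df[per_group, window_mean := i.window_mean, on = "Group"]
```

Rolling over the per-Group sums and counts, not over the raw rows, is what sidesteps the unequal Group sizes that defeat a fixed-width rollapply on Variable.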
Any help would be appreciated.