This question is a continuation of this one.
Given a data.table, I would like to extract the cumulative unique elements until there are three unique values OR a reset is triggered by `t`, then reset and resume:
library(data.table)
y <- data.table(a = c(1, 2, 2, 3, 3, 4, 3, 2, 2, 5, 6, 7, 9, 8),
                t = c(F, F, F, F, F, F, F, T, F, F, F, F, F, F))
The desired output is:
a t output
1 FALSE 1
2 FALSE 1 2
2 FALSE 1 2
3 FALSE 1 2 3
3 FALSE 1 2 3
4 FALSE 4 # 4 is the fourth unique element, so it resets and starts again
3 FALSE 3 4
2 TRUE 2 # because `t` is `TRUE`, it resets and starts again
2 FALSE 2
5 FALSE 2 5
6 FALSE 2 5 6
7 FALSE 7 # 7 is the fourth unique element, so it resets and starts again
9 FALSE 7 9
8 FALSE 7 8 9
Based on thelatemail's solution in the linked question, I tried the following function:
unionlim_trigger <- function(x,y,n=4, trigger = FALSE) {
u <- union(x,y)
if(length(u) == n | trigger == TRUE) y else u
}
However, when I apply it with:
y[, out := sapply(Reduce(function(x,y,trigger) unionlim_trigger(x=x, y = y, trigger = t), a, accumulate=TRUE), paste, collapse=" ")]
I get the warnings:
In if (length(u) == n | trigger == T) y else u :
the condition has length > 1 and only the first element will be used
I understand that this happens because instead of passing the i-th element of `t`, I am passing the whole vector.
How do I solve that? I tried using mapply and the instruction below, with no success:
y[, out := sapply(Reduce(function(x,y,trigger) unionlim_trigger(x=x, y = y, trigger = t[.I]), a, accumulate=TRUE), paste, collapse=" ")]
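For reference, here is a minimal sketch of the direction I think might work: reducing over the row indices rather than over `a`, so that step i can look up both `a[i]` and the i-th trigger. It uses plain vectors instead of the data.table, the name `trig` is just a stand-in for my column `t`, and I sort each set before pasting only to match the layout of the desired output above. I am not sure this is the idiomatic way to do it.

```r
# Data from the question, as plain vectors (no package needed for the sketch).
a    <- c(1, 2, 2, 3, 3, 4, 3, 2, 2, 5, 6, 7, 9, 8)
trig <- c(FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE,
          TRUE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE)  # stand-in for column `t`

# Same idea as before, but `trigger` is expected to be a single logical value,
# so the scalar operator `||` is used instead of the vectorised `|`.
unionlim_trigger <- function(x, y, n = 4, trigger = FALSE) {
  u <- union(x, y)
  if (length(u) == n || isTRUE(trigger)) y else u
}

# Reduce over indices 2..length(a); the accumulated set is the running state
# and a[1] is the initial state, so the result has one entry per row.
acc <- Reduce(
  function(state, i) unionlim_trigger(state, a[i], trigger = trig[i]),
  seq_along(a)[-1],
  accumulate = TRUE,
  a[1]
)

# Collapse each accumulated set into a single string (sorted for display).
out <- sapply(acc, function(v) paste(sort(v), collapse = " "))
print(out)
```

Presumably the same `Reduce` call could be placed inside `y[, out := ...]`, using the columns `a` and `t` in place of the two vectors, but I have not confirmed that.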