library(data.table)
set.seed(123)
dt = data.table( grp=round(runif(10)), val=c(runif(4), NA, runif(4), NA) )
dt
Output is:
    grp        val
 1:   0 0.95683335
 2:   1 0.45333416
 3:   0 0.67757064
 4:   1 0.57263340
 5:   1         NA
 6:   0 0.10292468
 7:   1 0.89982497
 8:   1 0.24608773
 9:   1 0.04205953
10:   0         NA
I'd like to fill the NA entries of val with the previous non-NA value of val within each grp.
The SO question "Replacing NAs with latest non-NA value" has an excellent answer, which I do not fully understand. Nonetheless, I tried:
dt[ , val2 := val[1], .(grp, cumsum(!is.na(val))) ]
dt
Output is:
    grp        val       val2
 1:   0 0.95683335 0.95683335
 2:   1 0.45333416 0.45333416
 3:   0 0.67757064 0.67757064
 4:   1 0.57263340 0.57263340
 5:   1         NA 0.57263340
 6:   0 0.10292468 0.10292468
 7:   1 0.89982497 0.89982497
 8:   1 0.24608773 0.24608773
 9:   1 0.04205953 0.04205953
10:   0         NA         NA
This almost works (it correctly filled in row 5). Why does the 10th row of dt still have an NA value in val2 instead of 0.10292468 (the previous non-NA value for grp == 0)?
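One thing that seems worth checking is how the grouping key is built. As far as I can tell, an expression inside by is evaluated once over the whole table, so cumsum(!is.na(val)) does not restart for each grp. The sketch below materializes that counter next to a per-group version so the two keys can be compared for row 10. The column names id_global, id_by_grp, val3 and val4 are made up for illustration, and the nafill() call assumes data.table 1.12.4 or later.

```r
library(data.table)

# Same data as above
set.seed(123)
dt <- data.table(grp = round(runif(10)), val = c(runif(4), NA, runif(4), NA))

# Counter as used in the attempt: computed over the whole table.
# Rows 9 and 10 share the same counter value but differ in grp,
# so row 10 ends up alone in its (grp, counter) group.
dt[, id_global := cumsum(!is.na(val))]

# Hypothetical alternative: the same counter, restarted within each grp.
dt[, id_by_grp := cumsum(!is.na(val)), by = grp]

# Fill using the per-group counter as part of the key.
dt[, val3 := val[1], by = .(grp, id_by_grp)]

# For comparison: last observation carried forward within each grp
# (assumes data.table >= 1.12.4, where nafill() is available).
dt[, val4 := nafill(val, type = "locf"), by = grp]

dt
```

If this reading is right, row 10 shares its id_by_grp with row 6 (the last non-NA row for grp == 0) but shares its id_global only with row 9, which belongs to grp == 1. That would explain why val[1] is NA for that group.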