Imagine a data.table in R like this:
library(data.table)
dtable <- data.table(
  id      = c(1, 1, 1, 2, 2, 2),
  time    = c(1, 2, 3, 2, 3, 4),
  value_a = c(NA, "Yes", NA, "No", NA, "Yes"),
  value_b = c("No", "Yes", NA, NA, NA, NA)
)
cols <- c("value_a", "value_b")  # columns to fill
which evaluates to
   id time value_a value_b
1:  1    1    <NA>      No
2:  1    2     Yes     Yes
3:  1    3    <NA>    <NA>
4:  2    2      No    <NA>
5:  2    3    <NA>    <NA>
6:  2    4     Yes    <NA>
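Within each id, ordered by time, I want to carry the last observed value forward into the rows that follow it (<NA> means no observation). Rows that come before the first observation for an id should stay <NA>. In other words, I am looking for an efficient way to produce the following table: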
   id time value_a value_b
1:  1    1    <NA>      No
2:  1    2     Yes     Yes
3:  1    3     Yes     Yes
4:  2    2      No    <NA>
5:  2    3      No    <NA>
6:  2    4     Yes    <NA>
My dataset is very large, so efficiency is important.
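
To make the target concrete, here is a minimal sketch of one way the fill could be done. It is an illustration, not necessarily the fastest option. The helper name `locf` is hypothetical (it is not part of data.table), and the sketch assumes that a plain vectorised index trick is acceptable. data.table's own `nafill(type = "locf")` handles numeric columns; whether it accepts character columns depends on the data.table version, so the sketch avoids it.

```r
library(data.table)

dtable <- data.table(
  id      = c(1, 1, 1, 2, 2, 2),
  time    = c(1, 2, 3, 2, 3, 4),
  value_a = c(NA, "Yes", NA, "No", NA, "Yes"),
  value_b = c("No", "Yes", NA, NA, NA, NA)
)
cols <- c("value_a", "value_b")

# Hypothetical helper: last observation carried forward for one vector.
locf <- function(x) {
  # Position of the most recent non-NA element at each row (0 if none yet).
  idx <- cummax(seq_along(x) * !is.na(x))
  # Turn 0 into NA so that gaps before the first observation stay NA.
  x[replace(idx, idx == 0L, NA_integer_)]
}

# Sort by id and time so "latest" is well defined, then fill by reference.
setorder(dtable, id, time)
dtable[, (cols) := lapply(.SD, locf), by = id, .SDcols = cols]

print(dtable)
```

The update uses `:=`, so the columns are modified in place rather than copying the table, which matters for large data. The helper itself only uses vectorised base R calls (`cummax`, indexing), so the per-group cost is linear in the group size.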