I am trying to implement the data.table
equivalent of a "drag value" in Excel. This is an example of the original data and desired_output
:
d <- data.table(input = c('bla',NA,NA,'ble',NA,NA,NA,NA,'ble',NA),
group = c(1,1,1,1,1,2,2,2,2,2),
desired_output = c('bla','bla','bla','ble','ble',NA,NA,NA,'ble','ble')
)
#> 1: bla 1 bla
#> 2: <NA> 1 bla
#> 3: <NA> 1 bla
#> 4: ble 1 ble
#> 5: <NA> 1 ble
#> 6: <NA> 2 <NA>
#> 7: <NA> 2 <NA>
#> 8: <NA> 2 <NA>
#> 9: ble 2 ble
#>10: <NA> 2 ble
This is, If a cell is filled out and the cell below is missing, replace it with the value above. Then repeat for subsequent rows. This should be done within gruops.
OBS: the by gruop makes this question different from the replacing NAs with latest nonNA value. One commented suggested the zoo::na.locf but it does not seem to wrok with the by clause:
#> d[, desired_output := zoo::na.locf(input), by = .(group)]
Error in `[.data.table`(d, , `:=`(desired_output, zoo::na.locf(input)), :
Supplied 2 items to be assigned to group 2 of size 5 in column 'desired_output'. The RHS length must either be 1 (single values are ok) or match the LHS length exactly. If you wish to 'recycle' the RHS please use rep() explicitly to make this intent clear to readers of your code.