drag value to leading NA cells (rolling NA replacement) by group in data.table

Question

I am trying to implement the data.table equivalent of a "drag value" in Excel. This is an example of the original data and desired_output:

d <- data.table(input          = c('bla',NA,NA,'ble',NA,NA,NA,NA,'ble',NA),
                group          = c(1,1,1,1,1,2,2,2,2,2),
                desired_output = c('bla','bla','bla','ble','ble',NA,NA,NA,'ble','ble')
                )
#> 1:   bla     1            bla
#> 2:  <NA>     1            bla
#> 3:  <NA>     1            bla
#> 4:   ble     1            ble
#> 5:  <NA>     1            ble
#> 6:  <NA>     2           <NA>
#> 7:  <NA>     2           <NA>
#> 8:  <NA>     2           <NA>
#> 9:   ble     2            ble
#>10:  <NA>     2            ble

This is, If a cell is filled out and the cell below is missing, replace it with the value above. Then repeat for subsequent rows. This should be done within gruops.

OBS: the by gruop makes this question different from the replacing NAs with latest nonNA value. One commented suggested the zoo::na.locf but it does not seem to wrok with the by clause:

#> d[, desired_output := zoo::na.locf(input), by = .(group)]
Error in `[.data.table`(d, , `:=`(desired_output, zoo::na.locf(input)),  : 
  Supplied 2 items to be assigned to group 2 of size 5 in column 'desired_output'. The RHS length must either be 1 (single values are ok) or match the LHS length exactly. If you wish to 'recycle' the RHS please use rep() explicitly to make this intent clear to readers of your code.

`d[, desired_output := zoo::na.locf(input), by = .(group)]`. — Gregor Thomas, Feb 01 '21 at 21:59
@GregorThomas, thank you. This fails for me with the by option (works without it. The error: `Error in `[.data.table`(d, , `:=`(desired_output, zoo::na.locf(input)), : Supplied 2 items to be assigned to group 2 of size 5 in column 'desired_output'. The RHS length must either be 1 (single values are ok) or match the LHS length exactly. If you wish to 'recycle' the RHS please use rep() explicitly to make this intent clear to readers of your code.` — LucasMation, Feb 01 '21 at 22:07
just add `na.rm=FALSE`, `d[, do := zoo::na.locf(input, na.rm=FALSE), by = .(group)]` — chinsoon12, Feb 01 '21 at 23:34

drag value to leading NA cells (rolling NA replacement) by group in data.table

0 Answers0