I am wondering how to efficiently fill NA values by group in a single R data.table in the backward direction, i.e. the opposite of LOCF: each NA is replaced by the next known value within its group (next observation carried backward, NOCB).

There is code for the forward direction in the question "efficiently locf by groups in a single R data.table", but I am looking for the opposite direction. Any idea how to adjust that code?

  • For numeric values you could use `data.table::nafill(..., type = 'nocb')` – Waldi Oct 07 '22 at 06:16
  • See `help("nafill")` and combine that with data.table's `by`. You can be more efficient if your data fulfills additional conditions. E.g., I'm working with data right now, where the last value of each group is guaranteed to be non-NA. That means I can just sort the data.table by group-ID and time and then simply use `setnafill`. – Roland Oct 07 '22 at 06:17
  • Thanks, that works, but it is pretty slow for large data compared to the code I referred to. However, I found a workaround: I sorted the data in reverse order, applied the code from the link, and sorted the data back again. It did the trick, but I am wondering whether there is a more efficient way without the two extra sorting steps. – Honza Šrámek Oct 08 '22 at 18:36
  • By the code I mean: `id_change = DT[, c(TRUE, id[-1] != id[-.N])]` followed by `DT[, lapply(.SD, function(x) x[cummax(((!is.na(x)) | id_change) * .I)])]` – Honza Šrámek Oct 08 '22 at 18:44
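The `nafill` suggestion from the comments can be sketched as follows for a numeric column. The toy table and the column names `x` and `x_filled` are made up for illustration; only `nafill()` with `type = "nocb"` and data.table's `by` come from the comments above.

```r
library(data.table)

# Hypothetical toy data: two groups, one numeric column with gaps.
DT <- data.table(
  id = c(1, 1, 1, 2, 2, 2),
  x  = c(NA, 2, NA, NA, NA, 5)
)

# Next observation carried backward, separately within each id.
# A trailing NA in a group stays NA because there is no later value to carry back.
DT[, x_filled := nafill(x, type = "nocb"), by = id]

# x_filled is now 2, 2, NA, 5, 5, 5
```

If the last value of every group is guaranteed to be non-NA, as in Roland's case, the grouping should be unnecessary: a single `setnafill(DT, type = "nocb", cols = "x")` on the table ordered by group and time would give the same result, since no fill can then cross a group boundary.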

1 Answer

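A bit of a workaround, but anyway: first reverse the row order, apply the forward-fill code to replace the NAs, then reverse the rows back. Note that sorting only by `desc(id)` is not enough, because a stable sort keeps the original order within each group and the fill would still run forward.

library(data.table)
# Reverse the rows; each group stays contiguous, but its later rows now come first.
DT <- DT[.N:1]
# TRUE at the first row of each group, so a fill never crosses a group boundary.
id_change <- DT[, c(TRUE, id[-1] != id[-.N])]
DT <- DT[, lapply(.SD, function(x) x[cummax(((!is.na(x)) | id_change) * .I)])]
# Reverse again to restore the original row order.
DT <- DT[.N:1]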
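To check the reversal idea on a small case, here is a minimal sketch that wraps the steps in a helper and runs it on a toy table. The helper name `nocb_by_group` and the toy data are hypothetical; the fill expression itself is the one quoted in the question. It assumes rows of the same `id` are contiguous and that `id` has no NAs. Because the fill works by indexing, it should also handle non-numeric columns such as the character column `y`.

```r
library(data.table)

# Hypothetical toy data with a numeric and a character column.
DT <- data.table(
  id = c("a", "a", "a", "b", "b", "b"),
  x  = c(NA, 2, NA, NA, NA, 5),
  y  = c("p", NA, "q", NA, "r", NA)
)

# Hypothetical helper: backward fill within contiguous id groups.
nocb_by_group <- function(DT) {
  rev_DT <- DT[.N:1]  # reverse row order
  # TRUE at the first row of each group in the reversed table
  id_change <- rev_DT[, c(TRUE, id[-1] != id[-.N])]
  # Forward fill on the reversed table: each row takes the value at the
  # most recent index that is either non-NA or the start of a group.
  filled <- rev_DT[, lapply(.SD, function(x) {
    x[cummax(((!is.na(x)) | id_change) * .I)]
  })]
  filled[.N:1]  # restore the original row order
}

res <- nocb_by_group(DT)
# res$x is 2, 2, NA, 5, 5, 5
# res$y is "p", "q", "q", "r", "r", NA
```

Reversing with `.N:1` is plain row indexing rather than a sort, so it avoids the two extra sorting steps mentioned in the comments, though it still makes two copies of the table.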