I have the following data.table:

data.table(group = c('A', 'B', 'B', 'A', 'B', 'B', 'A', 'A'), value = c(NA, 2, NA, 6, NA, 2, 6, NA))

   group value
1:     A    NA
2:     B     2
3:     B    NA
4:     A     6
5:     B    NA
6:     B     2
7:     A     6
8:     A    NA

I want to fill each NA with the non-NA value of its group. The expected output is:

   group value
1:     A     6
2:     B     2
3:     B     2
4:     A     6
5:     B     2
6:     B     2
7:     A     6
8:     A     6

Any suggestions?

daniellga
  • 1
    What if a group has more than one non-NA value and they are not the same? – s_baldur Mar 08 '21 at 14:33
  • 1
    I suspect this is a duplicate of https://stackoverflow.com/q/7735647/3358272 and https://stackoverflow.com/q/31491976/3358272. I'd think `dat[, value:=zoo::na.locf(value,na.rm=FALSE), by=.(group)][, value:=zoo::na.locf(value,na.rm=FALSE,fromLast=TRUE), by=.(group)]` might suffice. – r2evans Mar 08 '21 at 14:36
  • 1
    If this is not an issue you could just use the first non-na: `DT[, value := first(value[!is.na(value)]), group]`. – s_baldur Mar 08 '21 at 14:37
  • 1
    It's also possible to nest the calls to `zoo::na.locf` into one `[` expression. – r2evans Mar 08 '21 at 14:37
  • 1
    @sindri_baldur thanks, that works. I was trying something similar to what r2evans suggested with data.table::nafill, but I thought there was an easier way to do it. – daniellga Mar 08 '21 at 14:48
  • 1
    I think the only "easier" way is to use a filling function that supports bidirectional filling in one step. `tidyr::fill(..., .direction="updown")` supports it for `tbl_df`, but it works on a whole frame and the fields within it, not on individual vectors. – r2evans Mar 08 '21 at 15:03
  • 1
    I have updated my solution using `nafill`. – Eric Mar 08 '21 at 15:07
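
The first-non-NA suggestion from the comments above can be sketched on the question's data as follows. This is only an illustration: the name `DT` is taken from the comment, and the approach assumes every group has a single distinct non-NA value.

```r
library(data.table)

# Same data as in the question; DT is the name used in the comment
DT <- data.table(group = c('A', 'B', 'B', 'A', 'B', 'B', 'A', 'A'),
                 value = c(NA, 2, NA, 6, NA, 2, 6, NA))

# Per group, take the first non-NA value and recycle it over every row
# of that group (this assumes the non-NA values within a group agree)
DT[, value := first(value[!is.na(value)]), by = group]

print(DT)
```

Because the single value is recycled across the whole group, any differing non-NA values in that group would be overwritten as well.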

1 Answer

Here is a solution using nafill from data.table, applied per group, with type = 'nocb' followed by type = 'locf' to carry the values backward and then forward.

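library(data.table)

df <- data.table(group = c('A', 'B', 'B', 'A', 'B', 'B', 'A', 'A'), value = c(NA, 2, NA, 6, NA, 2, 6, NA))

# Within each group, fill NAs backward (nocb) first, then forward (locf)
# to cover any NAs left at the end of the group
df[ , value := nafill(nafill(value, type = 'nocb'), type = 'locf'), by = group]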


Output:

group value
A     6
B     2
B     2
A     6
B     2
B     2
A     6
A     6

Original table:

group value
A     NA            
B     2         
B     NA            
A     6         
B     NA            
B     2         
A     6         
A     NA

Created on 2021-03-08 by the reprex package (v0.3.0)
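
As a side note on the question raised in the comments (a group with more than one distinct non-NA value), here is a small sketch of how the two approaches differ in that case. The data and the names `dt`, `filled`, and `overwritten` are hypothetical and chosen only for illustration.

```r
library(data.table)

# Hypothetical data: group A holds two different non-NA values (6 and 9)
dt <- data.table(group = c('A', 'A', 'A', 'A'), value = c(NA, 6, NA, 9))

# nocb then locf: each NA takes the next non-NA value in its group;
# existing non-NA values are left untouched
filled <- copy(dt)[, value := nafill(nafill(value, type = 'nocb'), type = 'locf'), by = group]

# First non-NA: one value is recycled over the whole group,
# so the existing 9 is replaced too
overwritten <- copy(dt)[, value := value[!is.na(value)][1], by = group]

print(filled$value)       # 6 6 9 9
print(overwritten$value)  # 6 6 6 6
```

Which behaviour is preferable depends on whether the non-NA values within a group are guaranteed to agree, as they do in the question's data.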

Eric