-1

replace NA value with the group value is not data.table-specific, although answer is similar, the answer below (to my question) is more elegant using data.table. HOW CAN I GET RID OF THE DUPLICATED TAG? THANK YOU

THIS IS DEFINITELY NO DUPLICATE OF Replacing NAs with latest non-NA value

I have a data.table like

library(data.table)
test <- data.table(id=c(1,1,1,2,2,2,3,3,3), A=c("Val1","Val1", NA, "Val2", NA, NA, NA, NA, "Val3"), B=c(1,NA,NA, 2,2,NA,NA,NA,3))
test   
    id    A  B
1:  1 Val1  1
2:  1 Val1 NA
3:  1   NA NA
4:  2 Val2  2
5:  2   NA  2
6:  2   NA NA
7:  3   NA NA
8:  3   NA NA
9:  3 Val3  3

with groups defined by "id".

I would like to populate the NA-Values with the values in their group. Do you see a solution?

The result should look like this (regarding character and numeric features):

   id    A B
1:  1 Val1 1
2:  1 Val1 1
3:  1 Val1 1
4:  2 Val2 2
5:  2 Val2 2
6:  2 Val2 2
7:  3 Val3 3
8:  3 Val3 3
9:  3 Val3 3

Thank you so much.

feinmann
  • 1,060
  • 1
  • 14
  • 20
  • 2
    Assuming you have the same non-NA value for every group. `ave(test$A, test$id, FUN = function(x) x[!is.na(x)])` and `ave(test$B, test$id, FUN = function(x) x[!is.na(x)])` – Ronak Shah Mar 22 '17 at 09:47

1 Answers1

2

Assuming that there are only unique elements per each 'id' for 'A' and 'B' columns, after grouping by 'id', loop through the Subset of Data.table (.SD) and get the first non-NA element and assign (:=) it to the columns

test[, (2:3) := lapply(.SD, function(x) x[!is.na(x)][1]) , id]

Or do a join on the dataset after removing the NA rows

test[na.omit(test), names(test)[-1] :=  mget(paste0("i.", names(test)[-1])) , on = .(id)]
akrun
  • 874,273
  • 37
  • 540
  • 662