Fill NA with the group mean for each unique value of another column in R

Question

I want to fill the NA values in a data.table with the mean of the same column, computed within each unique value of another column (here, Plan). The intended output is shown below. How can I achieve this in R? I would prefer a data.table result.

library(data.table)
data2 <- data.table(Plan=c(11,11,11,11,91,91,91,91), Price=c(4.4,4.4,4.4,NA,3.22,3.22,3.22,NA), factor=c(0.17,0.17,0.17,NA,0.15,0.15,0.15,NA), Type=c(4,4,4,4,3,3,3,3))

data2
   Plan Price factor Type
1:   11  4.40   0.17    4
2:   11  4.40   0.17    4
3:   11  4.40   0.17    4
4:   11    NA     NA    4
5:   91  3.22   0.15    3
6:   91  3.22   0.15    3
7:   91  3.22   0.15    3
8:   91    NA     NA    3



Intended output
   Plan Price factor Type
1:   11  4.40   0.17    4
2:   11  4.40   0.17    4
3:   11  4.40   0.17    4
4:   11  4.40   0.17    4
5:   91  3.22   0.15    3
6:   91  3.22   0.15    3
7:   91  3.22   0.15    3
8:   91  3.22   0.15    3

https://stackoverflow.com/questions/34124291/data-table-replace-na-with-mean-for-multiple-columns-and-by-id — M--, Dec 11 '19 at 16:21
I'm not sure how much this matters, but in your example each group has all the same values for each column, so the fact that you want a mean won't actually be evident — camille, Dec 11 '19 at 16:32
[Possible duplicate](https://stackoverflow.com/a/58593144/2204410) — Jaap, Dec 11 '19 at 17:32

score 0 · Accepted Answer · answered Dec 11 '19 at 16:10

We can use na.locf from zoo, grouped by 'Plan', to replace each NA with the preceding non-NA value in its group:

library(zoo)
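# na.rm = FALSE keeps a leading NA in place, so the result always has the group's length
data2[, factor := na.locf(factor, na.rm = FALSE), by = Plan]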

If we need the group mean instead, use na.aggregate, which by default replaces each NA with the mean of the non-NA values:

data2[, factor := na.aggregate(factor), by = Plan]
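In the question's data every group holds a single repeated value, so the mean is indistinguishable from the carried-forward value. A minimal sketch on made-up data (the table `dt` and its values are hypothetical, not from the question) makes the difference visible:

```r
library(data.table)
library(zoo)

# Hypothetical data: prices vary within each Plan, so the mean is evident
dt <- data.table(Plan  = c(11, 11, 11, 91, 91, 91),
                 Price = c(4, 6, NA, 1, 3, NA))

# Replace each NA with the mean of the non-NA values in the same Plan
dt[, Price := na.aggregate(Price), by = Plan]
print(dt)
```

The NA in Plan 11 should become 5 (the mean of 4 and 6) and the NA in Plan 91 should become 2 (the mean of 1 and 3), whereas na.locf would have filled them with 6 and 3.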

For multiple columns, name them in .SDcols and apply the function to each one with lapply:

nm1 <- c("Price", "factor")
data2[, (nm1) := lapply(.SD, na.aggregate), by = Plan, .SDcols = nm1]
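If depending on zoo is undesirable, the same group-wise mean fill can probably be done with data.table alone. The sketch below rebuilds the question's table and uses a small helper, `fill_mean`, which is a hypothetical name introduced here and not part of data.table or zoo:

```r
library(data.table)

data2 <- data.table(Plan   = c(11, 11, 11, 11, 91, 91, 91, 91),
                    Price  = c(4.4, 4.4, 4.4, NA, 3.22, 3.22, 3.22, NA),
                    factor = c(0.17, 0.17, 0.17, NA, 0.15, 0.15, 0.15, NA),
                    Type   = c(4, 4, 4, 4, 3, 3, 3, 3))

# Hypothetical helper: replace the NAs in x with the mean of its non-NA values
fill_mean <- function(x) {
  x[is.na(x)] <- mean(x, na.rm = TRUE)
  x
}

nm1 <- c("Price", "factor")
# Apply the helper to each listed column, separately within each Plan
data2[, (nm1) := lapply(.SD, fill_mean), by = Plan, .SDcols = nm1]
print(data2)
```

Note that if a group consists only of NA values, mean(x, na.rm = TRUE) returns NaN, so such groups would need separate handling.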
