Replacing NAs based on previous values & special rules

Question

I have a bunch of 10x2 tables that have missing values sandwiched between dates with existing values. I'm looking for the best way to infer the missing data from previous information. Example:

x1 <- c(1:10)
x2 <- c(NA, 'a', 'a', NA, 'a', 'b', 'b', NA, NA, 'c')
DF <- data.frame(x1,x2)
DF

   x1   x2
1   1 <NA>
2   2    a
3   3    a
4   4 <NA>
5   5    a
6   6    b
7   7    b
8   8 <NA>
9   9 <NA>
10 10    c

I want to fill in the missing values with the following algorithm:

Find the last instance of NA.
Work backwards from it and replace that NA with the first non-NA value found. Then move to the second-to-last NA, and so on.
If there is no previous non-NA value (as is the case with row 1), work forward instead to find the first non-NA value.

So final vector would be

a, a, a, a, a, b, b, b, b, c

I know I can get the list of NAs I want to replace with

Missing = rev(which(is.na(x2)))

and then use a for-loop from there. But I'll admit that I'm not that great of a programmer, and it would take me a long time to figure out (I'd probably have to brute-force it). Is there a package that can easily sort this out, or a reference manual for these sorts of data clean-up issues?

Possibly related to this post? https://stackoverflow.com/questions/7735647/replacing-nas-with-latest-non-na-value — Z.Lin, Aug 26 '17 at 07:51
yup, seems to be a duplicate. personally I find this here to be the easiest to read solution for that problem: https://rdrr.io/cran/tidyr/man/fill.html from tidyr package — Jan, Aug 26 '17 at 07:54
Sorry about that, I really did try to look up previous entries, but the only one I could find had a vote of -9. I'll poke around these links, thanks. — CoolGuyHasChillDay, Aug 26 '17 at 08:16

score 0 · Answer 1 · edited Nov 19 '21 at 06:08

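library(dplyr)  # loaded for the %>% pipe
library(tidyr)  # provides fill()
df <- data.frame(x1 = 1:10, x2 = c(NA, 'a', 'a', NA, 'a', 'b', 'b', NA, NA, 'c'))
# fill() carries the previous value down by default, which leaves the leading NA in row 1;
# a second pass with .direction = "up" fills it from the next non-NA value
df1 <- df %>% fill(x2) %>% fill(x2, .direction = "up")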
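
For comparison, the rule described in the question (take the previous non-NA value, and fall back to the next one when there is no previous value) can also be sketched in base R without any packages. This is only an illustrative sketch: `fill_na` is a hypothetical helper name, not a function from tidyr or any other library, and it assumes a plain vector as input.

```r
# Hypothetical helper (not from any package): replace each NA with the
# nearest previous non-NA value; leading NAs take the first non-NA value.
fill_na <- function(x) {
  known <- which(!is.na(x))           # positions holding real values
  if (length(known) == 0) return(x)   # nothing to copy from
  # For each position, count the non-NA values seen at or before it
  idx <- cumsum(!is.na(x))
  # Leading NAs have a count of 0; point them at the first non-NA value
  idx[idx == 0] <- 1
  x[known[idx]]
}

x2 <- c(NA, "a", "a", NA, "a", "b", "b", NA, NA, "c")
filled <- fill_na(x2)
print(filled)
```

Because the lookup is vectorised with `cumsum`, no explicit for-loop over the NA positions is needed.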

answered Aug 26 '17 at 07:57 by Prem · edited Nov 19 '21 at 06:08 by Nimantha
