I have a bunch of 10x2 tables that have missing values sandwiched between dates with existing values. I'm looking for the best way to infer the missing data from previous information. Example:
x1 <- c(1:10)
x2 <- c(NA, 'a', 'a', NA, 'a', 'b', 'b', NA, NA, 'c')
DF <- data.frame(x1,x2)
DF
x1 x2
1 <NA>
2 a
3 a
4 <NA>
5 a
6 b
7 b
8 <NA>
9 <NA>
10 c
I want to fill in the missing values with the following algorithm:
- Find the last instance of NA.
- Work backwards to replace that NA with the nearest preceding non-NA value, then move to the second-to-last NA, and so on.
- If there is no previous non-NA value (as is the case with row 1), work forward to find the first non-NA value.
So the final vector would be
a, a, a, a, a, b, b, b, b, c
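For reference, here is a minimal base-R sketch that should give the same result as the steps above. It is vectorised rather than the loop described, and `fill_na` is a hypothetical helper name, not a function from any package. It assumes every NA should take the nearest earlier non-NA value, and that leading NAs take the first non-NA value that follows them.

```r
# Hypothetical helper (not from any package): fill each NA with the nearest
# earlier non-NA value; leading NAs take the first non-NA value after them.
fill_na <- function(x) {
  known <- which(!is.na(x))
  if (length(known) == 0) return(x)  # nothing to copy from, return unchanged

  # For each position, the index of the most recent non-NA at or before it
  # (0 when no non-NA has been seen yet).
  idx <- cummax(ifelse(is.na(x), 0L, seq_along(x)))

  # Leading NAs have no earlier value, so point them at the first non-NA.
  idx[idx == 0L] <- known[1]

  x[idx]
}

x2 <- c(NA, 'a', 'a', NA, 'a', 'b', 'b', NA, NA, 'c')
filled <- fill_na(x2)
print(filled)
```

On the example this should reproduce the vector shown above. To apply it to the table, something like `DF$x2 <- fill_na(DF$x2)` ought to work, since indexing by position behaves the same way for character and factor columns.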
I know I can get the list of NAs I want to replace with
Missing = rev(which(is.na(x2)))
and then use a for-loop from there. But I'll admit that I'm not a great programmer, and it would take me a long time to figure out (I'd probably have to brute-force it). Is there a package that can easily sort this out, or a reference manual for these sorts of data clean-up issues?