Function / Loop to Replace NA with values in adjacent columns in R

Question

I have a time series dataset with 1000 columns. Each row is, of course, a different record. Some NA values are scattered throughout the dataset.

I would like to replace each NA with the adjacent value on either its left or its right; it doesn't matter which.

A neat solution, and the one I was going for, is to replace each NA with the value to its right, unless it is in the last column, in which case replace it with the value to its left.

I was just going to do a for loop, but I assume a function would be more efficient. Essentially, I wasn't sure how to reference the adjacent values.

Here is what I was trying:

for (entry in dataset) {
  if (any(is.na(entry)) == TRUE && entry[,1:999]) {
    entry = entry[,1]
  }
  else if (any(is.na(entry)) == TRUE && entry[,1000]) {
    entry = cell[,-1]
  }
}

As you can tell, I'm inexperienced with R :) Not really sure how you index the values to the left or right.

A small example will go a long way. – Pierre L, Jul 19 '16 at 16:11

agenis · Accepted Answer · 2016-07-19T16:47:50.330

I would suggest using na.locf on the transpose of your dataset.

The na.locf function of the zoo package replaces each NA with the nearest non-NA value before it (or after it, with fromLast = TRUE). It works down each column, whereas you want to fill along each row, so we can just transpose the dataset first:

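library(zoo)
df <- matrix(c(1, 3, 4, 10, NA, 52, NA, 11, 100), ncol = 3)
# Pass 1: fill each NA from the value to its right
step1 <- t(na.locf(t(df), fromLast = TRUE))
# Pass 2: fill what is still NA (the last column) from the value to its left
step2 <- t(na.locf(t(step1), fromLast = FALSE))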
print(df)
#### [1,]    1   10   NA
#### [2,]    3   NA   11
#### [3,]    4   52  100
print(step2)
#### [1,]    1   10   10
#### [2,]    3   11   11
#### [3,]    4   52  100
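
To see what each pass does in isolation, here is a minimal sketch on a plain vector (the vector `x` is just made-up illustration data). I pass na.rm = FALSE so that an NA with nothing to copy from is kept rather than dropped, which makes the direction of each fill easier to see:

```r
library(zoo)

x <- c(NA, 1, NA, 3, NA)

# Forward fill: each NA takes the last non-NA value before it.
# The leading NA has nothing before it, so it stays NA.
fwd <- na.locf(x, na.rm = FALSE)
print(fwd)
# NA  1  1  3  3

# Backward fill: each NA takes the next non-NA value after it.
# The trailing NA has nothing after it, so it stays NA.
bwd <- na.locf(x, fromLast = TRUE, na.rm = FALSE)
print(bwd)
#  1  1  3  3 NA
```

With the default na.rm = TRUE those leftover leading or trailing NAs are removed from a vector, so the result can be shorter than the input.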

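I do it in two steps because interior columns and the last column are treated differently: the first pass fills from the right, and the second fills whatever is still NA (the last column) from the left. With the %>% pipe, which dplyr re-exports from magrittr, it is straightforward to turn this into a function: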

library(dplyr)  # only needed here for the %>% pipe
MyReplace <- function(data) {
  data %>% t %>% na.locf(fromLast = TRUE) %>% na.locf %>% t
}
MyReplace(df)
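
Since the question also asked how to index the values to the left or right, here is a rough base R sketch of the same two passes without any packages. The name `fill_from_neighbours` is made up for this example; the point is only that column j's neighbours are columns j + 1 and j - 1:

```r
# Hypothetical helper, not part of zoo or dplyr.
fill_from_neighbours <- function(m) {
  m <- as.matrix(m)
  n <- ncol(m)
  if (n < 2) return(m)  # no neighbours to copy from

  # Pass 1: walk right to left, copying from the column on the right
  for (j in (n - 1):1) {
    gap <- is.na(m[, j])
    m[gap, j] <- m[gap, j + 1]
  }

  # Pass 2: walk left to right, copying from the column on the left;
  # this catches NAs in the last column (and anything pass 1 could not fill)
  for (j in 2:n) {
    gap <- is.na(m[, j])
    m[gap, j] <- m[gap, j - 1]
  }
  m
}

df <- matrix(c(1, 3, 4, 10, NA, 52, NA, 11, 100), ncol = 3)
res <- fill_from_neighbours(df)
print(res)
```

On the small example above this gives the same matrix as step2. A row that is entirely NA stays NA, since there is nothing to copy.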
