Turn NA into values in a for loop

Question

I used the following code to replace each NA in a column (COD_ATC5 in my data, x in the example below) with the value from the previous row. It works, but it takes too long because the dataset is large.

for (i in 2:nrow(data_final)) {
  data_final$COD_ATC5[i] <- ifelse(is.na(data_final$COD_ATC5[i]), data_final$COD_ATC5[i-1], data_final$COD_ATC5[i])
}

Is there a faster way to do this?

Here is a reproducible example of the dataset:

data_final <- data.frame(ID=c(rep("01",12),rep("02",12)), t = rep(1:12,2), x= c(rep("A",4),NA,rep("A",3),rep("C",4),rep("A",5),rep("C",3),NA,"C",rep("A",2)))

use `tidyr::fill` alternatively – AnilGoyal May 21 '21 at 07:58
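The suggestion in the comment could look roughly like the sketch below. It assumes the tidyr package is installed and uses the example data from the question; `filled` is just an illustrative name. `fill()` carries the last non-missing value downward, so it also covers runs of consecutive NAs.

```r
library(tidyr)

# Example data from the question
data_final <- data.frame(ID = c(rep("01", 12), rep("02", 12)),
                         t = rep(1:12, 2),
                         x = c(rep("A", 4), NA, rep("A", 3), rep("C", 4),
                               rep("A", 5), rep("C", 3), NA, "C", rep("A", 2)))

# Fill each NA in column x with the closest non-NA value above it
filled <- fill(data_final, x, .direction = "down")

# Check that no NA is left in the column
anyNA(filled$x)
```

Like the original loop, this fills across ID boundaries. If values should not carry over from one ID to the next, grouping the data frame with `dplyr::group_by(ID)` before calling `fill()` is one option, since `fill()` respects grouped data frames.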

score 0 · Answer 1 · answered May 21 '21 at 08:47

We can first find the positions idx of the NA values using which and then overwrite only those positions with the values at idx-1. The function ByWhich shows how it works.

# Sample data
data_final <- data.frame(ID=c(rep("01",12),rep("02",12)), t = rep(1:12,2), x= c(rep("A",3), "B", NA, rep("A",3),rep("C",4),rep("A",5),rep("C",3),NA,"C",rep("A",2)))

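# New solution
# Note: this assumes every NA is preceded by a non-NA value, i.e. x[1] is
# not NA and there are no runs of consecutive NAs.
ByWhich <- function(x) {
  idx <- which(is.na(x))   # positions of the missing values
  x[idx] <- x[idx-1]       # copy the preceding value into each position
  return(x)
}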

# Solution by asker
ByLoop <- function(x) {
  for (i in 2:length(x)) {
    x[i] <- ifelse(is.na(x[i]), x[i-1], x[i])
  }
  return(x)
}

# Test if the functions provide equal solutions
all(ByLoop(data_final$x) == ByWhich(data_final$x))
#> [1] TRUE

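The benchmark shows that the solution using which is roughly 15 times faster at the median (about 2.4 versus 37 microseconds). The mean for ByWhich is inflated by a single slow run (max of about 2125 microseconds), so the median is the better comparison here.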

library(microbenchmark)
microbenchmark::microbenchmark(
  ByWhich = ByWhich(data_final$x),
  ByLoop  = ByLoop(data_final$x)
)
#> Unit: microseconds
#>     expr    min      lq     mean  median      uq      max neval
#>  ByWhich  2.001  2.1010 23.60294  2.4010  2.5010 2124.802   100
#>   ByLoop 35.400 36.2515 37.16908 37.0005 37.5015   42.301   100

This solution does not require an extra package. However, the zoo or tidyverse solutions provided in the comments are probably even faster.
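One caveat with ByWhich is that it copies from the position directly before each NA, so a run of consecutive NAs is only partly filled. A base R sketch that also handles that case is shown below. The name `fill_down` is hypothetical and not part of the original answer; it uses `cummax` to track the index of the last non-NA value seen so far.

```r
# Hypothetical helper: carry the last non-NA value forward, base R only
fill_down <- function(x) {
  pos <- seq_along(x)
  pos[is.na(x)] <- 0L      # NA positions contribute no index of their own
  idx <- cummax(pos)       # index of the most recent non-NA value so far
  idx[idx == 0L] <- NA     # leading NAs have nothing to copy from
  x[idx]
}

x <- c(NA, "A", NA, NA, "B", NA)
fill_down(x)
```

For this input the result is NA, "A", "A", "A", "B", "B": the leading NA stays missing because there is no earlier value, and the run of two NAs is filled completely.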

Created on 2021-05-21 by the reprex package (v2.0.0)
