For those that don't want a zoo package dependency, here's a simple version using run level encoding from base. The idea is simple, we just use the rle()
and replace the NA
s in the index with whatever is to the left (i.e. preceding value), and we skip any first NA
s because there is no value to the left. Then we use inverse.rle()
to get back to a full length vector. For reverse operation (, we just reverse the vector before and after. I haven't done benchmarking, but since all operations are vectorized, it should be quite fast.
rle()
doesn't group NA
s for some reason. Docs state "Missing values are regarded as unequal to the previous value, even if that is also missing.". Hence why I recode the NA
s to a temporary string value, and have to convert the vector back to the right class. Not perfect, but works for most cases.
#' Last observation carried forward
#'
#' @param x A vector
#' @param reverse Whether to do it in reverse
#'
#' @return A vector
#' @export
#'
#' @examples
#' c(NA, 1, NA, 2, NA) %>% locf()
#' c(NA, 1, NA, 2, NA) %>% locf(reverse = T)
locf = function(x, reverse = F) {
#reverse?
if (reverse) x = rev(x)
#recode NA
#these are kept distinct by rle() by default for same reason ???
x_class = class(x)
x[is.na(x)] = "___tmp"
#run level encoding
x_rle = rle(x)
#swap values for NAs
which_na = which(x_rle$values == "___tmp")
#skip 1st
which_na = setdiff(which_na, 1)
#replace values
x_rle$values[which_na] = x_rle$values[which_na - 1]
#back to normal
y = inverse.rle(x_rle)
#NA recode
y[y == "___tmp"] = NA
#fix type/class
if (x_class[1] == "logical") y = as.logical(y)
if (x_class[1] == "integer") y = as.integer(y)
if (x_class[1] == "numeric") y = as.double(y)
if (x_class[1] == "factor") y = factor(y, levels = levels(x))
if (x_class[1] == "ordered") y = ordered(y, levels = levels(x))
#reverse?
if (reverse) y = rev(y)
y
}
Tests:
> c(NA, 1, NA, 2, NA) %>% rle()
Run Length Encoding
lengths: int [1:5] 1 1 1 1 1
values : num [1:5] NA 1 NA 2 NA
> c(NA, 1, NA, 2, NA) %>% rle() %>% str()
List of 2
$ lengths: int [1:5] 1 1 1 1 1
$ values : num [1:5] NA 1 NA 2 NA
- attr(*, "class")= chr "rle"
> #swap the values for one to left
> #reverse rle
> c(NA, 1, NA, 2, NA) %>% locf()
[1] NA 1 1 2 2
> c(NA, 1, NA, 2, NA) %>% locf(reverse = T)
[1] 1 1 2 2 NA
> c(NA, 1, NA, 2, NA, NA, NA) %>% locf()
[1] NA 1 1 2 2 2 2