
Is there a good way to carry the last observation of a row both forward and backward n times? An example vector to demonstrate:

Before change:

vector <- c(NA, NA, NA, NA, NA, 1, NA, NA, NA, NA, 2, NA, NA, NA, NA, NA, NA, 3, NA, NA, NA, NA)

After change, for n=2:

vector <- c(NA, NA, NA, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, NA, NA, 3, 3, 3, 3, 3, NA, NA)

dplyr::fill() doesn't seem to have a way to specify the number of filled rows, and zoo::na.locf() has a locb option, but only if you do not specify the number of rows you would like filled.

If there is a way to do this such that the locb and locf could be set to two different values, e.g. 1 and 3, that would be perfect for me. But if there's no easy way to do that, then a single locb and locf of a specified number of rows would do. Thanks for any help! I usually work in dplyr but will accept any sort of solution, as this problem is really stumping me.

LMc

3 Answers


We can define a function. First, get the indices of the non-NA elements. Second, expand each index by ±n, creating new_indices. Finally, reassign (<<-) the corresponding values in a loop. Note that this assumes every expanded range stays inside the vector.

my_func <- function(vector, n){
    index <- which(!is.na(vector))
    new_indices <-lapply(index, (\(x) seq(from = x-n, to = x+n, by = 1)))
    mapply(\(x,y) `<<-`(vector[y], x), vector[index], new_indices)
    vector
    }

my_func(vector, 2)
# [1] NA NA NA  1  1  1  1  1  2  2  2  2  2 NA NA  3  3  3  3  3 NA NA
GuedesBF

There may be more elegant ways, but I wrote a user-defined function that should do the trick:

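myfun <- function(vec, n){
   # Vectorized seq(): one sequence per (from, to) pair
   seqV <- Vectorize(seq.default, vectorize.args = c("to", "from"))
   # Positions of the observed (non-NA) values; use the argument, not a global
   x <- which(!is.na(vec))
   # All positions within n of an observed value
   ix <- as.vector(seqV(x - n, x + n))
   # Repeat each observed value across its 2n + 1 positions
   vec[ix] <- rep(vec[x], each = 1 + (n * 2))
   vec
}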

myfun(vector, n = 2)

#  [1] NA NA NA  1  1  1  1  1  2  2  2  2  2 NA NA  3  3  3  3  3 NA NA
jpsmith

I think a simple for loop would help you accomplish this cleanly:

roll <- function(x, n) {
  idx <- which(!is.na(x))
  for (i in idx) x[pmax(i - n, 1):pmin(i + n, length(x))] <- x[i]
  return(x)
}

roll(vector, 2)
# [1] NA NA NA  1  1  1  1  1  2  2  2  2  2 NA NA  3  3  3  3  3 NA NA

The purpose of pmin and pmax is to preserve the length of your vector. For example, if you had a value in the last element and n = 2, you would not want to add two additional elements to your vector (see the first column, last row of the dplyr example below).
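To illustrate why the bounds matter, here is a small sketch. The vector `edge` is a made-up example (not from the question), and `roll()` is repeated with its lower bound clamped at 1 so the block runs on its own. It relies on the fact that R silently extends a vector when you assign past its end.

```r
# Copy of roll() so this block is self-contained
roll <- function(x, n) {
  idx <- which(!is.na(x))
  for (i in idx) x[pmax(i - n, 1):pmin(i + n, length(x))] <- x[i]
  x
}

# Hypothetical vector with observations at both ends
edge <- c(1, NA, NA, NA, NA, NA, 5)

# Clamped: the ranges are cut off at positions 1 and 7
clamped <- roll(edge, 2)
length(clamped)
# [1] 7
clamped
# [1]  1  1  1 NA  5  5  5

# Unclamped: assigning to positions 5:9 grows the vector from 7 to 9 elements
unclamped <- edge
unclamped[5:9] <- 5
length(unclamped)
# [1] 9
```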

This function can then be easily applied within dplyr:

set.seed(123)
df <- replicate(5, sample(c(1:4, NA), 20, replace = TRUE, prob = c(rep(0.02, 4), .92))) |>
  data.frame()

library(dplyr)

df |>
  mutate(across(where(is.numeric), ~ roll(.x, 2)))
#    X1 X2 X3 X4 X5
# 1  NA NA NA NA NA
# 2  NA  1 NA NA NA
# 3   4  1 NA NA NA
# 4   4  1 NA NA NA
# 5   4  1 NA NA  1
# 6   4  1 NA NA  1
# 7   4 NA NA NA  1
# 8  NA NA NA NA  1
# 9   4  2 NA NA  1
# 10  4  2 NA NA NA
# 11  4  2 NA NA NA
# 12  4  2 NA NA NA
# 13  4  2 NA NA NA
# 14 NA NA NA NA NA
# 15 NA NA NA NA NA
# 16 NA NA NA NA NA
# 17 NA NA NA NA NA
# 18  4 NA NA NA NA
# 19  4 NA NA NA NA
# 20  4 NA NA NA NA

I think it is useful to note that later values take precedence. For example, if n is large enough that the ranges carried forward and backward overlap, later values will overwrite earlier ones:

roll(vector, 3)
# [1] NA NA  1  1  1  1  1  2  2  2  2  2  2  2  3  3  3  3  3  3  3 NA

If you carry too far then values will be overwritten before their "turn" (here 2 is overwritten by 1 before it has a chance to be carried):

roll(vector, 5)
# [1] 1 1 1 1 1 1 1 1 1 1 1 1 3 3 3 3 3 3 3 3 3 3

These behaviors can be modified, but are the default with this function, FYI.
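As one possible modification, here is a rough sketch that also covers the question's request for different backward and forward distances. The name `roll2` and its arguments `n_back` and `n_fwd` are hypothetical, not part of any package. It assumes you never want an original observation to be overwritten, so it reads from the untouched input and fills only positions that were NA; where fill ranges overlap, later values still win.

```r
# Hypothetical variant of roll(): separate backward and forward reach
roll2 <- function(x, n_back, n_fwd = n_back) {
  out <- x
  for (i in which(!is.na(x))) {
    # Range around the observation, clamped to the vector bounds
    rng <- max(i - n_back, 1):min(i + n_fwd, length(x))
    # Keep only positions that were NA in the original input
    rng <- rng[is.na(x[rng])]
    out[rng] <- x[i]
  }
  out
}

vector <- c(NA, NA, NA, NA, NA, 1, NA, NA, NA, NA, 2, NA, NA, NA, NA, NA, NA, 3, NA, NA, NA, NA)

# Carry 1 step backward and 3 steps forward
roll2(vector, n_back = 1, n_fwd = 3)
# [1] NA NA NA NA  1  1  1  1  1  2  2  2  2  2 NA NA  3  3  3  3  3 NA

# With n = 5 the 2 is no longer lost, because observations are never overwritten
roll2(vector, 5)
# [1] 1 1 1 1 1 1 2 2 2 2 2 2 3 3 3 3 3 3 3 3 3 3
```

Inside `mutate()`, this could be called the same way as `roll()`, for example `~ roll2(.x, 1, 3)`.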

LMc
  • I prefer the cleanliness of *apply and *map, they are all loops anyway – GuedesBF Aug 24 '23 at 20:05
  • But this looks like the clean winner here. +1 – GuedesBF Aug 24 '23 at 20:06
  • Pretty much the same logic I used, but a good example of how for loops can actually look cleaner than *apply calls. Reassignment inside loops is a good example. `<<-` may look fancy, but is quite ugly... – GuedesBF Aug 24 '23 at 20:07