
I'm having trouble with the na.spline() function in the zoo package. Although the documentation explicitly states that this is an interpolation function, the behaviour I'm getting includes extrapolation.

The following code reproduces the problem:

library(zoo)
vector <- c(NA,NA,NA,NA,NA,NA,5,NA,7,8,NA,NA)
na.spline(vector)

The output of this should be:

NA NA NA NA NA NA  5  6  7  8  NA NA

That is, the internal NA would be interpolated and the leading and trailing NAs left in place. Instead I get:

-1  0  1  2  3  4  5  6  7  8  9 10

According to the documentation, this shouldn't happen. Is there some way to avoid extrapolation?

I recognise that in my example, I could use linear interpolation, but this is a MWE. Although I'm not necessarily wed to the na.spline() function, I need some way to interpolate using cubic splines.

CaptainProg
  • The issue appears to be with `stats::spline`: `spline(seq_along(vector), vector, xout=seq_along(vector))`. It's inconsistent with `approx`, which strictly does interpolation. – Matthew Plourde Dec 14 '15 at 18:45
  • Achim has corrected the documentation in the development version of zoo. Since the problem is not in zoo itself and zoo tries to be consistent with the core of R there is no actual change to the code. – G. Grothendieck Dec 15 '15 at 21:27

1 Answer


This behavior appears to be coming from the stats::spline function, e.g.,

spline(seq_along(vector), vector, xout=seq_along(vector))$y
# [1] -1  0  1  2  3  4  5  6  7  8  9 10

Here is a workaround that uses the fact that na.approx strictly interpolates: wherever na.approx leaves an NA, put the NA back into the na.spline result.

replace(na.spline(vector), is.na(na.approx(vector, na.rm=FALSE)), NA)
# [1] NA NA NA NA NA NA  5  6  7  8 NA NA

Edit

As @G.Grothendieck suggests in the comments below, another, no doubt more performant, way is:

na.spline(vector) + 0*na.approx(vector, na.rm = FALSE)
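
If you would rather not run two fills and mask one with the other, a similar effect can probably be had in base R by calling stats::spline only at the positions between the first and last observed values. This is a minimal sketch, not anything zoo provides: the helper name na_spline_interior is made up for illustration, and it assumes a plain numeric vector indexed by position (seq_along) rather than a zoo object with its own index.

```r
# Hypothetical helper (not part of zoo): cubic-spline fill of interior NAs only.
# Assumes a plain numeric vector whose x-values are just its positions.
na_spline_interior <- function(x) {
  idx <- seq_along(x)
  ok  <- !is.na(x)
  # With fewer than two observed points there is nothing to interpolate between.
  if (sum(ok) < 2) return(x)
  # Positions spanned by the observed data; everything outside stays NA.
  inside <- idx >= min(idx[ok]) & idx <= max(idx[ok])
  out <- x
  # Fit on the observed points, evaluate only inside their range.
  out[inside] <- stats::spline(idx[ok], x[ok], xout = idx[inside])$y
  out
}

vector <- c(NA,NA,NA,NA,NA,NA,5,NA,7,8,NA,NA)
na_spline_interior(vector)
```

The spline is fitted to the same observed points as in the spline() call above, so the interior values should match; the only difference is that it is never evaluated outside the observed range.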
Matthew Plourde