Positivity-preserving interpolation of time series in R

Question

I have some data with missing values that I know to be positive. I'm trying to interpolate the missing values using na.interp from the forecast package. However, some of the interpolated values turn out to be negative.

I've tried na.approx from the package zoo, but the approximated values do not agree with the seasonal trend of the time series.

I cannot interpolate in the log domain since some of my observations are 0. Interpolating in the square-root domain somehow produces too many outliers. Is there any other way to interpolate time series while preserving positivity? Any references to other R packages would also be appreciated.

@G.Grothendieck `na.StructTS` is taking too long on my around-8000-values-long time series, and I have nearly 100 such series. Any way to optimize this, maybe? — curious, Jun 10 '17 at 22:25
@G.Grothendieck also, passing my data as a zoo series to `na.StructTS` gives me this error: `Error in rowSums(tsSmooth(StructTS(y))[, -2]) : 'x' must be an array of at least two dimensions` — curious, Jun 10 '17 at 22:33
The error is coming from `rowSums`. Please review [ask] and [mcve]. — G. Grothendieck, Jun 10 '17 at 23:18
@Masoud thanks. `approx` worked best for me. The interpolation at some points is not as good as `na.interp`, but at least it preserved positivity :) Since that does answer the question, I'll accept an answer if you post one. — curious, Jun 11 '17 at 21:48
@curious if you post a [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example), I will post an answer. Without that, it's a little bit hard to make my own dataframe (laziness) ;) — M--, Jun 11 '17 at 22:13

Steffen Moritz · Answer 1 · 2021-11-11T23:38:33.360

The imputeTS package focuses specifically on missing values in time series (the accompanying paper describes it in more detail).

It works like this:

na_kalman(yourTimeSeries)

That's it already.

It offers several time series imputation functions:

Imputation by Linear Interpolation
Imputation by Spline Interpolation
Imputation by Stineman Interpolation
Imputation by Structural Model & Kalman Smoothing
Imputation by ARIMA State Space Representation & Kalman Smoothing
Imputation by Last Observation Carried Forward
Imputation by Next Observation Carried Backward
Missing Value Imputation by Simple Moving Average
Imputation by Linear Weighted Moving Average
Imputation by Exponential Weighted Moving Average
Missing Value Imputation by Mean Value
Seasonally Decomposed Missing Value Imputation
Seasonally Splitted Missing Value Imputation
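
As a rough illustration of the simpler options in this list, the sketch below applies linear interpolation, last observation carried forward, and a simple moving average to a made-up non-negative monthly series (the data and variable names are assumptions for illustration only). Each of these methods fills a gap with an observed value or with a weighted average of neighbouring observed values, so the results should not leave the observed range.

```r
library(imputeTS)

set.seed(1)
# Made-up monthly series, clipped at zero so that it contains exact zeros
x <- ts(pmax(0, 5 + 6 * sin(2 * pi * (1:120) / 12) + rnorm(120)), frequency = 12)
x[c(5, 30:33, 77)] <- NA   # gaps introduced for illustration

filled <- list(
  linear = na_interpolation(x, option = "linear"),
  locf   = na_locf(x, option = "locf"),
  ma     = na_ma(x, k = 4, weighting = "simple")
)

# Smallest imputed value per method; none should be negative for this input
sapply(filled, function(f) min(f[is.na(x)]))
```

The trade-off is the one raised in the question: these methods respect the bounds but ignore seasonality, which is why the model-based functions may still be preferable when their output happens to stay positive.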

Some of these functions are more advanced than others. I would try the na_kalman() function of the package for this task. Its results may already satisfy the constraints. If they do not, you need to transform the data before performing the imputation (as explained below).
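
A minimal sketch of that check, assuming the built-in AirPassengers series with a few artificially removed values stands in for the real data:

```r
library(imputeTS)

x <- AirPassengers              # built-in positive monthly series
x[c(10, 50:53, 100)] <- NA      # gaps introduced for illustration

imp <- na_kalman(x)             # default: structural model + Kalman smoothing
gap <- is.na(x)

# TRUE if any filled-in value violates the positivity constraint
any(imp[gap] < 0)
```

If this returns TRUE for your own series, the transformation approach is the next thing to try.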

In general, if you want the imputed values to stay within given bounds, the following transformation approach may help:

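library("imputeTS")

# Bounds: every observed value must lie strictly between a and b,
# otherwise the logarithm below yields -Inf, Inf or NaN
a <- 50
b <- 400

# Transform the data from (a, b) to the whole real line (scaled logit)
y <- log((myTimeSeries-a)/(b-myTimeSeries))
imputations <- na_kalman(y)

# Back-transform (inverse logit), which maps every value back into (a, b)
imputationsBack <- (b-a)*exp(imputations)/(1+exp(imputations)) + a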
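
For data that contain exact zeros, as in the question, one possible variant is sketched below. The helper names `to_unbounded` and `to_bounded`, the toy data, and the choice of bounds are all assumptions for illustration and not part of imputeTS. Base R's `qlogis()` and `plogis()` compute the same logit and inverse logit as the formulas above, and `plogis()` avoids the `Inf/Inf` that `exp(z)/(1+exp(z))` produces for very large `z`. Plain linear interpolation via `approx()` is used as a stand-in for the imputation step so that the sketch runs without extra packages.

```r
# Hypothetical helpers: map the interval (a, b) to the real line and back
to_unbounded <- function(x, a, b) qlogis((x - a) / (b - a))
to_bounded   <- function(z, a, b) a + (b - a) * plogis(z)

x <- c(0, 3, NA, NA, 12, 0, NA, 5)   # made-up non-negative data with exact zeros

# Assumed bounds: marginally below 0 and well above the largest observation,
# so that no observation sits exactly on a bound
a <- -1e-3
b <- 2 * max(x, na.rm = TRUE)

z <- to_unbounded(x, a, b)

# Stand-in imputation on the transformed scale; na_kalman(z) could be used instead
idx <- seq_along(z)
z_filled <- approx(idx[!is.na(z)], z[!is.na(z)], xout = idx, rule = 2)$y

# Back-transformed values are only guaranteed to exceed a, so clamp at zero
x_filled <- pmax(to_bounded(z_filled, a, b), 0)
x_filled
```

How far below zero to put `a` is a judgment call: a value closer to zero tightens the guarantee but stretches the transformed scale more strongly around the zero observations.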
