
I need help using the WindowSummarizer transformer from the sktime library recursively when forecasting future values.

On the training and test data it is easy to generate moving-average and moving-standard-deviation features, because the observed values are already available.

For future predictions, however, WindowSummarizer has to work iteratively: after each new predicted observation it must recalculate all of the features so that the next prediction can be made, and so on recursively until the end of the forecasting horizon.
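
To make the requirement concrete, here is a minimal sketch of the loop I have in mind, written in plain Python rather than sktime. The helper names (rolling_mean, fit_line, recursive_forecast), the single rolling-mean feature, and the toy series are assumptions for illustration only; this is not how sktime implements it.

```python
from statistics import mean


def rolling_mean(history, window):
    # Mean of the most recent `window` observations.
    return mean(history[-window:])


def fit_line(xs, ys):
    # Closed-form ordinary least squares for y = a + b * x.
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = sxy / sxx if sxx else 0.0
    a = y_bar - b * x_bar
    return a, b


def recursive_forecast(y, window, steps):
    # Training rows: the feature for time t uses only values before t.
    xs = [rolling_mean(y[:t], window) for t in range(window, len(y))]
    ys = y[window:]
    a, b = fit_line(xs, ys)

    history = list(y)
    forecasts = []
    for _ in range(steps):
        # Recompute the window feature from observed + predicted values.
        x_next = rolling_mean(history, window)
        y_hat = a + b * x_next
        forecasts.append(y_hat)
        # Feed the prediction back so the next step can use it.
        history.append(y_hat)
    return forecasts


# Toy series, hypothetical values for illustration.
y = [112, 118, 132, 129, 121, 135, 148, 148, 136, 119, 104, 118]
print(recursive_forecast(y, window=3, steps=5))
```

The important part is the last loop: every prediction is appended to the history before the window feature is computed again. That is the behaviour I would like WindowSummarizer to have inside the forecaster.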

Can anyone help me?

Below is a reproducible example:

#!pip install sktime[all_extras]

import pandas as pd
import numpy as np
from sktime.transformations.series.summarize import WindowSummarizer
from sktime.forecasting.base import ForecastingHorizon
from sktime.forecasting.compose import ForecastingPipeline
from sklearn.linear_model import LinearRegression
from sktime.datasets import load_airline
from sktime.forecasting.model_selection import temporal_train_test_split
from sktime.forecasting.compose import make_reduction, TransformedTargetForecaster

y = load_airline()

kwargs = {
        "lag_feature": {
            "lag": [0,1],
            "mean": [[0, 7], [0, 14],[0, 28]],
            "std": [[0, 7], [0, 14],[0, 28]],
            "kurt": [[0, 7], [0, 14],[0, 28]],
            "skew": [[0, 7], [0, 14],[0, 28]]
        }
    }


forecaster = make_reduction(
    LinearRegression(),
    scitype="tabular-regressor",
    transformers=[WindowSummarizer(**kwargs, n_jobs=1,truncate = "bfill")],
    window_length=None,
    strategy="recursive"
)

pipe = ForecastingPipeline(
    steps=[
        ("forecaster", forecaster),
    ]
)

model = pipe.fit(y)

model.predict(fh = np.arange(1,100,1))

This code returns the following error message:

Input contains NaN, infinity or a value too large for dtype('float64').
desertnaut
@Marion Chavas, did you find a solution to this? I am having the same issue. I know that modeltime has this ability in R (https://business-science.github.io/modeltime/reference/recursive.html), but I was hoping sktime could do the same thing. – Comte Nov 01 '22 at 15:17

1 Answer


The issue here is that you are using ForecastingPipeline instead of TransformedTargetForecaster. According to the documentation: "Pipeline for forecasting with exogenous data. ForecastingPipeline is only applying the given transformers to X. The forecaster can also be a TransformedTargetForecaster containing transformers to transform y."

So simply switching to TransformedTargetForecaster works for me with the code you posted:

y = load_airline()

kwargs = {
        "lag_feature": {
            "lag": [0,1],
            "mean": [[0, 7], [0, 14],[0, 28]],
            "std": [[0, 7], [0, 14],[0, 28]],
            "kurt": [[0, 7], [0, 14],[0, 28]],
            "skew": [[0, 7], [0, 14],[0, 28]]
        }
    }
from sklearn.linear_model import LinearRegression

forecaster2 = make_reduction(
    LinearRegression(),
    scitype="tabular-regressor",
    transformers=[WindowSummarizer(**kwargs, n_jobs=1,truncate = "bfill")],
    window_length=None,
    strategy="recursive"
)

pipe = TransformedTargetForecaster(
    steps=[
        ("forecaster", forecaster2),
    ]
)

model = pipe.fit(y)

model.predict(fh = np.arange(1,100,1))

You can also check the variables created by WindowSummarizer with code like this:

transformer = WindowSummarizer(**kwargs, n_jobs=1,truncate = "bfill")
y_transformed = transformer.fit_transform(y)

Hope that helps.

Comte