
I have a simple time series and some code that implements a moving average:

import numpy as np
import matplotlib.pyplot as plt
import tensorflow as tf

keras = tf.keras

def plot_series(time, series, format="-", start=0, end=None, label=None):
    plt.plot(time[start:end], series[start:end], format, label=label)
    plt.xlabel("Time")
    plt.ylabel("Value")
    if label:
        plt.legend(fontsize=14)
    plt.grid(True)
    
def trend(time, slope=0):
    return slope * time

def seasonal_pattern(season_time):
    """Just an arbitrary pattern, you can change it if you wish"""
    return np.where(season_time < 0.4,
                    np.cos(season_time * 2 * np.pi),
                    1 / np.exp(3 * season_time))

def seasonality(time, period, amplitude=1, phase=0):
    """Repeats the same pattern at each period"""
    season_time = ((time + phase) % period) / period
    return amplitude * seasonal_pattern(season_time)

def white_noise(time, noise_level=1, seed=None):
    rnd = np.random.RandomState(seed)
    return rnd.randn(len(time)) * noise_level

time = np.arange(4 * 365 + 1)

slope = 0.05
baseline = 10
amplitude = 40
series = baseline + trend(time, slope) + seasonality(time, period=365, amplitude=amplitude)

noise_level = 5
noise = white_noise(time, noise_level, seed=42)

series += noise

plt.figure(figsize=(10, 6))
plot_series(time, series)
plt.show()

def moving_average_forecast(series, window_size):
  """Forecasts the mean of the last few values.
     If window_size=1, then this is equivalent to naive forecast"""
  forecast = []
  for time in range(len(series) - window_size):
    forecast.append(series[time:time + window_size].mean())
  return np.array(forecast)

split_time = 1000
time_train = time[:split_time]
x_train = series[:split_time]
time_valid = time[split_time:]
x_valid = series[split_time:]

moving_avg = moving_average_forecast(series, 30)[split_time - 30:]

plt.figure(figsize=(10, 6))
plot_series(time_valid, x_valid, label="Series")
plot_series(time_valid, moving_avg, label="Moving average (30 days)")

I am not getting this part:

  for time in range(len(series) - window_size):
    forecast.append(series[time:time + window_size].mean())
  return np.array(forecast)

What I do not understand is how series[time:time + window_size] works. window_size is passed into the function and specifies how many days are used to calculate the mean, for example 5 or 30 days.

When I try something similar to illustrate this to myself, like plot(series[time:time + 30]), it does not work.

Furthermore, I do not understand how len(series) - window_size works.

Stat Tistician
  • Does this answer your question? [Moving average or running mean](https://stackoverflow.com/questions/13728392/moving-average-or-running-mean) – Joe Jun 26 '20 at 16:43
  • https://stackoverflow.com/questions/14313510/how-to-calculate-moving-average-using-numpy – Joe Jun 26 '20 at 16:43

2 Answers

0

Debug your code: add some print statements to see how it behaves, write the results down, and try to analyze them. Then step back and write similar code that reproduces the same output. Compare the two: if the output is the same, congrats. If not, try again. Then run both versions with timers on and see which one is faster; if your code is faster, congrats again.
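As a minimal sketch of that print-statement approach, the loop from the question can be run on a tiny made-up series so that every slice and its mean are visible. The names toy_series and moving_average_forecast_verbose, the six values, and the window of 3 are assumptions chosen for illustration; they do not come from the question.

```python
import numpy as np

# Hypothetical toy data: 6 values instead of the 1461 in the question.
toy_series = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
window_size = 3

def moving_average_forecast_verbose(series, window_size):
    """Same loop as in the question, with one print per iteration."""
    forecast = []
    # t is used instead of `time` to avoid confusion with the global time array.
    for t in range(len(series) - window_size):
        window = series[t:t + window_size]  # elements t .. t + window_size - 1
        print(f"t={t}  window={window}  mean={window.mean()}")
        forecast.append(window.mean())
    return np.array(forecast)

result = moving_average_forecast_verbose(toy_series, window_size)
print(result)  # [2. 3. 4.]
```

Each printed mean is the forecast for index t + window_size. That is why the question's code can slice the result with [split_time - 30:] and plot it against time_valid.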

0

It seems the function moving_average_forecast simply calculates a rolling average over window_size days. If that is the intention, then:

  • The line for time in range(len(series) - window_size): gives you the index time, which runs from 0 to len(series) - window_size - 1, so the loop produces len(series) - window_size averages. For example, with 11 data points and window_size = 10, range(11 - 10) is range(1), so time only takes the value 0 and you get one average. A series of length N contains N - window_size + 1 full windows, but the last one is skipped because its mean would be a forecast for a time step past the end of the series.
  • The expression series[time:time + window_size] slices the data in series to get the values for each rolling average. Python slices include the start index and exclude the end index, so no - 1 is needed. Using the earlier example, in the first iteration time = 0 and series[0:10] returns an array with the first 10 data points (indices 0 to 9), and so on.
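Both points can be checked directly. The sketch below uses a hypothetical 11-point array (data is an illustrative name, not from the question) to show how many elements a slice returns and how many times the loop runs. It also shows one likely reason the plot(series[time:time + 30]) experiment from the question failed: outside the function, time is the NumPy array created by np.arange, not an integer, so it cannot serve as a slice bound.

```python
import numpy as np

data = np.arange(11)   # hypothetical series: 0, 1, ..., 10
window_size = 10

# A slice includes the start index and excludes the end index.
first_window = data[0:0 + window_size]
print(len(first_window))                 # 10
print(len(data[0:0 + window_size - 1]))  # 9, one element short

# The loop body runs len(data) - window_size times.
loop_indices = list(range(len(data) - window_size))
print(loop_indices)                      # [0]

# Outside the function, `time` is an array, so it is not a valid slice bound.
time = np.arange(11)
try:
    data[time:time + 3]
    failed = False
except TypeError as err:
    failed = True
    print("TypeError:", err)

# With an integer start index the same slice works.
start = 2
print(data[start:start + 3])             # [2 3 4]
```

Inside moving_average_forecast the name time is rebound to an integer by the for loop, which is why the same expression works there.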

Hope that helps.

Jimmy