It is completely sensible to use y[t-1], or y[t-n] for some n > 0, to predict y[t]. You shouldn't, though, use y[t] to try to predict y[t], as you probably don't know ahead of time the very value you are trying to predict.
In fact, in the example you gave (page 2), the variable traffic_volume, which is the prediction target, also appears in the input sequence, so the example you are looking for is exactly that, if I understand you correctly. The function custom_ts_multi_data_prep() adds, for each time step, the data from previous time steps into X and the data from the following time steps into y.(*)
That data is also implicitly encoded in the activations of the LSTM itself: an LSTM is a type of recurrent network that carries a summary of the data it has seen so far as input to the next step of the prediction process. However, it can still make sense to feed the true data from previous time steps into the prediction process, for a few reasons:
- The model's state that is passed on to the next prediction step is only a partial view of the true state, and knowing the actual progression in the "real world" may be critical for predicting the next step.
- Similar to the rationale behind residual skip connections in CNNs, adding the "raw" value of the previous time step may help the model by letting it focus only on the residual problem: how to get from y[t-1] to y[t] while using x[t] (or x[t-1], depending on your specific problem), rather than performing the jump from x[t] to y[t] with no true data from previous time steps.
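
The residual idea in the second point can be sketched in a few lines of plain Python. This is only an illustration, not code from the example you linked: the helper names make_residual_pairs and reconstruct and the toy series are hypothetical, and in practice the (features, targets) pairs would be fed to whatever model you use.

```python
def make_residual_pairs(x, y):
    """Build training pairs whose target is the step y[t] - y[t-1] (hypothetical helper)."""
    features, targets = [], []
    for t in range(1, len(y)):
        # Input: the exogenous value x[t] plus the last observed target y[t-1].
        features.append((x[t], y[t - 1]))
        # Target: only the change the model has to explain.
        targets.append(y[t] - y[t - 1])
    return features, targets


def reconstruct(y_prev, predicted_delta):
    """Turn a predicted residual back into a prediction for y[t]."""
    return y_prev + predicted_delta


# Toy data, made up for illustration.
x = [0.1, 0.2, 0.3, 0.4]
y = [10.0, 12.0, 11.0, 15.0]
features, targets = make_residual_pairs(x, y)
```

Here the model never sees y[t] as an input, only y[t-1], so nothing leaks from the value being predicted.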
Having said that, adding almost any feature from the system will likely make your model look "better" on the training data while also making it more prone to overfitting, so take this into consideration and choose wisely which features to add.
(*) Small remark: note that in this specific example they leave a gap of one time step that appears in neither X nor y, and I am not sure whether it is intentional or a mistake. X contains data from i-window to i-1, while y contains data from i+1 to i+horizon, so i isn't included in either. This might be a misunderstanding by the author of how range() works, or I might be missing something.
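
To make the gap concrete, here is a small sketch of the indexing as I described it above. It is my reading of the example, not a copy of custom_ts_multi_data_prep(); the function window_indices and its gap flag are hypothetical and exist only to compare the two variants.

```python
def window_indices(i, window, horizon, gap=True):
    """Return the time indices that end up in X and in y for position i.

    gap=True mimics the indexing described in the remark (y starts at i+1);
    gap=False starts y at i, so no time step is skipped.
    """
    x_idx = list(range(i - window, i))  # i-window .. i-1 (range excludes i)
    y_start = i + 1 if gap else i
    y_idx = list(range(y_start, y_start + horizon))
    return x_idx, y_idx


x_gap, y_gap = window_indices(i=5, window=3, horizon=2, gap=True)
x_nogap, y_nogap = window_indices(i=5, window=3, horizon=2, gap=False)

print(x_gap, y_gap)      # index 5 is in neither list
print(x_nogap, y_nogap)  # y starts right after X ends
```

Because range(a, b) already excludes b, X stops at i-1 on its own, so starting y at i+1 skips time step i entirely.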