
I built a simple LSTM model in Keras, as shown below:

import keras
from keras.models import Sequential

model = Sequential()
model.add(keras.layers.LSTM(hidden_nodes, input_dim=num_features, input_length=window, consume_less="mem"))
model.add(keras.layers.Dense(num_features, activation='sigmoid'))
optimizer = keras.optimizers.SGD(lr=learning_rate, decay=1e-6, momentum=0.9, nesterov=True)

When I apply the model to some data, I observe this particular behaviour:

[plot: model output vs. ground truth over time]

The orange line represents the predicted values and the blue one the ground truth.

As you can see, the network simply repeats previous values, which is not what I want. I have several features (not only the one shown in the picture) and I want the network to take into account the dependencies between the time series, instead of looking only at the past values of a single series and repeating them.

I hope the question is clear enough!

My data
I have 36 time series (categorical and numerical data). I use a window of length W and reshape the data in order to create a numpy array in the form required by Keras, (num_samples, window, num_features).
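
A minimal sketch of that reshaping (the helper name is made up for illustration; my actual preparation code is shown further down):

import numpy as np

# Illustrative only: turn a (num_rows, num_features) matrix into the
# (num_samples, window, num_features) array that Keras expects.
def make_windows(data, window):
    num_rows, num_features = data.shape
    X = np.zeros((num_rows - window, window, num_features))
    for i in range(num_rows - window):
        X[i] = data[i:i + window, :]
    return X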

Edit 1
Sample of data:

0.5, 0.1, 0.4, 1, 0.74
0.1, 0.1, 0.8, 0.9, 0.8
0.2, 0.3, 0.5, 1, 0.85

I have one categorical and two numerical attributes. The first three columns refer to the categorical one (one-hot encoded); the last two refer to the two numerical attributes.
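
To make the layout concrete, a row is built roughly like this (values and category names below are made up for illustration):

import numpy as np

# Illustrative only: one categorical attribute with three possible values,
# one-hot encoded, followed by the two numerical attributes.
categories = ['A', 'B', 'C']                  # hypothetical category values
one_hot = [1.0 if c == 'B' else 0.0 for c in categories]
numerical = [1.0, 0.74]                       # hypothetical numerical values
row = np.array(one_hot + numerical)           # -> [0., 1., 0., 1., 0.74], shape (5,)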

I build the training and test sets as shown below:

[diagram: data matrix M split into the input windows X and the targets T]

So I execute model.fit(T, X).

I've also tried with a low number of hidden nodes, but the result is the same.

Edit 2
This is the custom loss function that takes the mix of numerical and categorical features into account:

import numpy as np
import keras

def mixed_num_cat_loss_backend(y_true, y_pred, signals_splits):
    # Allow the function to be called directly on numpy arrays for testing.
    if isinstance(y_true, np.ndarray):
        y_true = keras.backend.variable(y_true)
    if isinstance(y_pred, np.ndarray):
        y_pred = keras.backend.variable(y_pred)

    # Numerical features: columns up to signals_splits[0], squared error per column.
    y_true_mse = y_true[:, :signals_splits[0]]
    y_pred_mse = y_pred[:, :signals_splits[0]]
    mse_loss_v = keras.backend.square(y_true_mse - y_pred_mse)

    # Categorical features: one crossentropy term per one-hot block.
    categ_loss_v = [ keras.backend.categorical_crossentropy(
                         y_pred[:, signals_splits[i-1]:signals_splits[i]],
                         y_true[:, signals_splits[i-1]:signals_splits[i]],
                         from_logits=False) # force keras to normalize
                   for i in range(1, len(signals_splits)) ]

    losses_v = keras.backend.concatenate( [mse_loss_v, keras.backend.stack(categ_loss_v, 1)], 1)

    return losses_v

I still call model.fit(T, X); the signals_splits argument is what tells the loss where the numerical features are (in the matrix).
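
Since Keras only accepts a loss of the form loss(y_true, y_pred), I bind signals_splits with functools.partial before compiling. A minimal sketch (the split indices below are made up):

from functools import partial

# Sketch: bind the split indices so Keras sees a two-argument loss.
# The values in signals_splits here are purely illustrative.
signals_splits = [2, 5]   # loss treats columns 0-1 as numerical, columns 2-4 as one one-hot block
loss = partial(mixed_num_cat_loss_backend, signals_splits=signals_splits)
loss.__name__ = 'mixed_num_cat_loss'  # some Keras versions expect a name on the loss

model.compile(loss=loss, optimizer=optimizer)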

This is the function that prepares the data, starting from a 2D numpy array, as shown in the picture with M, T, X:

import numpy as np

def prepare_training_data(data_matrix, boundaries, window = 5):

    num_rows, num_columns = data_matrix.shape
    # Each entry of `boundaries` is the number of rows in one dump;
    # a dump shorter than the window yields no training samples.
    effective_sizes = [max(0, (nrows - window)) for nrows in boundaries]
    total_training_rows = sum(effective_sizes)

    print " - Skipped dumps because smaller than window:", sum([z == 0 for z in effective_sizes])

    # prepare target variables: the row that follows each window
    T = data_matrix[window:boundaries[0], :]

    start_row = boundaries[0]
    for good_rows, total_rows in zip(effective_sizes[1:], boundaries[1:]):
        if good_rows > 0:
            T = np.vstack( (T, data_matrix[start_row + window:start_row + total_rows, :]) )
        start_row += total_rows
        # check concatenate

    # training input to the LSTM: one window of rows per sample
    X = np.zeros((total_training_rows, window, num_columns))
    curr_row = 0
    curr_boundary = 0
    for good_rows, total_rows in zip(effective_sizes, boundaries):
        for i in range(good_rows):
            X[curr_row] = data_matrix[curr_boundary + i:curr_boundary + i + window, :]
            curr_row += 1
        curr_boundary += total_rows

    return X, T, effective_sizes
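
Called on a toy matrix (the numbers below are purely illustrative), the shapes come out as expected:

# Illustrative usage: two dumps of 8 and 7 rows, 5 features, window of 5.
M = np.random.rand(15, 5)
boundaries = [8, 7]                      # number of rows in each dump
X, T, sizes = prepare_training_data(M, boundaries, window=5)
print X.shape   # (5, 5, 5) -> (num_samples, window, num_features)
print T.shape   # (5, 5)
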
  • What is your data and how did you train it? If you passed the same data as input and output, it will definitely learn to copy the input. – Daniel Möller Dec 05 '17 at 16:16
  • @DanielMöller I edited the question. I don't understand what you mean when you say "same data as input and output". – Alessandro Dec 05 '17 at 21:33
  • 1
  • It would be helpful to see a small subset of the data you pass as input and output to `model.fit`, or at least the code you use to generate the arrays. Could it be possible that your model is overfitting, either through having insufficient data, `hidden_nodes` being too high, or simply a bug in the training data? – Phil Dec 06 '17 at 00:03
  • Well, the correct shape should be `(num_samples, window, num_features)`. – Daniel Möller Dec 06 '17 at 11:21
  • Thanks @DanielMöller, I made a mistake in the question. I added some information about data preparation. – Alessandro Dec 06 '17 at 11:33
  • So, you're trying to predict the past instead of the future? It does sound like it should be `model.fit(T,X)`. – Daniel Möller Dec 06 '17 at 11:37
  • Yes, because I'm going to build a system of anomaly detection. – Alessandro Dec 06 '17 at 11:47
  • 1
  • LSTMs cannot predict the past, they follow a "sequence". The order matters. They predict the future. You must at least invert the order of your data if you're going to predict the past. – Daniel Möller Dec 06 '17 at 13:20
  • I'm confused, how can I call `model.fit(T, X)` ? Keras wants a numpy array in the form `(num_samples, window, num_features)`. If you need more information let me know, maybe I misunderstood your point. – Alessandro Dec 06 '17 at 13:50
  • Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/160617/discussion-between-ghemon-and-daniel-moller). – Alessandro Dec 06 '17 at 13:57
  • 1
  • I'm curious about a loss function with 3 arguments. How are you passing it to the model? – Daniel Möller Dec 11 '17 at 10:30
  • @DanielMöller I do it with functools.partial – Alessandro Dec 11 '17 at 16:56

0 Answers