
I built a simple LSTM model in Keras, as shown below:

import keras
from keras.models import Sequential

model = Sequential()
model.add(keras.layers.LSTM(hidden_nodes, input_dim=num_features, input_length=window, consume_less="mem"))
model.add(keras.layers.Dense(num_features, activation='sigmoid'))
optimizer = keras.optimizers.SGD(lr=learning_rate, decay=1e-6, momentum=0.9, nesterov=True)

When I apply the model to some data, I observe this particular behaviour:

[plot: model output vs. ground truth over time]

The orange line represents the predicted values and the blue one the ground truth.

As you can see, the network simply repeats previous values, which is not what I want. I have several features (not only the one shown in the picture) and I want the network to take into account the dependencies between the time series, instead of looking only at the past values of a single series and repeating them.

I hope the question is clear enough!

My data
I have 36 time series (categorical and numerical data). I use a window of length W and reshape the data in order to create a numpy array in the form required by Keras, (num_samples, window, num_features).
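
A minimal sketch of that reshaping (the helper name is made up for illustration; my actual preparation code is shown further down):

import numpy as np

# Illustrative only: turn a (num_rows, num_features) matrix into the
# (num_samples, window, num_features) array that Keras expects.
def make_windows(data, window):
    num_rows, num_features = data.shape
    X = np.zeros((num_rows - window, window, num_features))
    for i in range(num_rows - window):
        X[i] = data[i:i + window, :]
    return X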

Edit 1
Sample of data:

0.5, 0.1, 0.4, 1, 0.74
0.1, 0.1, 0.8, 0.9, 0.8
0.2, 0.3, 0.5, 1, 0.85

I have one categorical and two numerical attributes. The first three columns refer to the categorical one (one-hot encoded); the last two refer to the two numerical attributes.
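
To make the layout concrete, a row is built roughly like this (values and category names below are made up for illustration):

import numpy as np

# Illustrative only: one categorical attribute with three possible values,
# one-hot encoded, followed by the two numerical attributes.
categories = ['A', 'B', 'C']                  # hypothetical category values
one_hot = [1.0 if c == 'B' else 0.0 for c in categories]
numerical = [1.0, 0.74]                       # hypothetical numerical values
row = np.array(one_hot + numerical)           # -> [0., 1., 0., 1., 0.74], shape (5,)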

I build the training and test sets as shown below:

[diagram: data matrix M split into the input windows X and the targets T]

So I execute model.fit(T, X).

I've also tried with a low number of hidden nodes, but the result is the same.

Edit 2
This is the custom loss function that takes the mix of numerical and categorical features into account:

import numpy as np
import keras

def mixed_num_cat_loss_backend(y_true, y_pred, signals_splits):
    # Allow the function to be called directly on numpy arrays for testing.
    if isinstance(y_true, np.ndarray):
        y_true = keras.backend.variable(y_true)
    if isinstance(y_pred, np.ndarray):
        y_pred = keras.backend.variable(y_pred)

    # Numerical features: columns up to signals_splits[0], squared error per column.
    y_true_mse = y_true[:, :signals_splits[0]]
    y_pred_mse = y_pred[:, :signals_splits[0]]
    mse_loss_v = keras.backend.square(y_true_mse - y_pred_mse)

    # Categorical features: one crossentropy term per one-hot block.
    categ_loss_v = [ keras.backend.categorical_crossentropy(
                         y_pred[:, signals_splits[i-1]:signals_splits[i]],
                         y_true[:, signals_splits[i-1]:signals_splits[i]],
                         from_logits=False) # force keras to normalize
                   for i in range(1, len(signals_splits)) ]

    losses_v = keras.backend.concatenate( [mse_loss_v, keras.backend.stack(categ_loss_v, 1)], 1)

    return losses_v

I still call model.fit(T, X); the signals_splits argument is what tells the loss where the numerical features are (in the matrix).
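
Since Keras only accepts a loss of the form loss(y_true, y_pred), I bind signals_splits with functools.partial before compiling. A minimal sketch (the split indices below are made up):

from functools import partial

# Sketch: bind the split indices so Keras sees a two-argument loss.
# The values in signals_splits here are purely illustrative.
signals_splits = [2, 5]   # loss treats columns 0-1 as numerical, columns 2-4 as one one-hot block
loss = partial(mixed_num_cat_loss_backend, signals_splits=signals_splits)
loss.__name__ = 'mixed_num_cat_loss'  # some Keras versions expect a name on the loss

model.compile(loss=loss, optimizer=optimizer)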

This is the function that prepares the data, starting from a 2D numpy array, as shown in the picture with M, T, X:

import numpy as np

def prepare_training_data(data_matrix, boundaries, window = 5):

    num_rows, num_columns = data_matrix.shape
    # Each entry of `boundaries` is the number of rows in one dump;
    # a dump shorter than the window yields no training samples.
    effective_sizes = [max(0, (nrows - window)) for nrows in boundaries]
    total_training_rows = sum(effective_sizes)

    print " - Skipped dumps because smaller than window:", sum([z == 0 for z in effective_sizes])

    # prepare target variables: the row that follows each window
    T = data_matrix[window:boundaries[0], :]

    start_row = boundaries[0]
    for good_rows, total_rows in zip(effective_sizes[1:], boundaries[1:]):
        if good_rows > 0:
            T = np.vstack( (T, data_matrix[start_row + window:start_row + total_rows, :]) )
        start_row += total_rows
        # check concatenate

    # training input to the LSTM: one window of rows per sample
    X = np.zeros((total_training_rows, window, num_columns))
    curr_row = 0
    curr_boundary = 0
    for good_rows, total_rows in zip(effective_sizes, boundaries):
        for i in range(good_rows):
            X[curr_row] = data_matrix[curr_boundary + i:curr_boundary + i + window, :]
            curr_row += 1
        curr_boundary += total_rows

    return X, T, effective_sizes
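
Called on a toy matrix (the numbers below are purely illustrative), the shapes come out as expected:

# Illustrative usage: two dumps of 8 and 7 rows, 5 features, window of 5.
M = np.random.rand(15, 5)
boundaries = [8, 7]                      # number of rows in each dump
X, T, sizes = prepare_training_data(M, boundaries, window=5)
print X.shape   # (5, 5, 5) -> (num_samples, window, num_features)
print T.shape   # (5, 5)
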
  • What is your data and how did you train it? If you passed the same data as input and output, it will definitely learn to copy the input. – Daniel Möller Dec 05 '17 at 16:16
  • @DanielMöller I edited the question. I don't understand what you mean when you say "same data as input and output". – Alessandro Dec 05 '17 at 21:33
  • 1
  • It would be helpful to see a small subset of the data you pass as input and output to `model.fit`, or at least the code you use to generate the arrays. Could it be possible that your model is overfitting, either through having insufficient data, `hidden_nodes` being too high, or simply a bug in the training data? – Phil Dec 06 '17 at 00:03
  • Well, the correct shape should be `(num_samples, window, num_features)`. – Daniel Möller Dec 06 '17 at 11:21
  • Thanks @DanielMöller, I made a mistake in the question. I added some information about data preparation. – Alessandro Dec 06 '17 at 11:33
  • So, you're trying to predict the past instead of the future? It does sound like it should be `model.fit(T,X)`. – Daniel Möller Dec 06 '17 at 11:37
  • Yes, because I'm going to build a system of anomaly detection. – Alessandro Dec 06 '17 at 11:47
  • 1
  • LSTMs cannot predict the past, they follow a "sequence". The order matters. They predict the future. You must at least invert the order of your data if you're going to predict the past. – Daniel Möller Dec 06 '17 at 13:20
  • I'm confused, how can I call `model.fit(T, X)` ? Keras wants a numpy array in the form `(num_samples, window, num_features)`. If you need more information let me know, maybe I misunderstood your point. – Alessandro Dec 06 '17 at 13:50
  • Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/160617/discussion-between-ghemon-and-daniel-moller). – Alessandro Dec 06 '17 at 13:57
  • 1
  • I'm curious about a loss function with 3 arguments. How are you passing it to the model? – Daniel Möller Dec 11 '17 at 10:30
  • @DanielMöller I do it with functools.partial – Alessandro Dec 11 '17 at 16:56

0 Answers