I have been trying to build an LSTM RNN with TensorFlow Keras to predict whether someone is driving or not driving (binary classification) based only on Datetime and lat/long. However, when I train the network, the loss and val_loss barely change.
I converted lat/long into x, y, z coordinates that lie between -1 and 1. I also used the Datetime to extract whether it is a weekend or not and which period of the day it is (morning/afternoon/evening).
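For reference, the conversion looks roughly like this (a sketch: the raw column names, the 'datetime' index level, and the period-of-day cut points here are placeholders, not my exact code):

import numpy as np
import pandas as pd

def add_features(df, lat_col='lat', lon_col='long'):
    # Project lat/long onto the unit sphere so x, y, z each lie in [-1, 1]
    lat = np.radians(df[lat_col])
    lon = np.radians(df[lon_col])
    df['x'] = np.cos(lat) * np.cos(lon)
    df['y'] = np.cos(lat) * np.sin(lon)
    df['z'] = np.sin(lat)
    # Datetime-derived features: weekday flag and a coarse period of day
    dt = df.index.get_level_values('datetime')
    df['weekday'] = (dt.dayofweek < 5).astype(int)  # 1 Mon-Fri, 0 on weekends
    df['period_of_day'] = pd.cut(dt.hour, bins=[0, 12, 18, 24],
                                 labels=[0, 1, 2], right=False).astype(int)
    return df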
Here is a sample of the data:
                        trip_id  weekday  period_of_day         x         y         z  mode_cat
datetime            id
2011-08-27 06:13:01 20        1        0              2  0.650429  0.043524  0.758319         1
2011-08-27 06:13:02 20        1        0              2  0.650418  0.043487  0.758330         1
2011-08-27 06:13:03 20        1        0              2  0.650421  0.043490  0.758328         1
2011-08-27 06:13:04 20        1        0              2  0.650427  0.043506  0.758322         1
2011-08-27 06:13:05 20        1        0              2  0.650438  0.043516  0.758312         1
And here is the code for building the network:
import tensorflow as tf

single_step_model = tf.keras.models.Sequential()
# Input shape is (timesteps, features), taken from the windowed training data
single_step_model.add(tf.keras.layers.LSTM(512, return_sequences=True,
                                           input_shape=x_train_single.shape[-2:]))
single_step_model.add(tf.keras.layers.Dropout(0.4))
single_step_model.add(tf.keras.layers.Dense(128, activation='tanh'))
# Sigmoid output for the binary driving / not-driving label
single_step_model.add(tf.keras.layers.Dense(1, activation='sigmoid'))
opt = tf.keras.optimizers.Adam(learning_rate=0.0001)
single_step_model.compile(optimizer=opt, loss='binary_crossentropy',
                          metrics=['accuracy'])
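For context, x_train_single is windowed so that each sample has shape (timesteps, features), which is what input_shape=x_train_single.shape[-2:] picks up. The slicing looks roughly like this (the function name and window length are illustrative):

import numpy as np

def make_windows(features, labels, history=60):
    # Slide a window of `history` rows over the feature matrix; each window
    # is paired with the label of the step that follows it
    xs, ys = [], []
    for i in range(len(features) - history):
        xs.append(features[i:i + history])
        ys.append(labels[i + history])
    return np.array(xs), np.array(ys)

(Note that with return_sequences=True the Dense layers are applied at every timestep, so the model emits one sigmoid output per step of the window.)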
I have tried all kinds of learning rates, batch sizes, epoch counts, dropout rates, numbers of hidden layers, and numbers of units, and they all run into this same problem.
I have also taken a look at my data and noticed that the binary accuracy on both the training and validation sets matches the fraction of rows labeled as driving in each set (~53.8%), while the loss has plateaued at the entropy of that label split: -(0.538 ln 0.538 + 0.462 ln 0.462) ≈ 0.6903. This means that my network is always predicting the same outcome.
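A quick way to confirm that (a sketch; x_val_single and y_val_single stand in for my validation arrays):

import numpy as np

preds = single_step_model.predict(x_val_single)
print(preds.min(), preds.max())  # a tiny range means a near-constant output
print(np.round(preds).mean())    # 0.0 or 1.0 when only one class is predicted
print(y_val_single.mean())       # class balance, ~0.538 here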
Here is the training and validation loss and accuracy per epoch:
Epoch 1/100
1410/1410 [==============================] - 775s 550ms/step - loss: 0.6942 - binary_accuracy: 0.5273 - val_loss: 0.6909 - val_binary_accuracy: 0.5380
Epoch 2/100
1410/1410 [==============================] - 774s 549ms/step - loss: 0.6911 - binary_accuracy: 0.5352 - val_loss: 0.6904 - val_binary_accuracy: 0.5380
Epoch 3/100
1410/1410 [==============================] - 775s 549ms/step - loss: 0.6906 - binary_accuracy: 0.5374 - val_loss: 0.6903 - val_binary_accuracy: 0.5380
Epoch 4/100
1410/1410 [==============================] - 774s 549ms/step - loss: 0.6905 - binary_accuracy: 0.5375 - val_loss: 0.6903 - val_binary_accuracy: 0.5380
Epoch 5/100
1410/1410 [==============================] - 774s 549ms/step - loss: 0.6904 - binary_accuracy: 0.5376 - val_loss: 0.6903 - val_binary_accuracy: 0.5380
Epoch 6/100
1410/1410 [==============================] - 774s 549ms/step - loss: 0.6904 - binary_accuracy: 0.5375 - val_loss: 0.6904 - val_binary_accuracy: 0.5380
Epoch 7/100
1410/1410 [==============================] - 774s 549ms/step - loss: 0.6904 - binary_accuracy: 0.5376 - val_loss: 0.6903 - val_binary_accuracy: 0.5380
Epoch 8/100
1410/1410 [==============================] - 775s 549ms/step - loss: 0.6904 - binary_accuracy: 0.5375 - val_loss: 0.6903 - val_binary_accuracy: 0.5380
Epoch 9/100
1410/1410 [==============================] - 774s 549ms/step - loss: 0.6904 - binary_accuracy: 0.5375 - val_loss: 0.6903 - val_binary_accuracy: 0.5380
Epoch 10/100
1410/1410 [==============================] - 775s 549ms/step - loss: 0.6903 - binary_accuracy: 0.5376 - val_loss: 0.6903 - val_binary_accuracy: 0.5380
Epoch 11/100
1410/1410 [==============================] - 775s 550ms/step - loss: 0.6903 - binary_accuracy: 0.5375 - val_loss: 0.6903 - val_binary_accuracy: 0.5380
Epoch 12/100
1410/1410 [==============================] - 775s 549ms/step - loss: 0.6903 - binary_accuracy: 0.5375 - val_loss: 0.6903 - val_binary_accuracy: 0.5380
Epoch 13/100
1410/1410 [==============================] - 774s 549ms/step - loss: 0.6903 - binary_accuracy: 0.5377 - val_loss: 0.6903 - val_binary_accuracy: 0.5380
Epoch 14/100
1410/1410 [==============================] - 773s 548ms/step - loss: 0.6904 - binary_accuracy: 0.5374 - val_loss: 0.6903 - val_binary_accuracy: 0.5379
Epoch 15/100
1410/1410 [==============================] - 774s 549ms/step - loss: 0.6903 - binary_accuracy: 0.5377 - val_loss: 0.6903 - val_binary_accuracy: 0.5380
Epoch 16/100
1410/1410 [==============================] - 774s 549ms/step - loss: 0.6904 - binary_accuracy: 0.5375 - val_loss: 0.6903 - val_binary_accuracy: 0.5380
Epoch 17/100
1410/1410 [==============================] - 773s 548ms/step - loss: 0.6903 - binary_accuracy: 0.5375 - val_loss: 0.6903 - val_binary_accuracy: 0.5380
Epoch 18/100
1410/1410 [==============================] - 773s 548ms/step - loss: 0.6903 - binary_accuracy: 0.5375 - val_loss: 0.6903 - val_binary_accuracy: 0.5380
Epoch 19/100
1410/1410 [==============================] - 773s 548ms/step - loss: 0.6903 - binary_accuracy: 0.5376 - val_loss: 0.6903 - val_binary_accuracy: 0.5380
Is this because there isn't enough information in my features/dataset for the network to learn, or is it a problem with the network itself? What else can I try? Please advise.