I have been trying to build an LSTM RNN with TensorFlow Keras to predict whether someone is driving or not driving (binary classification) based only on Datetime and lat/long. However, when I train the network, the loss and val_loss barely change.
I converted lat/long into x, y, z coordinates that lie between -1 and 1. I also used the Datetime to extract whether it is a weekend or not and which period of the day it is (morning/afternoon/evening).
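For reference, the conversion looks roughly like this (a sketch: the raw column names, the 'datetime' index level, and the period-of-day cut points here are placeholders, not my exact code):

import numpy as np
import pandas as pd

def add_features(df, lat_col='lat', lon_col='long'):
    # Project lat/long onto the unit sphere so x, y, z each lie in [-1, 1]
    lat = np.radians(df[lat_col])
    lon = np.radians(df[lon_col])
    df['x'] = np.cos(lat) * np.cos(lon)
    df['y'] = np.cos(lat) * np.sin(lon)
    df['z'] = np.sin(lat)
    # Datetime-derived features: weekday flag and a coarse period of day
    dt = df.index.get_level_values('datetime')
    df['weekday'] = (dt.dayofweek < 5).astype(int)  # 1 Mon-Fri, 0 on weekends
    df['period_of_day'] = pd.cut(dt.hour, bins=[0, 12, 18, 24],
                                 labels=[0, 1, 2], right=False).astype(int)
    return df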
Here is a sample of the data:
                        trip_id  weekday  period_of_day         x         y         z  mode_cat
datetime            id
2011-08-27 06:13:01 20        1        0              2  0.650429  0.043524  0.758319         1
2011-08-27 06:13:02 20        1        0              2  0.650418  0.043487  0.758330         1
2011-08-27 06:13:03 20        1        0              2  0.650421  0.043490  0.758328         1
2011-08-27 06:13:04 20        1        0              2  0.650427  0.043506  0.758322         1
2011-08-27 06:13:05 20        1        0              2  0.650438  0.043516  0.758312         1
And here is the code for building the network:
import tensorflow as tf

single_step_model = tf.keras.models.Sequential()
# Input shape is (timesteps, features), taken from the windowed training data
single_step_model.add(tf.keras.layers.LSTM(512, return_sequences=True,
                                           input_shape=x_train_single.shape[-2:]))
single_step_model.add(tf.keras.layers.Dropout(0.4))
single_step_model.add(tf.keras.layers.Dense(128, activation='tanh'))
# Sigmoid output for the binary driving / not-driving label
single_step_model.add(tf.keras.layers.Dense(1, activation='sigmoid'))
opt = tf.keras.optimizers.Adam(learning_rate=0.0001)
single_step_model.compile(optimizer=opt, loss='binary_crossentropy',
                          metrics=['accuracy'])
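For context, x_train_single is windowed so that each sample has shape (timesteps, features), which is what input_shape=x_train_single.shape[-2:] picks up. The slicing looks roughly like this (the function name and window length are illustrative):

import numpy as np

def make_windows(features, labels, history=60):
    # Slide a window of `history` rows over the feature matrix; each window
    # is paired with the label of the step that follows it
    xs, ys = [], []
    for i in range(len(features) - history):
        xs.append(features[i:i + history])
        ys.append(labels[i + history])
    return np.array(xs), np.array(ys)

(Note that with return_sequences=True the Dense layers are applied at every timestep, so the model emits one sigmoid output per step of the window.)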
I have tried all kinds of learning rates, batch sizes, epoch counts, dropout rates, numbers of hidden layers, and numbers of units, and they all run into this same problem.
I have also taken a look at my data and noticed that the binary accuracy on both the training and validation sets matches the fraction of rows labeled as driving in each set (~53.8%), while the loss has plateaued at the entropy of that label split: -(0.538 ln 0.538 + 0.462 ln 0.462) ≈ 0.6903. This means that my network is always predicting the same outcome.
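A quick way to confirm that (a sketch; x_val_single and y_val_single stand in for my validation arrays):

import numpy as np

preds = single_step_model.predict(x_val_single)
print(preds.min(), preds.max())  # a tiny range means a near-constant output
print(np.round(preds).mean())    # 0.0 or 1.0 when only one class is predicted
print(y_val_single.mean())       # class balance, ~0.538 here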
Here is the training and validation loss and accuracy per epoch:
Epoch 1/100
1410/1410 [==============================] - 775s 550ms/step - loss: 0.6942 - binary_accuracy: 0.5273 - val_loss: 0.6909 - val_binary_accuracy: 0.5380
Epoch 2/100
1410/1410 [==============================] - 774s 549ms/step - loss: 0.6911 - binary_accuracy: 0.5352 - val_loss: 0.6904 - val_binary_accuracy: 0.5380
Epoch 3/100
1410/1410 [==============================] - 775s 549ms/step - loss: 0.6906 - binary_accuracy: 0.5374 - val_loss: 0.6903 - val_binary_accuracy: 0.5380
Epoch 4/100
1410/1410 [==============================] - 774s 549ms/step - loss: 0.6905 - binary_accuracy: 0.5375 - val_loss: 0.6903 - val_binary_accuracy: 0.5380
Epoch 5/100
1410/1410 [==============================] - 774s 549ms/step - loss: 0.6904 - binary_accuracy: 0.5376 - val_loss: 0.6903 - val_binary_accuracy: 0.5380
Epoch 6/100
1410/1410 [==============================] - 774s 549ms/step - loss: 0.6904 - binary_accuracy: 0.5375 - val_loss: 0.6904 - val_binary_accuracy: 0.5380
Epoch 7/100
1410/1410 [==============================] - 774s 549ms/step - loss: 0.6904 - binary_accuracy: 0.5376 - val_loss: 0.6903 - val_binary_accuracy: 0.5380
Epoch 8/100
1410/1410 [==============================] - 775s 549ms/step - loss: 0.6904 - binary_accuracy: 0.5375 - val_loss: 0.6903 - val_binary_accuracy: 0.5380
Epoch 9/100
1410/1410 [==============================] - 774s 549ms/step - loss: 0.6904 - binary_accuracy: 0.5375 - val_loss: 0.6903 - val_binary_accuracy: 0.5380
Epoch 10/100
1410/1410 [==============================] - 775s 549ms/step - loss: 0.6903 - binary_accuracy: 0.5376 - val_loss: 0.6903 - val_binary_accuracy: 0.5380
Epoch 11/100
1410/1410 [==============================] - 775s 550ms/step - loss: 0.6903 - binary_accuracy: 0.5375 - val_loss: 0.6903 - val_binary_accuracy: 0.5380
Epoch 12/100
1410/1410 [==============================] - 775s 549ms/step - loss: 0.6903 - binary_accuracy: 0.5375 - val_loss: 0.6903 - val_binary_accuracy: 0.5380
Epoch 13/100
1410/1410 [==============================] - 774s 549ms/step - loss: 0.6903 - binary_accuracy: 0.5377 - val_loss: 0.6903 - val_binary_accuracy: 0.5380
Epoch 14/100
1410/1410 [==============================] - 773s 548ms/step - loss: 0.6904 - binary_accuracy: 0.5374 - val_loss: 0.6903 - val_binary_accuracy: 0.5379
Epoch 15/100
1410/1410 [==============================] - 774s 549ms/step - loss: 0.6903 - binary_accuracy: 0.5377 - val_loss: 0.6903 - val_binary_accuracy: 0.5380
Epoch 16/100
1410/1410 [==============================] - 774s 549ms/step - loss: 0.6904 - binary_accuracy: 0.5375 - val_loss: 0.6903 - val_binary_accuracy: 0.5380
Epoch 17/100
1410/1410 [==============================] - 773s 548ms/step - loss: 0.6903 - binary_accuracy: 0.5375 - val_loss: 0.6903 - val_binary_accuracy: 0.5380
Epoch 18/100
1410/1410 [==============================] - 773s 548ms/step - loss: 0.6903 - binary_accuracy: 0.5375 - val_loss: 0.6903 - val_binary_accuracy: 0.5380
Epoch 19/100
1410/1410 [==============================] - 773s 548ms/step - loss: 0.6903 - binary_accuracy: 0.5376 - val_loss: 0.6903 - val_binary_accuracy: 0.5380
Is this because there isn't enough information in my features/dataset for the network to learn, or is it a problem with the network itself? What else can I try? Please advise.