LSTM not learning, no MSE

Question

Hi, I am having trouble finding the correct input shape for my LSTM model. I have been trying to find a shape that fits, but I have trouble understanding what is required.

I think the problem is in the ytest and ytrain shape. Why is it not the same shape as xtrain and xtest?

xtrain (80304, 37)
xtest (39538, 37)
ytrain (80304,)
ytest (39538,)
Epoch 1/3
2510/2510 [==============================] - 34s 13ms/step - loss: nan
Epoch 2/3
2510/2510 [==============================] - 32s 13ms/step - loss: nan
Epoch 3/3
2510/2510 [==============================] - 33s 13ms/step - loss: nan
Model: "sequential_9"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
lstm_10 (LSTM)               (None, 4)                 96        
_________________________________________________________________
dense_9 (Dense)              (None, 1)                 5         
=================================================================
Total params: 101
Trainable params: 101
Non-trainable params: 0

The model isn't training: the MSE loss is nan in every epoch.

When I try to fit this model:

import numpy as np
import tensorflow as tf

print('tf version', tf.version.VERSION)

train_size = int(len(oral_ds) * 0.67)
print(train_size)
test_size = len(oral_ds) - train_size
print(test_size)
train = oral_ds[:train_size]
test = oral_ds[80319:119881]

print(len(train), len(test))

X_train = train.drop(columns=['PRICE','WEEK_END_DATE','Optimized rev','Original rev'])
y_train = train.PRICE

X_test = test.drop(columns=['PRICE','WEEK_END_DATE','Optimized rev','Original rev'])
y_test = test.PRICE

print('xtrain',np.shape(X_train))
print('xtest',np.shape(X_test))
print('ytrain',np.shape(y_train))
print('ytest',np.shape(y_test))


X_train=X_train.values.reshape(X_train.shape[0],X_train.shape[1],1)
#y_train=y_train.values.reshape(y_train.shape[0],y_train.shape[1],1)
X_test=X_test.values.reshape(X_test.shape[0],X_test.shape[1],1)
#y_test=y_test.values.reshape(y_test.shape[0],y_test.shape[1],1)

#print('reshaped xtrain',np.shape(X_train))
#print('reshaped xtest',np.shape(X_test))
#print('reshaped ytrain',np.shape(y_train))
#print('reshaped ytest',np.shape(y_test))


single_step_model = tf.keras.models.Sequential()
single_step_model.add(tf.keras.layers.LSTM(4,
                                            input_shape=(37,1)))
single_step_model.add(tf.keras.layers.Dense(units = 1))
single_step_model.compile(optimizer = 'adam', loss = 'mean_squared_error')



BATCH_SIZE=32
train_data = tf.data.Dataset.from_tensor_slices((X_train, y_train))
train_data = train_data.cache().shuffle(10000).batch(BATCH_SIZE)
valid_data = tf.data.Dataset.from_tensor_slices((X_test, y_test))

history = single_step_model.fit(train_data, epochs=3)
single_step_model.summary()

I have tried to implement solutions from other posts such as:

But neither of these is working.

Any guidance?

Follow this link for your problem. Clearly explained here. https://stackoverflow.com/a/54416792/12598386 — Lakpa Tamang, Jan 14 '21 at 08:50
Thank you Tamang, I have tried this and it doesn't seem to be working. I have updated my code to show the same error message — AnnejetLouise, Jan 14 '21 at 15:42

berkay · Answer 1 · 2021-01-14T18:06:12.047


In general, an LSTM expects 3-dimensional inputs. In Keras the layout is:

 (batch_size, timesteps, features)

The order of the timesteps and features axes changes depending on the platform. I guess you have 37 timesteps and 1 feature, so just change your input to:

 (batch_size, 37, 1), or (batch_size, 1, 37) if you treat each row as a single timestep with 37 features

See the following dummy example:

import tensorflow as tf

inputs = tf.random.normal([100, 37, 1])  # (batch_size, timesteps, features)
lstm = tf.keras.layers.LSTM(units=50, input_shape=(37, 1))
output = lstm(inputs)  # only the last hidden state is returned
print(output.shape)  # (100, 50)
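To get from a 2-D feature table like the one in the question to either 3-D layout, a plain reshape is enough. The snippet below is only a sketch: the array is a small hypothetical stand-in (8 rows instead of 80304) and the variable names are made up for illustration. It also shows that the targets keep one value per sample, which is why y has shape (n_samples,) and is not reshaped like X.

```python
import numpy as np

# Hypothetical stand-in for the question's feature table: 8 rows, 37 columns.
n_samples, n_columns = 8, 37
X_2d = np.arange(n_samples * n_columns, dtype=np.float32).reshape(n_samples, n_columns)

# Option A: treat the 37 columns as 37 timesteps with 1 feature each.
X_steps = X_2d.reshape(n_samples, n_columns, 1)

# Option B: treat each row as a single timestep with 37 features.
X_feats = X_2d.reshape(n_samples, 1, n_columns)

# The targets stay one value per sample, so y keeps shape (n_samples,).
y = np.zeros(n_samples, dtype=np.float32)

print(X_steps.shape, X_feats.shape, y.shape)
```

With option A the matching layer argument is input_shape=(37, 1); with option B it is input_shape=(1, 37). In both cases the batch dimension is left out of input_shape.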

The following code runs end-to-end:

import numpy as np
import tensorflow as tf

# Dummy data in the same layout as the question: 37 timesteps, 1 feature
X_train = np.random.rand(80319, 37, 1)
# randint's upper bound is exclusive, so (0, 2) yields both 0 and 1
y_train = np.random.randint(0, 2, 80319)

BATCH_SIZE = 32
train_data = tf.data.Dataset.from_tensor_slices((X_train, y_train))
# .batch() adds the leading batch dimension that the LSTM layer expects
train_data = train_data.cache().shuffle(10000).batch(BATCH_SIZE)

single_step_model = tf.keras.models.Sequential()
single_step_model.add(tf.keras.layers.LSTM(4, input_shape=(37, 1)))
single_step_model.add(tf.keras.layers.Dense(units=1))

single_step_model.compile(optimizer='adam', loss='mean_squared_error')
single_step_model.fit(train_data, epochs=10)

If you remove the .batch(BATCH_SIZE) call, it throws exactly the same error.
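One way to see why batching matters is to compare what the dataset yields with and without it. This is a small sketch on hypothetical dummy data (64 samples, made-up variable names), not the data from the question: an unbatched dataset hands out single samples of shape (37, 1), while a batched one adds the leading dimension that gives the 3-D input the LSTM layer needs.

```python
import numpy as np
import tensorflow as tf

# Hypothetical dummy data: 64 samples, 37 timesteps, 1 feature.
X = np.random.rand(64, 37, 1).astype("float32")
y = np.random.rand(64).astype("float32")

unbatched = tf.data.Dataset.from_tensor_slices((X, y))
batched = unbatched.batch(32)

# element_spec describes one element of the dataset as (features, target).
x_spec_unbatched = unbatched.element_spec[0]
x_spec_batched = batched.element_spec[0]

print(x_spec_unbatched.shape)  # shape of a single sample, no batch axis
print(x_spec_batched.shape)    # same shape with a leading batch axis
```

Passing NumPy arrays directly to fit(X_train, y_train, ...) avoids the issue as well, because Keras then batches the arrays itself.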

edited Jan 14 '21 at 18:06

answered Jan 13 '21 at 10:19

berkay


Thank you Berkay, I have tried this, but I still get an error message. To be clear, I have tried each of these: regressor.add(LSTM(units = 50, return_sequences = True, input_shape = (1000, 1, 37))); regressor.add(LSTM(units = 50, return_sequences = True, input_shape = (1000, 37, 1))); regressor.add(LSTM(units = 50, return_sequences = True, input_shape = (1, 37))); regressor.add(LSTM(units = 50, return_sequences = True, input_shape = (37, 1))). The error is: ```ValueError: Input 0 of layer lstm_13 is incompatible with the layer: expected ndim=3, found ndim=4. Full shape received: [None, 1000, 1, 37] ``` – AnnejetLouise Jan 13 '21 at 19:49
Please add your model summary and TensorFlow version – berkay Jan 14 '21 at 17:27
tf version = 2.2, model summary updated in question. Thank you Berkay! – AnnejetLouise Jan 14 '21 at 17:44
For the first two, you don't need to specify the batch size (I assume 1000 is your batch size). The last two should actually work, if you are wrangling your input correctly. – berkay Jan 14 '21 at 17:44
I'm still getting the same error message. Is it correct that I'm not supposed to reshape the y data? I am also not sure what you mean by wrangling. Do you mean I am not splitting the data correctly? – AnnejetLouise Jan 14 '21 at 17:53
It's working! The mistake was in using these two lines: ```train_data = tf.data.Dataset.from_tensor_slices((X_train, y_train)) valid_data = tf.data.Dataset.from_tensor_slices((X_test, y_test))``` I removed those and just used ```regressor.fit(X_train, y_train, epochs=10)``` – AnnejetLouise Jan 14 '21 at 18:05
@AnnejetLouise I have added a fully working example and showed what is the problem – berkay Jan 14 '21 at 18:07
You will want to train it with batches so adapt my answer instead of that approach – berkay Jan 14 '21 at 18:07
@AnnejetLouise please close the question by selecting the answer – berkay Jan 14 '21 at 18:33
Sorry @berkay, but the model still is not running properly. I have updated the code above. For some reason it is not calculating the mse loss and therefore the model is not improving or learning. – AnnejetLouise Jan 16 '21 at 17:31
I mean ask another question instead of changing the question – berkay Jan 17 '21 at 11:32
