I'm trying to fit a simple LSTM network using Keras and TensorFlow. I keep getting the error below, or variations of it, when calling fit. I've tried batch sizes of 89 and 1; both raise the same error, differing only in which shapes are reported as conflicting.
The shapes of my Xtrain and ytrain are:
In [9]: print(Xtrain.shape)
(6853, 89, 250)
In [10]: print(ytrain.shape)
(6853, 89, 1)
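(For anyone wanting to reproduce this, equivalent dummy arrays, random values, just matching the shapes, would be:)

import numpy as np
Xtrain = np.random.random((6853, 89, 250))  # (samples, timesteps, features)
ytrain = np.random.random((6853, 89, 1))    # (samples, timesteps, 1 target)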
In [11]: model = Sequential()
    ...: model.add(LSTM(units=89,
    ...:     batch_input_shape=(Xtrain.shape[0], timesteps, len(features)),
    ...:     activation='relu',
    ...:     return_sequences=True,
    ...:     stateful=True))
    ...:
    ...: model.add(Dropout(0.2))
    ...:
    ...: model.add(Dense(21))
    ...:
    ...: model.add(Dropout(0.2))
    ...:
    ...: model.add(TimeDistributed(Dense(1)))
    ...: #model.add(Dense(1))
    ...: model.compile(loss='mean_squared_error', optimizer=opt, metrics=['mae'])
    ...: model.summary()
    ...: model.fit(Xtrain, ytrain, batch_size=89, verbose=1, shuffle=False, epochs=1)
    ...:
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
lstm_5 (LSTM)                (6853, 89, 89)            121040
_________________________________________________________________
dropout_9 (Dropout)          (6853, 89, 89)            0
_________________________________________________________________
dense_9 (Dense)              (6853, 89, 21)            1890
_________________________________________________________________
dropout_10 (Dropout)         (6853, 89, 21)            0
_________________________________________________________________
time_distributed_5 (TimeDist (6853, 89, 1)             22
=================================================================
Total params: 122,952
Trainable params: 122,952
Non-trainable params: 0
_________________________________________________________________
Epoch 1/1
---------------------------------------------------------------------------
InvalidArgumentError Traceback (most recent call last)
<ipython-input-9-9ecc3d11a1f7> in <module>()
16 model.compile(loss='mean_squared_error', optimizer=opt, metrics=['mae'])
17 model.summary()
---> 18 model.fit(Xtrain,ytrain,batch_size=89,verbose=1,shuffle=False,epochs=1)
C:\Users\asus\Anaconda3\lib\site-packages\keras\engine\training.py in fit(self, x, y, batch_size, epochs, verbose, callbacks, validation_split, validation_dat
1035 initial_epoch=initial_epoch,
1036 steps_per_epoch=steps_per_epoch,
-> 1037 validation_steps=validation_steps)
1038
1039 def evaluate(self, x=None, y=None,
C:\Users\asus\Anaconda3\lib\site-packages\keras\engine\training_arrays.py in fit_loop(model, f, ins, out_labels, batch_size, epochs, verbose, callbacks, val_f
197 ins_batch[i] = ins_batch[i].toarray()
198
--> 199 outs = f(ins_batch)
200 outs = to_list(outs)
201 for l, o in zip(out_labels, outs):
C:\Users\asus\Anaconda3\lib\site-packages\keras\backend\tensorflow_backend.py in __call__(self, inputs)
2664 return self._legacy_call(inputs)
2665
-> 2666 return self._call(inputs)
2667 else:
2668 if py_any(is_tensor(x) for x in inputs):
C:\Users\asus\Anaconda3\lib\site-packages\keras\backend\tensorflow_backend.py in _call(self, inputs)
2634 symbol_vals,
2635 session)
-> 2636 fetched = self._callable_fn(*array_vals)
2637 return fetched[:len(self.outputs)]
2638
C:\Users\asus\Anaconda3\lib\site-packages\tensorflow\python\client\session.py in __call__(self, *args)
1452 else:
1453 return tf_session.TF_DeprecatedSessionRunCallable(
-> 1454 self._session._session, self._handle, args, status, None)
1455
1456 def __del__(self):
C:\Users\asus\Anaconda3\lib\site-packages\tensorflow\python\framework\errors_impl.py in __exit__(self, type_arg, value_arg, traceback_arg)
517 None, None,
518 compat.as_text(c_api.TF_Message(self.status.status)),
--> 519 c_api.TF_GetCode(self.status.status))
520 # Delete the underlying status object from memory otherwise it stays alive
521 # as there is a reference to status from this from the traceback due to
InvalidArgumentError: Incompatible shapes: [6853] vs. [89]
[[Node: training_4/Adam/gradients/loss_4/time_distributed_5_loss/mul_grad/BroadcastGradientArgs = BroadcastGradientArgs[T=DT_INT32, _class=["loc:@tra
I've reviewed a handful of other questions on GitHub and Stack Overflow with similar errors, but they all involve either custom layers or parallel training across multiple GPUs, neither of which applies in my case.
The only helpful error message I've received is that the batch_size must be specified when a layer is stateful.
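For reference, here is a stripped-down, self-contained version of the above that reproduces the error for me with the dummy arrays. Since opt, timesteps, and features aren't shown above, I've substituted a plain Adam optimizer and filled in 89 timesteps and 250 features to match the shapes:

import numpy as np
from keras.models import Sequential
from keras.layers import LSTM, Dense, Dropout, TimeDistributed
from keras.optimizers import Adam

# Dummy data with the same shapes as my real Xtrain / ytrain
Xtrain = np.random.random((6853, 89, 250))
ytrain = np.random.random((6853, 89, 1))

timesteps = 89
n_features = 250  # stand-in for len(features)
opt = Adam()      # stand-in for the optimizer not shown above

model = Sequential()
model.add(LSTM(units=89,
               # first dimension is set to the full number of samples
               batch_input_shape=(Xtrain.shape[0], timesteps, n_features),
               activation='relu',
               return_sequences=True,
               stateful=True))
model.add(Dropout(0.2))
model.add(Dense(21))
model.add(Dropout(0.2))
model.add(TimeDistributed(Dense(1)))
model.compile(loss='mean_squared_error', optimizer=opt, metrics=['mae'])
model.fit(Xtrain, ytrain, batch_size=89, verbose=1, shuffle=False, epochs=1)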
What am I doing wrong here?