
I'm trying to fit a simple LSTM network using Keras and TensorFlow. I keep getting the error below, or variations of it, when attempting to fit. I've tried a batch size of 89 and of 1, and both return the same error, differing only in which shapes are reported as conflicting.

The shapes of my Xtrain and ytrain are:

In [9]: print(Xtrain.shape)
(6853, 89, 250)

In [10]: print(ytrain.shape)
(6853, 89, 1)
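
(For anyone trying to reproduce this: random placeholder arrays of the same shapes trigger the identical error, since the mismatch does not depend on the data values. Keras recurrent layers read the axes as (samples, timesteps, features).)

    import numpy as np

    # Placeholder data with the same shapes as mine, purely for reproduction;
    # axes are (samples, timesteps, features) / (samples, timesteps, targets).
    Xtrain = np.random.rand(6853, 89, 250).astype('float32')
    ytrain = np.random.rand(6853, 89, 1).astype('float32')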

    from keras.models import Sequential
    from keras.layers import LSTM, Dense, Dropout, TimeDistributed

    # timesteps, features and opt are defined earlier in the session
    model = Sequential()
    model.add(LSTM(units=89,
                   batch_input_shape=(Xtrain.shape[0], timesteps, len(features)),
                   activation='relu',
                   return_sequences=True,
                   stateful=True))

    model.add(Dropout(0.2))
    model.add(Dense(21))
    model.add(Dropout(0.2))
    model.add(TimeDistributed(Dense(1)))
    #model.add(Dense(1))

    model.compile(loss='mean_squared_error', optimizer=opt, metrics=['mae'])
    model.summary()
    model.fit(Xtrain, ytrain, batch_size=89, verbose=1, shuffle=False, epochs=1)

_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
lstm_5 (LSTM)                (6853, 89, 89)            121040
_________________________________________________________________
dropout_9 (Dropout)          (6853, 89, 89)            0
_________________________________________________________________
dense_9 (Dense)              (6853, 89, 21)            1890
_________________________________________________________________
dropout_10 (Dropout)         (6853, 89, 21)            0
_________________________________________________________________
time_distributed_5 (TimeDist (6853, 89, 1)             22
=================================================================
Total params: 122,952
Trainable params: 122,952
Non-trainable params: 0
_________________________________________________________________
Epoch 1/1
---------------------------------------------------------------------------
InvalidArgumentError                      Traceback (most recent call last)
<ipython-input-9-9ecc3d11a1f7> in <module>()
     16 model.compile(loss='mean_squared_error', optimizer=opt, metrics=['mae'])
     17 model.summary()
---> 18 model.fit(Xtrain,ytrain,batch_size=89,verbose=1,shuffle=False,epochs=1)

C:\Users\asus\Anaconda3\lib\site-packages\keras\engine\training.py in fit(self, x, y, batch_size, epochs, verbose, callbacks, validation_split, validation_dat
   1035                                         initial_epoch=initial_epoch,
   1036                                         steps_per_epoch=steps_per_epoch,
-> 1037                                         validation_steps=validation_steps)
   1038
   1039     def evaluate(self, x=None, y=None,

C:\Users\asus\Anaconda3\lib\site-packages\keras\engine\training_arrays.py in fit_loop(model, f, ins, out_labels, batch_size, epochs, verbose, callbacks, val_f
    197                     ins_batch[i] = ins_batch[i].toarray()
    198
--> 199                 outs = f(ins_batch)
    200                 outs = to_list(outs)
    201                 for l, o in zip(out_labels, outs):

C:\Users\asus\Anaconda3\lib\site-packages\keras\backend\tensorflow_backend.py in __call__(self, inputs)
   2664                 return self._legacy_call(inputs)
   2665
-> 2666             return self._call(inputs)
   2667         else:
   2668             if py_any(is_tensor(x) for x in inputs):

C:\Users\asus\Anaconda3\lib\site-packages\keras\backend\tensorflow_backend.py in _call(self, inputs)
   2634                                 symbol_vals,
   2635                                 session)
-> 2636         fetched = self._callable_fn(*array_vals)
   2637         return fetched[:len(self.outputs)]
   2638

C:\Users\asus\Anaconda3\lib\site-packages\tensorflow\python\client\session.py in __call__(self, *args)
   1452         else:
   1453           return tf_session.TF_DeprecatedSessionRunCallable(
-> 1454               self._session._session, self._handle, args, status, None)
   1455
   1456     def __del__(self):

C:\Users\asus\Anaconda3\lib\site-packages\tensorflow\python\framework\errors_impl.py in __exit__(self, type_arg, value_arg, traceback_arg)
    517             None, None,
    518             compat.as_text(c_api.TF_Message(self.status.status)),
--> 519             c_api.TF_GetCode(self.status.status))
    520     # Delete the underlying status object from memory otherwise it stays alive
    521     # as there is a reference to status from this from the traceback due to

InvalidArgumentError: Incompatible shapes: [6853] vs. [89]
         [[Node: training_4/Adam/gradients/loss_4/time_distributed_5_loss/mul_grad/BroadcastGradientArgs = BroadcastGradientArgs[T=DT_INT32, _class=["loc:@tra

I've reviewed a handful of other questions on GitHub and SO with similar errors, but they all involve either custom layers or parallel processing on multiple GPUs, neither of which applies in my case.

The only helpful error message I've received is that the batch_size must be a divisor of the number of training samples, which both 89 and 1 are (6853 = 89 × 77).

What am I doing wrong here?

Jed
  • You shouldn't need to enter the number of training examples in the `batch_input_shape` argument for LSTM. Generally, you are setting a lot of params to 89, which is confusing. Reading this might help you: https://stackoverflow.com/questions/44747343/keras-input-explanation-input-shape-units-batch-size-dim-etc – ame Jul 29 '18 at 09:44
  • 1
    Apologies ignore my last comment, I was thinking about `input_shape`. I do still have the feeling that `batch_input_shape` is the cause of the error; having read some more I think the first value should be equal to the batch size specified in `model.fit`. – ame Jul 29 '18 at 09:59
  • @ame thanks! I tried that and it does work. It looks like that was the issue (a corrected sketch is below). – Jed Jul 29 '18 at 10:00
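
A minimal sketch of the fix suggested in the comments (assuming `timesteps`, `features`, and `opt` are defined as in the question): the first element of `batch_input_shape` is the per-batch sample count, so it must equal the `batch_size` passed to `model.fit`, not the total number of samples. A batch size of 89 also divides the data evenly, since 6853 = 89 × 77, which `stateful=True` requires.

    from keras.models import Sequential
    from keras.layers import LSTM, Dense, Dropout, TimeDistributed

    batch_size = 89  # 6853 samples / 89 = 77 full batches per epoch

    model = Sequential()
    model.add(LSTM(units=89,
                   # per-batch sample count, not Xtrain.shape[0]
                   batch_input_shape=(batch_size, timesteps, len(features)),
                   activation='relu',
                   return_sequences=True,
                   stateful=True))
    model.add(Dropout(0.2))
    model.add(Dense(21))
    model.add(Dropout(0.2))
    model.add(TimeDistributed(Dense(1)))
    model.compile(loss='mean_squared_error', optimizer=opt, metrics=['mae'])
    model.fit(Xtrain, ytrain, batch_size=batch_size, verbose=1, shuffle=False, epochs=1)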

0 Answers