I have run a Keras LSTM demo containing the following code (after line 166):
m = 1
model = Sequential()
dim_in = m
dim_out = m
nb_units = 10
model.add(LSTM(input_shape=(None, dim_in),
               return_sequences=True,
               units=nb_units))
model.add(TimeDistributed(Dense(activation='linear', units=dim_out)))
model.compile(loss='mse', optimizer='rmsprop')
When I add a call to model.summary(), I see the following output:
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
lstm_4 (LSTM)                (None, None, 10)          480
_________________________________________________________________
time_distributed_4 (TimeDist (None, None, 1)           11
=================================================================
Total params: 491
Trainable params: 491
Non-trainable params: 0
I understand that the 11 parameters of the time-distributed layer simply consist of nb_units weights plus one bias value.
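To spell out that count, here is a minimal sketch in plain Python. The helper name dense_param_count is my own and not part of Keras; it assumes a standard fully connected layer with one bias per output.

```python
def dense_param_count(input_size, output_size):
    # One weight per (input, output) pair, plus one bias per output.
    return input_size * output_size + output_size

nb_units = 10  # width of the LSTM output that feeds the Dense layer
dim_out = 1    # width of the Dense output

print(dense_param_count(nb_units, dim_out))  # 11
```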
Now for the LSTM layer. These answers say:
params = 4 * ((input_size + 1) * output_size + output_size^2)
In my case, with input_size = 1 and output_size = 1, this yields only 12 parameters for each of the 10 units, totaling 120 parameters. Compared to the reported 480, this is off by a factor of 4. Where is my error?
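To make the arithmetic reproducible, here is a minimal sketch of the quoted formula as a plain Python helper. The name lstm_param_count is my own and not part of Keras, and the last call rests on an assumption: that output_size in the formula may refer to the number of units rather than to dim_out.

```python
def lstm_param_count(input_size, output_size):
    # Four gate blocks, each with input weights, one bias per output,
    # and recurrent weights of shape output_size x output_size.
    return 4 * ((input_size + 1) * output_size + output_size ** 2)

# My reading: output_size = 1, then multiplied by the 10 units.
per_unit = lstm_param_count(input_size=1, output_size=1)
print(per_unit)       # 12
print(10 * per_unit)  # 120, versus the 480 reported by model.summary()

# Alternative reading (assumption): output_size = nb_units = 10.
print(lstm_param_count(input_size=1, output_size=10))  # 480
```

The second reading reproduces the 480 shown in the summary, so the discrepancy may come down to what output_size denotes in the formula.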