
I have a SimpleRNN like:

from keras.models import Sequential
from keras.layers import SimpleRNN, Dense

model = Sequential()
model.add(SimpleRNN(10, input_shape=(3, 1)))
model.add(Dense(1, activation="linear"))

The model summary says:

simple_rnn_1 (SimpleRNN)   (None, 10)   120       

I am curious about the parameter number 120 for simple_rnn_1.

Could someone answer my question?

MBT
youngtackpark

3 Answers


When you look at the header of the table, you see the column title Param:

Layer (type)              Output Shape   Param 
===============================================
simple_rnn_1 (SimpleRNN)   (None, 10)    120   

This number represents the number of trainable parameters (weights and biases) in the respective layer, in this case your SimpleRNN.

Edit:

The formula for calculating the weights is as follows:

recurrent_weights + input_weights + biases

i.e. (num_features + num_units) * num_units + num_units

Explanation:

num_units: the number of units in the RNN

num_features: the number of features of your input

Now you have two things happening in your RNN.

First, you have the recurrent loop, where the state is fed back into the model to generate the next step. The weights for the recurrent step are:

recurrent_weights = num_units*num_units

Second, you have the new input of your sequence at each step.

input_weights = num_features*num_units

(Usually the last RNN state and the new input are concatenated and then multiplied with one single weight matrix; nevertheless, the input and the last RNN state use different weights.)

So now we have the weights; what's missing are the biases: one bias for every unit:

biases = num_units*1

So finally we have the formula:

recurrent_weights + input_weights + biases

or

num_units * num_units + num_features * num_units + num_units

=

(num_features + num_units) * num_units + num_units

In your case, this means the trainable parameters are:

10*10 + 1*10 + 10 = 120
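To double-check the arithmetic, here is a minimal sketch of the formula above (the function name is my own, not part of the Keras API):

```python
def simple_rnn_params(num_features, num_units):
    recurrent_weights = num_units * num_units  # state-to-state weights
    input_weights = num_features * num_units   # input-to-state weights
    biases = num_units                         # one bias per unit
    return recurrent_weights + input_weights + biases

# The setup from the question: 1 input feature, 10 units
print(simple_rnn_params(num_features=1, num_units=10))  # 120
```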

I hope this is understandable, if not just tell me - so I can edit it to make it more clear.

MBT
  • My question was how the number 120 comes about. Thanks – youngtackpark May 02 '18 at 12:33
  • Hi, I have one more question. Is it a many-to-one type? I mean, we use model.add(Dense(1)) after SimpleRNN(10, input_shape=(3, 1)). Does this mean that after looking at the third data point, the model generates one output using Dense(1) from the 10 outputs of the SimpleRNN? – youngtackpark May 03 '18 at 01:20
  • This is a bit off-topic, but yes, it's many-to-one. But 10 units does not mean it produces a sequence of length 10; this is just the dimensionality - the higher, the more information you can store, but it's still only one output. If you want to use many-to-many you need to set `return_sequences=True` in the RNN. The number of `Dense` units in the last layer is usually considered the number of classes, not outputs of a sequence. If you want to know more about `return_sequences`, here is a blog post about it: https://machinelearningmastery.com/return-sequences-and-return-states-for-lstms-in-keras/ – MBT May 03 '18 at 07:44
  • @MBT Can you please explain more. I could not understand. – Naveen Gabriel Dec 19 '19 at 16:15
  • Both the recurrent term and the input term have independent biases respectively, so there are two biases in total. Hence, the number of parameters is 10*10 + 1*10 + 10 + 10 = 130. Am I right? – BioCoder Jul 14 '20 at 12:18

It might be easier to understand visually with a simple network like this:

[figure: a simple RNN with 4 units and 3 input dimensions, showing recurrent and input connections]

The number of weights is 16 (4 * 4) + 12 (3 * 4) = 28 and the number of biases is 4.

where 4 is the number of units and 3 is the number of input dimensions, so the formula is just like in the first answer: num_units ^ 2 + num_units * input_dim + num_units or simply num_units * (num_units + input_dim + 1), which yields 10 * (10 + 1 + 1) = 120 for the parameters given in the question.
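That factored form is easy to check in a couple of lines (the function name is mine, for illustration only):

```python
def rnn_param_count(num_units, input_dim):
    # num_units * (num_units + input_dim + 1): all weights plus one bias per unit
    return num_units * (num_units + input_dim + 1)

print(rnn_param_count(4, 3))   # 32 = 28 weights + 4 biases
print(rnn_param_count(10, 1))  # 120, matching the question
```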

tromgy
  • I do not understand this graphical representation. – Naveen Gabriel Dec 19 '19 at 16:09
  • You might also look at a similar answer here: https://stackoverflow.com/questions/38080035/how-to-calculate-the-number-of-parameters-of-an-lstm-network/56614978#56614978, hopefully it will make it easier to understand. – tromgy Dec 20 '19 at 17:51

I visualized the SimpleRNN you added; I think the figure can explain a lot.

SimpleRNN layer (I'm a newbie here and can't post images directly, so you need to click the link).

From the unrolled version of the SimpleRNN layer, it can be seen as a dense layer, where the previous layer is a concatenation of the input and the layer's own output from the previous step.

So the number of parameters of SimpleRNN can be computed as a dense layer:

num_para = units_pre * units + num_bias

where:

units_pre is the sum of the input neurons (1 in your settings) and units (see below),

units is the number of neurons (10 in your settings) in the current layer,

num_bias is the number of bias terms in the current layer, which is the same as units.

Plugging in your settings, we get num_para = (1 + 10) * 10 + 10 = 120.
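The dense-layer view above can be sketched in a few lines (names are mine, chosen to match the formula, not Keras API):

```python
def dense_view_params(input_neurons, units):
    units_pre = input_neurons + units  # concatenated input + previous state
    num_bias = units                   # one bias per unit in the current layer
    return units_pre * units + num_bias

print(dense_view_params(1, 10))  # 120
```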

chendong