
I am learning about neural networks in Keras. I specified a simple model on made-up data.

import tensorflow as tf

model=tf.keras.models.Sequential()
# One Dense layer: 2 units (outputs), fed by 2 input features
model.add(tf.keras.layers.Dense(2, input_dim=2))
model.compile(optimizer='sgd', loss='mean_squared_error')

I have two attributes to predict two values.

Here is where I initialize my data:

import random
import numpy as np

x=[]
y=[]
for x1 in range(6):
    x2=int(random.random()*10)  # random integer in 0..9
    x.append([x1,x2])
    y.append([2*x1+x2**2-2, x1*x2])
# Both arrays already have shape (6, 2): six samples, two values each
xs = np.array(x, dtype=float)
ys = np.array(y, dtype=float)
model.fit(xs, ys, epochs=500)

Mind you, I use this data solely for the purpose of learning. Afterwards I tried to inspect the model by running model.summary() and model.get_weights().

Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
dense (Dense)                (None, 2)                 6         
=================================================================
Total params: 6
Trainable params: 6
Non-trainable params: 0
_________________________________________________________________
None
model weights  [array([[0.5137405, 5.477211 ],
       [8.750836 , 1.6910588]], dtype=float32), array([-5.701193, -7.874653], dtype=float32)]

I don't understand why there are 6 params and six weights. From my understanding there should be two going from each input. Or should I have explicitly defined the output layer somewhere?

Borut Flis
  • Aren't these values just the [biases](https://stackoverflow.com/questions/42053170/how-can-i-get-biases-from-a-trained-model-in-keras)? – moosehead42 Jul 15 '20 at 11:24
  • Does this answer your question? [How to calculate the number of parameters for convolutional neural network?](https://stackoverflow.com/questions/42786717/how-to-calculate-the-number-of-parameters-for-convolutional-neural-network) – Narendra Prasath Jul 15 '20 at 11:28

2 Answers


The model architecture you have defined is pictorially shown below

[Image: diagram of the model architecture]

You have one dense layer with two neurons. Why two neurons? Because the first parameter to Dense is units, which denotes the number of neurons. Each neuron performs the linear operation X.W + b and then applies an activation function if one is specified (none is specified here, so the output stays linear). The learnable parameters in a neuron are W and b.

Since X has 2 features, each neuron has 2 weights in W plus 1 bias b, which makes 3 parameters per neuron. With 2 such neurons, the layer has 2 × 3 = 6 parameters.
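The arithmetic above can be sketched as a small helper. The function name dense_param_count is made up for illustration and is not part of Keras; it only assumes the standard counting rule for a fully connected layer (one weight per input per neuron, plus one bias per neuron).

```python
def dense_param_count(input_dim, units, use_bias=True):
    """Count trainable parameters of a fully connected layer.

    Hypothetical helper for illustration, not a Keras function.
    Each of the `units` neurons has `input_dim` weights and,
    if `use_bias` is True, one bias.
    """
    weights = input_dim * units
    biases = units if use_bias else 0
    return weights + biases


# The layer from the question: 2 input features, 2 neurons
print(dense_param_count(2, 2))                  # 4 weights + 2 biases
print(dense_param_count(2, 2, use_bias=False))  # weights only
```

Without the bias term the count drops to 4, which matches the "two going from each input" that the question expected.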

mujjiga

You have a single output layer with two neurons. Each of these neurons must have two weights (since the inputs are of dimension 2) plus one more parameter called the "bias". So each neuron has 3 trainable parameters.

In summary, you have 2 neurons and each one has 3 weights or trainable parameters, so in total there are 6 trainable parameters in your network.
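To connect this to the get_weights() output in the question, here is a rough sketch in plain Python. The numbers are copied from that output; the function name dense_forward and the sample input [1.0, 2.0] are made up for illustration. The first array is the 2×2 weight matrix (rows are inputs, columns are neurons) and the second array holds one bias per neuron.

```python
# Values copied from the model.get_weights() output in the question.
kernel = [[0.5137405, 5.477211],
          [8.750836, 1.6910588]]   # shape (inputs, neurons) = (2, 2)
bias = [-5.701193, -7.874653]      # one bias per neuron


def dense_forward(x, kernel, bias):
    """Linear dense layer: output[j] = sum_i x[i] * kernel[i][j] + bias[j].

    Illustrative helper, not a Keras function.
    """
    return [
        sum(x[i] * kernel[i][j] for i in range(len(x))) + bias[j]
        for j in range(len(bias))
    ]


# 4 kernel entries + 2 bias entries = the 6 params in model.summary()
n_params = sum(len(row) for row in kernel) + len(bias)
print(n_params)

# Hypothetical input with x1 = 1.0, x2 = 2.0
output = dense_forward([1.0, 2.0], kernel, bias)
print(output)
```

Column j of the kernel together with bias[j] makes up the 3 parameters of neuron j.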

Kloster Matias