    import numpy as np
    from keras.models import Sequential
    from keras.layers import Dense, Activation

    # X has shape (num_rows, num_cols), where the training data are stored
    # as row vectors
    X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=np.float32)

    # y must have an output vector for each input vector
    y = np.array([[0], [0], [0], [1]], dtype=np.float32)

    # Create the Sequential model
    model = Sequential()

    # 1st Layer - Add an input layer of 32 nodes with the same input shape as
    # the training samples in X
    model.add(Dense(32, input_dim=X.shape[1]))

    # Add a softmax activation layer
    model.add(Activation('softmax'))

    # 2nd Layer - Add a fully connected output layer
    model.add(Dense(1))

    # Add a sigmoid activation layer
    model.add(Activation('sigmoid'))

I am new to Keras and am trying to understand it.

In model.add(Dense(32, input_dim=X.shape[1])), the 32 seems to mean that for each training instance there are 32 input variables, whose dimension is given by input_dim. But in the input array X,

array([[0., 0.],
       [0., 1.],
       [1., 0.],
       [1., 1.]], dtype=float32)

there are 4 training instances, and it looks like each example has only two input variables. So how does this correspond to the '32' in the Dense layer definition? And what does this network look like?
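For reference, a Dense layer computes out = X @ W + b, where the weight matrix W has shape (input_dim, units); the 32 is the number of output units, not the number of inputs. A minimal NumPy sketch (the random weights are purely illustrative, not trained values):

```python
import numpy as np

# The four training instances, each with 2 input variables
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=np.float32)

# A Dense(32) layer with input_dim=2 owns a (2, 32) weight matrix
# and a bias vector of 32 entries -- one per output unit.
rng = np.random.default_rng(0)
W = rng.standard_normal((2, 32)).astype(np.float32)
b = np.zeros(32, dtype=np.float32)

out = X @ W + b
print(out.shape)  # (4, 32): 4 instances, 32 units each
```

So every 2-variable example is mapped to a 32-dimensional output; the layer width is independent of the number of input variables.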

– user697911 (edited by nick)

2 Answers


If you try

model.summary()

you will get the answer to your last question.

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
dense_1 (Dense)              (None, 32)                96        
_________________________________________________________________
activation_1 (Activation)    (None, 32)                0         
_________________________________________________________________
dense_2 (Dense)              (None, 1)                 33        
_________________________________________________________________
activation_2 (Activation)    (None, 1)                 0         
=================================================================
Total params: 129
Trainable params: 129
Non-trainable params: 0
_________________________________________________________________

The network input is 2 nodes (variables), which are connected to the dense_1 layer (32 nodes). In total, 32*2 weights + 32 biases gives you 96 parameters. Hope this helps.
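That arithmetic generalizes: a Dense layer has one weight per (input, unit) pair plus one bias per unit. A small sketch (no Keras needed) that reproduces the Param # column above:

```python
def dense_params(n_in, n_out):
    # One weight per input-output pair, plus one bias per output unit
    return n_in * n_out + n_out

d1 = dense_params(2, 32)   # dense_1: 2*32 weights + 32 biases = 96
d2 = dense_params(32, 1)   # dense_2: 32*1 weights + 1 bias   = 33
print(d1, d2, d1 + d2)     # 96 33 129
```

The activation layers contribute 0 parameters, so the total matches the 129 reported by model.summary().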

– Benjamin
  • So in Keras, the 'first' layer is the first hidden layer (32 nodes), not the input layer (2 nodes). Usually when talking about the first layer, it refers to the input layer. Right? – user697911 Aug 18 '18 at 21:16
  • Yes, the first layer is just the input layer, without parameters, as you can see with model.summary(). This could also help https://stackoverflow.com/questions/46572674/keras-sequential-model-input-layer. – Benjamin Aug 18 '18 at 21:22
  • " Add an input layer of 32 nodes with the same input shape as", so this note was very misleading, due to the usage of 'input layer'. – user697911 Aug 19 '18 at 05:41

Following Benjamin's answer, here is an example:

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
Input (Dense)                (None, 16)                32        
_________________________________________________________________
Hidden_1 (Dense)             (None, 16)                272       
_________________________________________________________________
Output (Dense)               (None, 1)                 17        
=================================================================
Total params: 321
Trainable params: 321
Non-trainable params: 0
_________________________________________________________________

To calculate the number of parameters of each layer:

Input size = (1,), i.e. one input

Input layer number of parameters  = 16 weights * 1 (input) + 16 biases = 32
Hidden layer number of parameters = 16 weights * 16 (hidden neurons) + 16 biases = 272
Output layer number of parameters = 16 weights * 1 (output neuron) + 1 bias = 17
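The same counts fall out of a one-liner over the layer widths (1 input, two layers of 16, and 1 output):

```python
# Layer widths: input dim, then the units of each Dense layer
sizes = [1, 16, 16, 1]

# For each consecutive pair (n_in, n_out): n_in*n_out weights + n_out biases
params = [n_in * n_out + n_out for n_in, n_out in zip(sizes, sizes[1:])]
print(params, sum(params))  # [32, 272, 17] 321
```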


– wbadry