
When creating a Sequential model in Keras, I understand you provide the input shape in the first layer. Does this input shape then make an implicit input layer?

For example, the model below explicitly specifies 2 Dense layers, but is this actually a model with 3 layers consisting of one input layer implied by the input shape, one hidden dense layer with 32 neurons, and then one output layer with 10 possible outputs?

model = Sequential([
    Dense(32, input_shape=(784,)),
    Activation('relu'),
    Dense(10),
    Activation('softmax'),
])
blackHoleDetector

2 Answers


Well, it actually is an implicit input layer indeed, i.e. your model is an example of a "good old" neural net with three layers - input, hidden, and output. This is more explicitly visible in the Keras Functional API (check the example in the docs), in which your model would be written as:

inputs = Input(shape=(784,))                 # input layer
x = Dense(32, activation='relu')(inputs)     # hidden layer
outputs = Dense(10, activation='softmax')(x) # output layer

model = Model(inputs, outputs)

Actually, this implicit input layer is the reason why you have to include an input_shape argument only in the first (explicit) layer of the model in the Sequential API - in subsequent layers, the input shape is inferred from the output of the previous ones (see the comments in the source code of core.py).
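The shape-inference mechanism can be illustrated with a toy sketch (plain Python, not actual Keras code; the `ToyDense` class and `build` function here are invented for illustration): each layer only needs to declare its output size, because its input size can always be read off the previous layer's output.

```python
# Toy sketch (NOT actual Keras code) of how a Sequential-style stack can
# infer each layer's input size from the previous layer's output size,
# which is why only the first layer needs an explicit input shape.

class ToyDense:
    def __init__(self, units, input_dim=None):
        self.units = units          # output size of this layer
        self.input_dim = input_dim  # known up front only for the first layer

def build(layers):
    """Propagate shapes: each layer's input_dim is the previous layer's units."""
    for prev, curr in zip(layers, layers[1:]):
        curr.input_dim = prev.units
    return layers

stack = build([ToyDense(32, input_dim=784), ToyDense(10)])
print([(l.input_dim, l.units) for l in stack])  # [(784, 32), (32, 10)]
```

Real Keras does essentially this bookkeeping when you call `model.add()`: weight creation is deferred until the layer's input shape is known.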

You may also find the documentation on tf.contrib.keras.layers.Input enlightening.

desertnaut

It depends on your perspective :-)

Rewriting your code in line with more recent Keras tutorial examples, you would probably use:

model = Sequential()
model.add(Dense(32, activation='relu', input_dim=784))
model.add(Dense(10, activation='softmax'))

...which makes it much more explicit that you only have 2 Keras layers. And this is exactly what you do have (in Keras, at least) because the "input layer" is not really a (Keras) layer at all: it's only a place to store a tensor, so it may as well be a tensor itself.

Each Keras layer is a transformation that outputs a tensor, possibly of a different size/shape to the input. So while there are 3 identifiable tensors here (input, outputs of the two layers), there are only 2 transformations involved corresponding to the 2 Keras layers.
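The "3 tensors, 2 transformations" point can be sketched with plain NumPy (an assumption for illustration; weight values here are random, not a trained model):

```python
# Sketch of the model above as raw tensor transformations: three
# identifiable tensors (input, hidden, output) but only two weight
# transformations, matching the two Keras Dense layers.
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(784)             # tensor 1: the input (no transformation yet)

W1, b1 = rng.standard_normal((784, 32)), np.zeros(32)
W2, b2 = rng.standard_normal((32, 10)), np.zeros(10)

h = np.maximum(x @ W1 + b1, 0.0)         # transformation 1 -> tensor 2 (Dense + relu)
z = h @ W2 + b2                          # transformation 2 -> tensor 3 (Dense)
y = np.exp(z - z.max()); y /= y.sum()    # softmax over the logits

print(x.shape, h.shape, y.shape)         # (784,) (32,) (10,)
```

The input `x` just sits there as data; only the two matrix multiplies do any work, which is the sense in which this is a 2-layer model.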

On the other hand, graphically, you might represent this network with 3 (graphical) layers of nodes, and two sets of lines connecting the layers of nodes. Graphically, it's a 3-layer network. But "layers" in this graphical notation are bunches of circles that sit on a page doing nothing, whereas layers in Keras transform tensors and do actual work for you. Personally, I would get used to the Keras perspective :-)

Note finally that for fun and/or simplicity, I substituted input_dim=784 for input_shape=(784,) to avoid the syntax that Python uses to both confuse newcomers and create a 1-D tuple: (<value>,).

omatai
    OP's code has actually nothing to do with recency of Keras version or tutorials; it is a distinct coding style that calls for placing the activations separately, which some people (not me) still prefer. What you have written is (and was) indeed the "standard" way of coding a 3-layer model, but arguably does not make the actual number of layers clearer than OP's formulation (as the Functional API does). – desertnaut Jun 10 '20 at 16:09
  • @desertnaut I think the value in this answer, and in your comment on it, is to show that multiple perspectives are possible, and that there is actually no definitive correct answer here. Arguably, to claim this network has any specific number of layers is to caricature it rather than characterise it. Your comment highlights a further aspect of that character. – omatai Jun 10 '20 at 21:42