Nice catch!
It would seem that the issue linked in the comment above by Dennis Soemers, Keras Dropout layer changes results with dropout=0.0, has not been fully resolved, and Keras somehow blunders when faced with a dropout rate of 1.0 [see UPDATE at the end of the post]; modifying the model shown in the Keras MNIST MLP example:
from keras.models import Sequential
from keras.layers import Dense, Dropout
from keras.optimizers import RMSprop

# num_classes, x_train, y_train, x_test, y_test as in the Keras MNIST MLP example
model = Sequential()
model.add(Dense(512, activation='relu', use_bias=False, input_shape=(784,)))
model.add(Dropout(1.0))
model.add(Dense(512, activation='relu'))
model.add(Dropout(1.0))
model.add(Dense(num_classes, activation='softmax'))

model.compile(loss='categorical_crossentropy',
              optimizer=RMSprop(),
              metrics=['accuracy'])

model.fit(x_train, y_train,
          batch_size=128,
          epochs=3,
          verbose=1,
          validation_data=(x_test, y_test))
indeed gives a model that trains, despite all neurons supposedly being dropped, as you report:
Train on 60000 samples, validate on 10000 samples
Epoch 1/3
60000/60000 [==============================] - 15s 251us/step - loss: 0.2180 - acc: 0.9324 - val_loss: 0.1072 - val_acc: 0.9654
Epoch 2/3
60000/60000 [==============================] - 15s 246us/step - loss: 0.0831 - acc: 0.9743 - val_loss: 0.0719 - val_acc: 0.9788
Epoch 3/3
60000/60000 [==============================] - 15s 245us/step - loss: 0.0526 - acc: 0.9837 - val_loss: 0.0997 - val_acc: 0.9723
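In fact, one can check directly that the Dropout(1.0) layers act as no-ops by inspecting a dropout layer's output with the learning phase set to training; here is a minimal sketch, assuming the old (pre-TF2) Keras backend API in use at the time:

from keras import backend as K
import numpy as np

# Function returning the output of the first Dropout layer (model.layers[1]),
# with the learning phase passed in explicitly (1 = training)
get_dropout_output = K.function(
    [model.input, K.learning_phase()],
    [model.layers[1].output])

out = get_dropout_output([x_test[:16], 1])[0]
# With rate=1.0 nothing is actually zeroed out, so this prints False;
# with rate=0.99 almost every activation would be zero
print(np.all(out == 0))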
Nevertheless, if you try a dropout rate of 0.99, i.e. replacing the two dropout layers in the above model with
model.add(Dropout(0.99))
then effectively no training takes place, as should be the case:
Train on 60000 samples, validate on 10000 samples
Epoch 1/3
60000/60000 [==============================] - 16s 265us/step - loss: 3.4344 - acc: 0.1064 - val_loss: 2.3008 - val_acc: 0.1136
Epoch 2/3
60000/60000 [==============================] - 16s 261us/step - loss: 2.3342 - acc: 0.1112 - val_loss: 2.3010 - val_acc: 0.1135
Epoch 3/3
60000/60000 [==============================] - 16s 266us/step - loss: 2.3167 - acc: 0.1122 - val_loss: 2.3010 - val_acc: 0.1135
UPDATE (after comment by Yu-Yang in OP): It seems to have been a design choice (dead link now, see update below) not to do anything when the dropout rate is equal to either 0 or 1; the Dropout class becomes effective only if 0. < self.rate < 1.
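For reference, the core of the layer's call method in the Keras source of that era looked roughly like this (a paraphrased sketch, not the verbatim code):

# Paraphrased from the Keras Dropout layer of that era (not verbatim)
def call(self, inputs, training=None):
    if 0. < self.rate < 1.:
        noise_shape = self._get_noise_shape(inputs)
        def dropped_inputs():
            return K.dropout(inputs, self.rate, noise_shape, seed=self.seed)
        return K.in_train_phase(dropped_inputs, inputs, training=training)
    # a rate of exactly 0 or 1 silently passes the inputs through unchanged
    return inputs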
Nevertheless, as already commented, a warning message in such cases (and a relevant note in the documentation) would arguably be a good idea.
UPDATE (July 2021):
There have been some changes since Jan 2018, when this answer was written; now, under the hood, Keras calls tf.nn.dropout, which does not seem to allow for dropout=1 (source).
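A quick way to see this in practice (a sketch; the exact error message may vary across TF versions):

import tensorflow as tf

x = tf.ones((2, 3))
tf.nn.dropout(x, rate=0.99)  # works: drops (almost) everything
tf.nn.dropout(x, rate=1.0)   # raises ValueError: rate must be in [0, 1)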