
In Keras, when training a model for a fixed number of epochs using model.fit(), one of its parameters is shuffle (a boolean). The Keras documentation describes it as:

"Boolean (whether to shuffle the training data before each epoch)."

Essentially, I am training a Convolutional Neural Network and trying to get reproducible results. So, I followed the instructions and specified seeds as mentioned in this answer.
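For context, the seed-fixing setup I followed looks roughly like the sketch below (the exact seed value and the set of RNGs to seed are assumptions based on that answer; TensorFlow additionally needs tf.random.set_seed(), and PYTHONHASHSEED must be set before the interpreter starts for full determinism):

```python
import os
import random
import numpy as np

# Seed every RNG the training pipeline touches. TensorFlow would also
# need tf.random.set_seed(42), omitted here to keep the sketch minimal.
os.environ["PYTHONHASHSEED"] = "42"
random.seed(42)
np.random.seed(42)

# With the seed fixed, repeated draws match across runs:
a = np.random.rand(3)
np.random.seed(42)
b = np.random.rand(3)  # identical to a
```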

Although this worked partially (I got reproducible results, but only on my local machine), I thought setting shuffle=False would also help, by keeping the same input order across runs. However, reproducibility aside, doing so dramatically reduced the performance of the model. Specifically, after each epoch the metrics give the same results (i.e., they do not improve), and even increasing the number of epochs changes nothing (accuracy = ~75 after 3 epochs and after 30 epochs). With shuffle=True, the results show normal, gradual improvement.

Training data shape: (143256, 1, 150, 3)
Target data shape: (143256, 3)
Batch Size: 64

metrics = ['accuracy']

model.compile(loss=keras.losses.categorical_crossentropy,
              optimizer=keras.optimizers.Adam(),
              metrics=metrics)
....

model.fit(x_train, to_categorical(y_train), batch_size=batch_size,
          epochs=epochs, verbose=verbose,
          validation_data=(x_val, to_categorical(y_val)),
          shuffle=False, callbacks=[metrics],
          class_weight=class_weights)
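To my understanding, shuffle=True is roughly equivalent to permuting the sample order before slicing each epoch into batches, so every sample still appears exactly once per epoch and only the order (and hence the batch composition) changes. A minimal NumPy sketch of that behavior (shapes and values are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.arange(10).reshape(10, 1)  # 10 samples, 1 feature
y = np.arange(10)
batch_size = 4

# One "epoch" with shuffling: permute the indices, then slice into batches.
perm = rng.permutation(len(x))
batches = [(x[perm[i:i + batch_size]], y[perm[i:i + batch_size]])
           for i in range(0, len(x), batch_size)]

# Every sample appears exactly once per epoch; only the order changed.
seen = np.sort(np.concatenate([b[1] for b in batches]))
```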
  1. Is this normal behavior when shuffling is set to False? Even though the data is not permuted, the weights should still be updated each epoch, so the metrics should improve over time.
  2. Assuming there is some issue with my implementation, should there be any significant difference in model performance between the two approaches (with and without shuffling)?
  3. How can the results be reproducible with shuffle=True (which they apparently are), even when seeds are specified?

Any help will be really appreciated. Thanks!

Waqas Kayani
  • You wrote "Specifically, after each epoch, the metrics give same results (meaning not improving)" — is that the result on the training or the validation set? Please provide code. – Geeocode Dec 30 '19 at 08:13
  • I've edited the post to add the portion of code. What's weird is, everything works as expected when `shuffle=True`. And yes, the metrics are calculated on the validation set. – Waqas Kayani Dec 30 '19 at 09:22
  • 2
    How is your data? Is it grouped (similar data together)? If so, similar data will have similar error, and during that stretch the gradients will optimize the weights for those particular minibatches, then shift for the next group. This will also reflect on validation. By shuffling, you are making the network generalize better, so the outputs will be optimized gradually and be more general. – SajanGohil Dec 31 '19 at 07:07
  • @SajanGohil No, it is not grouped. But there is a huge class imbalance (3 classes with ratio ~0.77, 0.18, 0.05). Can that be the reason? – Waqas Kayani Dec 31 '19 at 07:15
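(For reference, since the fit call above passes class_weight=class_weights: one common way to handle an imbalance like the ~0.77/0.18/0.05 ratio mentioned here is inverse-frequency weighting, the same formula as sklearn's compute_class_weight("balanced", ...). The labels below are synthetic and only illustrative.)

```python
import numpy as np

# Synthetic labels drawn with roughly the class frequencies from the comment.
y_train = np.random.default_rng(0).choice(3, size=1000, p=[0.77, 0.18, 0.05])

# Inverse-frequency weighting: weight_c = n_samples / (n_classes * count_c),
# so rarer classes get proportionally larger weights.
counts = np.bincount(y_train, minlength=3)
class_weights = {c: len(y_train) / (3 * counts[c]) for c in range(3)}
```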
