
I copied/pasted this TensorFlow tutorial into a Jupyter notebook. (As of this writing the tutorial has been changed to use the flower dataset instead of the cats-and-dogs one, but the question still applies.) https://www.tensorflow.org/tutorials/images/classification

The first part (without augmentation) runs fine and I get similar results.

But with data augmentation, my loss and accuracy stay flat across all epochs. I've already checked these posts on SO: "Keras accuracy does not change", "How to fix flatlined accuracy and NaN loss in tensorflow image classification", and "Tensorflow: loss decreasing, but accuracy stable".

None of these apply: the dataset is a standard one, so I don't have the problem of corrupted data, and I printed a couple of augmented images and they look fine (see below).

I've tried adding more fully connected layers to increase the model capacity and dropout to limit overfitting... nothing changes. Here are the curves:

Any ideas as to why? Have I missed something in the code? I know training a DL model is a lot of trial and error, but I'm sure there must be some logic or intuition beyond randomly turning the knobs until something happens.

Thanks!

[Image: training and validation accuracy/loss curves, flat across all epochs]

Source Data :

import os
import tensorflow as tf

_URL = 'https://storage.googleapis.com/mledu-datasets/cats_and_dogs_filtered.zip'

path_to_zip = tf.keras.utils.get_file('cats_and_dogs.zip', origin=_URL, extract=True)

PATH = os.path.join(os.path.dirname(path_to_zip), 'cats_and_dogs_filtered')

Params :

batch_size = 128
epochs = 15
IMG_HEIGHT = 150
IMG_WIDTH = 150

Preprocessing stage :

from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Training generator: rescale pixels to [0, 1] and apply random augmentation
image_gen = ImageDataGenerator(rescale=1./255,
                               rotation_range=20,
                               width_shift_range=0.15,
                               height_shift_range=0.15,
                               horizontal_flip=True,
                               zoom_range=0.2)

train_data_gen = image_gen.flow_from_directory(batch_size=batch_size,
                                               directory=train_dir,
                                               shuffle=True,
                                               target_size=(IMG_HEIGHT, IMG_WIDTH))

augmented_images = [train_data_gen[0][0][i] for i in range(5)]
plotImages(augmented_images)
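
plotImages is the helper defined earlier in the tutorial; if you are reproducing this in a fresh notebook, a minimal sketch of it (assuming matplotlib) would be something like:

import matplotlib.pyplot as plt

# Minimal stand-in for the tutorial's plotImages helper:
# show a row of images side by side to eyeball the augmentation.
def plotImages(images_arr):
    fig, axes = plt.subplots(1, len(images_arr), figsize=(20, 20))
    for img, ax in zip(images_arr, axes):
        ax.imshow(img)
        ax.axis('off')
    plt.tight_layout()
    plt.show()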

image_gen_val = ImageDataGenerator(rescale=1./255)

val_data_gen = image_gen_val.flow_from_directory(batch_size=batch_size,
                                                 directory=validation_dir,
                                                 target_size=(IMG_HEIGHT, IMG_WIDTH),
                                                 class_mode='binary')

[Image: sample of augmented training images]

Model :

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Dropout, Flatten, Dense

model_new = Sequential([
    Conv2D(16, 2, padding='same', activation='relu', 
           input_shape=(IMG_HEIGHT, IMG_WIDTH ,3)),
    MaxPooling2D(),
    Conv2D(32, 2, padding='same', activation='relu'),
    MaxPooling2D(),
    Conv2D(64, 2, padding='same', activation='relu'),
    MaxPooling2D(),
    Dropout(0.2),
    Flatten(),
    Dense(512, activation='relu'),
    Dense(1)
])

model_new.compile(optimizer='adam',
                  loss=tf.keras.losses.BinaryCrossentropy(from_logits=True),
                  metrics=['accuracy'])

model_new.summary()

history = model_new.fit(
    train_data_gen,
    steps_per_epoch= total_train // batch_size,
    epochs=epochs,
    validation_data=val_data_gen,
    validation_steps= total_val // batch_size
)
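
total_train and total_val are the training and validation image counts computed earlier in the tutorial (train_dir and validation_dir are presumably the 'train' and 'validation' subfolders of PATH). If you need to recompute them, an illustrative way is:

# Illustrative: count the images under each directory
# (the tutorial derives these totals from the per-class subfolders).
total_train = sum(len(files) for _, _, files in os.walk(train_dir))
total_val = sum(len(files) for _, _, files in os.walk(validation_dir))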
Yoan B. M.Sc
    Try passing `class_mode='binary'` argument to `flow_from_directory` method. – today Aug 03 '20 at 20:38
  • Also, add `activation='sigmoid'` to the final output layer – Ibukun Muyide Aug 03 '20 at 22:23
  • @today you nailed it. I had it in the validation generator but not in the training one. May I ask why this causes that behavior, though? From the TF docs it seems that omitting 'class_mode' defaults to 'categorical', returning a 2D tensor. Shouldn't that be incompatible with the dimensions of the last layer, Dense(1)? – Yoan B. M.Sc Aug 04 '20 at 12:35
  • @YoanB. Well, introducing eager mode in TF 2.x has brought a lot of weird issues and inconsistent behavior that I am personally getting tired of. For example, if you replace the generator with a numpy array (with labels of shape `(2,)`) it would raise an error. Further, even with your current generators (i.e. without fixing `class_mode`), if you disable eager mode (i.e. `tf.compat.v1.disable_eager_execution()`) and use graph mode, you would encounter an error complaining about the incompatible output shape. So I am not sure what to say. You can report and ask about this on the TF GitHub. – today Aug 04 '20 at 13:14
  • @today, thanks. It sure makes it harder to learn the material, but at least I know TF 2.x has some inconsistencies I should watch out for. – Yoan B. M.Sc Aug 04 '20 at 13:34

1 Answer


As suggested by @today, class_mode='binary' was missing from the training data generator. With that fixed, the model is able to train properly:

train_data_gen = image_gen.flow_from_directory(batch_size=batch_size,
                                               directory=train_dir,
                                               shuffle=True,
                                               target_size=(IMG_HEIGHT, IMG_WIDTH),
                                               class_mode='binary')
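
For anyone wondering why this matters (per the comment thread above): without class_mode='binary', flow_from_directory falls back to its default class_mode='categorical' and yields one-hot labels of shape (batch_size, 2), whereas the single-unit output combined with BinaryCrossentropy(from_logits=True) expects labels of shape (batch_size,). An illustrative way to check what a generator produces is to pull one batch and print the shapes:

# Illustrative sanity check: pull one batch and inspect the label shape.
images, labels = next(train_data_gen)
print(images.shape)  # (batch_size, IMG_HEIGHT, IMG_WIDTH, 3)
print(labels.shape)  # (batch_size,) with class_mode='binary';
                     # (batch_size, 2) one-hot with the default 'categorical'

Ibukun Muyide's suggestion (activation='sigmoid' on the final Dense layer, with from_logits=False in the loss) is an equivalent way to set up the output, but the class_mode fix above is what resolves the flat curves.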
Yoan B. M.Sc