Just for the sake of argument, I am using the same data for both training and validation during training, like this:
model.fit_generator(
    generator=train_generator,
    epochs=EPOCHS,
    steps_per_epoch=train_generator.n // BATCH_SIZE,
    validation_data=train_generator,
    validation_steps=train_generator.n // BATCH_SIZE
)
So I would expect that the loss and accuracy for training and validation at the end of each epoch would be pretty much the same. Still, it looks like this:
Epoch 1/150
26/26 [==============================] - 55s 2s/step - loss: 1.5520 - acc: 0.3171 - val_loss: 1.6646 - val_acc: 0.2796
Epoch 2/150
26/26 [==============================] - 46s 2s/step - loss: 1.2924 - acc: 0.4996 - val_loss: 1.5895 - val_acc: 0.3508
Epoch 3/150
26/26 [==============================] - 46s 2s/step - loss: 1.1624 - acc: 0.5873 - val_loss: 1.6197 - val_acc: 0.3262
Epoch 4/150
26/26 [==============================] - 46s 2s/step - loss: 1.0601 - acc: 0.6265 - val_loss: 1.9420 - val_acc: 0.3150
Epoch 5/150
26/26 [==============================] - 46s 2s/step - loss: 0.9790 - acc: 0.6640 - val_loss: 1.9667 - val_acc: 0.2823
Epoch 6/150
26/26 [==============================] - 46s 2s/step - loss: 0.9191 - acc: 0.6951 - val_loss: 1.8594 - val_acc: 0.3342
Epoch 7/150
26/26 [==============================] - 46s 2s/step - loss: 0.8811 - acc: 0.7087 - val_loss: 2.3223 - val_acc: 0.2869
Epoch 8/150
26/26 [==============================] - 46s 2s/step - loss: 0.8148 - acc: 0.7379 - val_loss: 1.9683 - val_acc: 0.3358
Epoch 9/150
26/26 [==============================] - 46s 2s/step - loss: 0.8068 - acc: 0.7307 - val_loss: 2.1053 - val_acc: 0.3312
Why does the accuracy in particular differ so much, even though it comes from the same data source? Is there something about the way these metrics are calculated that I am missing?
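My rough understanding of the accuracy metric is the following (just a sketch with hypothetical y_true/y_pred arrays to illustrate what I mean, not the actual Keras internals):

import numpy as np

# Rough idea of categorical accuracy: fraction of samples where the
# argmax of the prediction matches the argmax of the one-hot label.
def categorical_accuracy(y_true, y_pred):
    return np.mean(np.argmax(y_true, axis=-1) == np.argmax(y_pred, axis=-1))

y_true = np.array([[1, 0], [0, 1], [0, 1]])      # hypothetical one-hot labels
y_pred = np.array([[0.7, 0.3], [0.4, 0.6], [0.9, 0.1]])
print(categorical_accuracy(y_true, y_pred))      # 2 out of 3 correct -> 0.666...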
The generator is created like this:
import keras

train_images = keras.preprocessing.image.ImageDataGenerator(
    rescale=1./255
)

train_generator = train_images.flow_from_directory(
    directory="data/superheros/images/train",
    target_size=(299, 299),
    batch_size=BATCH_SIZE,
    shuffle=True
)
Yes, it shuffles the images, but since it iterates over all of the images for validation as well, shouldn't the accuracy at least be close?
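For what it's worth, I also considered building a second, non-shuffled flow over the same directory and passing that as validation_data instead (just a sketch, I have not tried this yet and I am not sure whether it should make a difference):

# Sketch: a second flow over the same directory, without shuffling,
# using the same rescaling as the training generator.
validation_generator = train_images.flow_from_directory(
    directory="data/superheros/images/train",
    target_size=(299, 299),
    batch_size=BATCH_SIZE,
    shuffle=False
)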
So the model looks like this:
inceptionV3 = keras.applications.inception_v3.InceptionV3(include_top=False)

features = inceptionV3.output
net = keras.layers.GlobalAveragePooling2D()(features)
predictions = keras.layers.Dense(units=2, activation="softmax")(net)

for layer in inceptionV3.layers:
    layer.trainable = False

model = keras.Model(inputs=inceptionV3.input, outputs=predictions)

optimizer = keras.optimizers.RMSprop()
model.compile(
    optimizer=optimizer,
    loss="categorical_crossentropy",
    metrics=['accuracy']
)
So no dropout or anything, just InceptionV3 with a softmax layer on top. I would expect the accuracy to differ a bit, but not by this magnitude.
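One way I could double-check that only the new Dense head is actually trainable would be something like this (just a sanity-check sketch, run after compiling):

# Sanity check: only the new Dense layer should contribute trainable weights;
# everything from the frozen InceptionV3 base should be non-trainable.
trainable_count = sum(
    keras.backend.count_params(w) for w in model.trainable_weights
)
frozen_count = sum(
    keras.backend.count_params(w) for w in model.non_trainable_weights
)
print("trainable params:", trainable_count)
print("frozen params:", frozen_count)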