
How do I go about writing code to visualize how my accuracy and loss develop over training when using cross-validation? Normally I would assign the return value of the fit function to a variable named 'history' when training the model, but in the case of cross-validation it does not show the validation curves. I assume this is because I am not passing validation_data to the fit function (below).

import numpy as np
from sklearn.model_selection import StratifiedKFold
from tensorflow.keras import layers
from tensorflow.keras.models import Sequential

kfold = StratifiedKFold(n_splits=5, shuffle=True, random_state=seed)
cvscores = []
for train, test in kfold.split(x_train, y_train):
    # build a fresh model for each fold
    model = Sequential()
    model.add(layers.Conv2D(32, (4, 4), activation='relu', input_shape=(224, 224, 3)))
    model.add(layers.MaxPooling2D((2, 2)))
    model.add(layers.Conv2D(64, (4, 4), activation='relu'))
    model.add(layers.MaxPooling2D((2, 2)))
    model.add(layers.Flatten())
    model.add(layers.Dense(64, activation='relu'))
    model.add(layers.Dropout(0.5))
    model.add(layers.Dense(1, activation='sigmoid'))

    model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])

    # no validation_data here, so history only contains the training metrics
    history = model.fit(x_train[train], y_train[train], epochs=15, batch_size=64)

    scores = model.evaluate(x_train[test], y_train[test], verbose=0)
    print("%s: %.2f%%" % (model.metrics_names[1], scores[1] * 100))
    cvscores.append(scores[1] * 100)
print("%.2f%% (+/- %.2f%%)" % (np.mean(cvscores), np.std(cvscores)))

Normally I would use code such as the following, but since I do not pass validation data to fit, I am not sure how to approach it.

plt.plot(history.history['acc'])
plt.plot(history.history['val_acc'])
plt.title('model accuracy')
plt.ylabel('accuracy')
plt.xlabel('epoch')
plt.legend(['train', 'val'], loc='upper left')
plt.show()

plt.plot(history.history['loss'])
plt.plot(history.history['val_loss'])
plt.title('model loss')
plt.ylabel('loss')
plt.xlabel('epoch')
plt.legend(['train', 'val'], loc='upper left')
plt.show()


Janne

1 Answer


You can dump everything using TensorBoard. Generally, you make the following splits: train, validation, and test, and you validate your model on that validation split. You can use your metrics from sklearn. Most of the time people don't cross-validate their DNN models, as it would take too much time. However, once you have those models, it is nice to plot the distribution of the metrics with a boxplot.
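A minimal sketch of what that could look like for the loop in the question, assuming the same kfold, x_train and y_train from above and a hypothetical build_model() helper that wraps the Sequential definition: passing the fold's held-out split as validation_data makes the validation metrics appear in each fold's history (the key is 'val_accuracy' in TF 2.x with metrics=['accuracy'], 'val_acc' in older standalone Keras), and the fold accuracies can then be summarized with a boxplot.

import matplotlib.pyplot as plt

histories = []   # one History object per fold
cvscores = []
for train, test in kfold.split(x_train, y_train):
    model = build_model()  # hypothetical helper: builds and compiles the Sequential model above

    # validation_data gives val_loss / val_accuracy in history.history
    history = model.fit(x_train[train], y_train[train],
                        validation_data=(x_train[test], y_train[test]),
                        epochs=15, batch_size=64)
    histories.append(history)

    scores = model.evaluate(x_train[test], y_train[test], verbose=0)
    cvscores.append(scores[1] * 100)

# one validation-accuracy curve per fold
for i, history in enumerate(histories):
    plt.plot(history.history['val_accuracy'], label='fold %d' % i)
plt.xlabel('epoch')
plt.ylabel('validation accuracy')
plt.legend()
plt.show()

# distribution of the per-fold accuracies
plt.boxplot(cvscores)
plt.ylabel('accuracy (%)')
plt.show()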

Piotr Rarus
  • Thanks, after lots of messing with versions I managed to get TensorBoard to work. Unfortunately, here too I cannot visualize the validation accuracy and loss; I'd like to see in a graph whether I am overfitting or not. Do you know of a way to visualize these graphs per fold, or is this something that isn't normally done? – Janne Nov 20 '19 at 08:32
  • Guess you've already checked the [docs](https://www.tensorflow.org/tensorboard/get_started). When you fire up TensorBoard you only pass it the directory with the log files, and the app loads every log file in that folder. You could output one log per cross-validation split (see the sketch after these comments). In each split you instantiate a new model, so you'll have separate tf sessions anyway. You could also create one more dummy session where you aggregate all the plots to get that stacked, Joy Division-like plot. – Piotr Rarus Nov 20 '19 at 08:45
  • Check [this](https://stackoverflow.com/questions/51542304/how-to-plot-different-summary-metrics-on-the-same-plot-with-tensorboard) thread. – Piotr Rarus Nov 20 '19 at 08:45
  • There are plenty of tutorials on the [web](https://itnext.io/how-to-use-tensorboard-5d82f8654496). Just be aware that a few things changed in `TensorFlow 2.0`. – Piotr Rarus Nov 20 '19 at 08:47
  • Appreciate the help Piotr! – Janne Nov 20 '19 at 09:07
  • No problem. Let us know if you encounter more problems. – Piotr Rarus Nov 20 '19 at 09:31
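A rough sketch of the per-fold TensorBoard logging idea from the comments, assuming TF 2.x, the kfold loop from the question, and the same hypothetical build_model() helper; the logs/fold_i directory naming is only illustrative. Pointing TensorBoard at the parent logs directory then shows each fold's train and validation curves side by side.

import os
import tensorflow as tf

for fold, (train, test) in enumerate(kfold.split(x_train, y_train)):
    model = build_model()  # hypothetical helper, as above

    # each fold writes to its own log directory: logs/fold_0, logs/fold_1, ...
    tb_cb = tf.keras.callbacks.TensorBoard(log_dir=os.path.join("logs", "fold_%d" % fold))

    model.fit(x_train[train], y_train[train],
              validation_data=(x_train[test], y_train[test]),
              epochs=15, batch_size=64,
              callbacks=[tb_cb])

Then launch it with: tensorboard --logdir logs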