
I am working on 3D image segmentation with a convolutional neural network in Keras 2.1.1 with the TensorFlow backend. I am using the fit_generator function because 3D images are very memory-intensive and I apply heavy data augmentation before each update.

Edit: I also referenced this post in a GitHub issue and added a small demo that demonstrates the problematic behavior: https://github.com/keras-team/keras/issues/8837

import sys
import numpy as np
import h5py
import random
import tensorflow as tf
import scipy.misc as misc
import datetime
import time
import cv2


from keras.callbacks import EarlyStopping
from keras.callbacks import ModelCheckpoint
from keras.optimizers import Adam


print("Loading net")

ada = Adam(lr=0.00005, beta_1=0.1, beta_2=0.001, epsilon=1e-08, decay=0.0)
net = Net(input_shape=(128,160,144, 4),outputChannel=5,momentum=0.5)  # Net and jaccard_distance_loss are defined elsewhere (not shown)
# Note: the string "Adam" is passed to compile, so the custom ada instance above is not actually used
net.compile(optimizer="Adam",loss=jaccard_distance_loss)
print("Finished loading net")

filename = 'fold0_1.hdf5'
f = h5py.File(filename, 'r')

train_gen = generateData(f[u'train_x'],f[u'train_y'],augmentor=random_geometric_transformation)
val_gen = generateData(f[u'valid_x'],f[u'valid_y'])

filepath="Model-{epoch:02d}.h5"
checkpoint = ModelCheckpoint(filepath, monitor='val_loss', verbose=1, save_best_only=True, mode='min')
callbacks_list = [checkpoint]

#Train Model
net.fit_generator(generator = train_gen,
                  steps_per_epoch = 200,
                  validation_data=val_gen,
                  validation_steps = 37,
                  epochs = 800,
                  callbacks=callbacks_list)

The problem is that although the loss reported during training is relatively low, the loss when predicting on a training or a test image is comparable to that of a barely trained model (the first value below is the overall loss and the others are the per-class losses; note that the loss for the first class is close to 0 even in an untrained model, since the model then just predicts background):

#Loss before training
[3.9978203773498535, 0.032198667526245117, 0.99983119964599609, 0.99984711408615112, 0.99907118082046509, 0.96687209606170654]

After training the network very briefly on any given (train or test) sample, the evaluated loss matches the loss reported during training:

Epoch 1/5
1/1 [==============================] - 9s 9s/step - loss: 2.0542 - slice_layer_1_loss: 0.0048 - slice_layer_2_loss: 0.9998 - slice_layer_3_loss: 0.3026 - slice_layer_4_loss: 0.6302 - slice_layer_5_loss: 0.1167
Epoch 2/5
1/1 [==============================] - 1s 592ms/step - loss: 2.0278 - slice_layer_1_loss: 0.0045 - slice_layer_2_loss: 0.9998 - slice_layer_3_loss: 0.2916 - slice_layer_4_loss: 0.6191 - slice_layer_5_loss: 0.1128
Epoch 3/5
1/1 [==============================] - 1s 582ms/step - loss: 2.0066 - slice_layer_1_loss: 0.0043 - slice_layer_2_loss: 0.9998 - slice_layer_3_loss: 0.2888 - slice_layer_4_loss: 0.6066 - slice_layer_5_loss: 0.1071
Epoch 4/5
1/1 [==============================] - 1s 590ms/step - loss: 1.9909 - slice_layer_1_loss: 0.0042 - slice_layer_2_loss: 0.9998 - slice_layer_3_loss: 0.2872 - slice_layer_4_loss: 0.5959 - slice_layer_5_loss: 0.1038
Epoch 5/5
1/1 [==============================] - 1s 572ms/step - loss: 1.9787 - slice_layer_1_loss: 0.0041 - slice_layer_2_loss: 0.9998 - slice_layer_3_loss: 0.2855 - slice_layer_4_loss: 0.5875 - slice_layer_5_loss: 0.1019

#Loss after training
1/1 [==============================] - 0s 190ms/step
[2.1015677452087402, 0.0048453211784362793, 0.99983119964599609, 0.33013522624969482, 0.64043641090393066, 0.1263195276260376]
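
One way to narrow this down might be to compare the model's predictions in training mode and in inference mode directly. Below is only a rough diagnostic sketch, not part of the original script; it assumes net and train_gen from the training code above:

from keras import backend as K

# Build a function that runs the model with an explicit learning phase:
# 1 = training mode (batch statistics), 0 = inference mode (moving statistics).
run_net = K.function(net.inputs + [K.learning_phase()], net.outputs)

x_batch, y_batch = next(train_gen)
train_mode_out = run_net([x_batch, 1])
test_mode_out = run_net([x_batch, 0])

# If the two outputs differ substantially, layers that behave differently at
# train and test time (batch normalization, dropout) are the likely cause.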

The problem seems very similar to this post: ResNet: 100% accuracy during training, but 33% prediction accuracy with the same data

However, the model used in this demo has already been trained for 200 epochs over the full dataset, and training for more epochs does not solve the problem. Furthermore, the validation loss reported during training does not decrease at all, yet predicting on test images shows the same behavior as described above.

Is it possible that there is some kind of problem when using batch normalization with fit_generator when the batch size is only 1?
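
For reference, this suspicion is about the moving statistics of batch normalization: with a batch size of 1, the moving mean and variance used at prediction time are updated only slowly (the Keras default momentum is 0.99), so they can remain far from the batch statistics used during training. The following is only a hypothetical illustration of the momentum argument on a toy layer stack, not the actual architecture built by Net:

from keras.layers import Input, Conv3D, BatchNormalization, Activation
from keras.models import Model

inp = Input(shape=(128, 160, 144, 4))
h = Conv3D(8, (3, 3, 3), padding='same')(inp)
# A lower momentum lets the moving mean/variance track the per-batch
# statistics faster when every batch contains only a single sample.
h = BatchNormalization(momentum=0.9)(h)
h = Activation('relu')(h)
toy_model = Model(inp, h)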

PS: Here is the code for the generator:

import numpy as np
import random

def reverseArgMax(array,n):
    # One-hot encode an integer label array into n channels (the inverse of argmax):
    # channel i is 1 where the label equals i and 0 elsewhere.
    newArray = np.empty(array.shape+(n,))

    for i in range(n):
        temp = array.copy()

        if i==1:
            temp[temp!=1] = 0
        elif i==0:
            # Temporarily remap label 1 so that label 0 can be marked with 1
            # without being wiped out by the final zeroing step
            temp[temp==1] = 2
            temp[temp==i] = 1
            temp[temp!=1] = 0
        else:
            temp[temp==1] = 0
            temp[temp==i] = 1
            temp[temp!=1] = 0
        newArray[...,i]=temp
    return newArray
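
For clarity, here is a small self-contained check of what reverseArgMax produces (my own illustration, not part of the training pipeline): it behaves like a one-hot encoding along a new last axis, so taking the argmax recovers the original labels.

labels = np.array([[0, 1, 2],
                   [3, 4, 0]])

encoded = reverseArgMax(labels, 5)   # shape (2, 3, 5), one channel per class
print(encoded[0, 0])                 # [1. 0. 0. 0. 0.] -> label 0
assert np.array_equal(np.argmax(encoded, axis=-1), labels)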

def generateData(data,labels,augmentor=None,batch_size=1):
    # Generates batches of samples (with the default batch_size=1, one sample per batch)

    while True:
        # Shuffle the sample indices for each pass over the data
        imax = list(range(len(data)))
        np.random.shuffle(imax)

        for i in imax:
            x = np.array(data[i])
            y = np.array(labels[i])
            # Move the channel axis to the last position to match the network input shape
            x = np.transpose(x,axes=[0,2,3,1])
            # Add a batch dimension of size 1
            x = np.array([x])
            y = np.array([y])
            x = x.astype(np.float32)
            y = y.astype(np.float32)
            # One-hot encode the labels into 5 channels
            y = reverseArgMax(y,5)

            #Augment
            if augmentor is not None:
                x,y = augmentor(x,y)

            x = x.astype(np.float32)
            y = y.astype(np.float32)
            # Split the one-hot labels into one flattened target per output/class
            y = np.transpose(y,[4,0,1,2,3])
            y = np.reshape(y,[5,1,-1])

            yield x, list(y)
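
For completeness, a quick shape check of what the generator yields (assuming the HDF5 layout from the training script above; the expected shapes are inferred from the code, not verified output):

import h5py

f = h5py.File('fold0_1.hdf5', 'r')
gen = generateData(f[u'train_x'], f[u'train_y'])

x_batch, y_targets = next(gen)
print(x_batch.shape)                       # expected: (1, 128, 160, 144, 4)
print(len(y_targets), y_targets[0].shape)  # expected: 5 targets of shape (1, 128*160*144)
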
  • What are you doing in the method random_geometric_transformation? If you are subtracting the mean from the data, then you should do that for the validation data as well. – Pranjal Sahu Dec 19 '17 at 18:58
  • @sahu Sorry, I should have added that. The function just returns a rotated and/or flipped version, or the original image itself. I also observed this problem when using an identical generator for the validation data, which also yields training images with the same form of augmentation. – Lukas Brinkmeyer Dec 19 '17 at 20:37

0 Answers