
I'm working on multi-class segmentation of medical images using a U-Net. I have a small dataset of 681 samples and 681 GT masks.

import tensorflow as tf
from tensorflow.keras import backend as K

inputs = tf.keras.layers.Input((IMG_WIDHT, IMG_HEIGHT, IMG_CHANNELS))
smooth = 1.

# Scale inputs to [0, 1]
s = tf.keras.layers.Lambda(lambda x: x / 255)(inputs)

# Contracting path: each block is two 3x3 convs (he_normal weight
# initialization) with dropout, followed by 2x2 max pooling
c1 = tf.keras.layers.Conv2D(16, (3, 3), activation='relu', kernel_initializer='he_normal', padding='same')(s)
c1 = tf.keras.layers.Dropout(0.1)(c1)
c1 = tf.keras.layers.Conv2D(16, (3, 3), activation='relu', kernel_initializer='he_normal', padding='same')(c1)
p1 = tf.keras.layers.MaxPool2D((2, 2))(c1)

c2 = tf.keras.layers.Conv2D(32, (3, 3), activation='relu', kernel_initializer='he_normal', padding='same')(p1)
c2 = tf.keras.layers.Dropout(0.1)(c2)
c2 = tf.keras.layers.Conv2D(32, (3, 3), activation='relu', kernel_initializer='he_normal', padding='same')(c2)
p2 = tf.keras.layers.MaxPool2D((2, 2))(c2)

c3 = tf.keras.layers.Conv2D(64, (3, 3), activation='relu', kernel_initializer='he_normal', padding='same')(p2)
c3 = tf.keras.layers.Dropout(0.1)(c3)
c3 = tf.keras.layers.Conv2D(64, (3, 3), activation='relu', kernel_initializer='he_normal', padding='same')(c3)
p3 = tf.keras.layers.MaxPool2D((2, 2))(c3)

c4 = tf.keras.layers.Conv2D(128, (3, 3), activation='relu', kernel_initializer='he_normal', padding='same')(p3)
c4 = tf.keras.layers.Dropout(0.1)(c4)
c4 = tf.keras.layers.Conv2D(128, (3, 3), activation='relu', kernel_initializer='he_normal', padding='same')(c4)
p4 = tf.keras.layers.MaxPool2D((2, 2))(c4)

# Bottleneck
c5 = tf.keras.layers.Conv2D(256, (3, 3), activation='relu', kernel_initializer='he_normal', padding='same')(p4)
c5 = tf.keras.layers.Dropout(0.1)(c5)
c5 = tf.keras.layers.Conv2D(256, (3, 3), activation='relu', kernel_initializer='he_normal', padding='same')(c5)

# Expanding path: 2x2 transposed convs upsample, then concatenate the matching encoder features
u6 = tf.keras.layers.Conv2DTranspose(128, (2, 2), strides=(2, 2), padding='same')(c5)
u6 = tf.keras.layers.concatenate([u6, c4])
c6 = tf.keras.layers.Conv2D(128, (3, 3), activation='relu', kernel_initializer='he_normal', padding='same')(u6)
c6 = tf.keras.layers.Dropout(0.2)(c6)
c6 = tf.keras.layers.Conv2D(128, (3, 3), activation='relu', kernel_initializer='he_normal', padding='same')(c6)

u7 = tf.keras.layers.Conv2DTranspose(64, (2, 2), strides=(2, 2), padding='same')(c6)
u7 = tf.keras.layers.concatenate([u7, c3])
c7 = tf.keras.layers.Conv2D(64, (2, 2), activation='relu', kernel_initializer='he_normal', padding='same')(u7)
c7 = tf.keras.layers.Dropout(0.2)(c7)
c7 = tf.keras.layers.Conv2D(64, (3, 3), activation='relu', kernel_initializer='he_normal', padding='same')(c7)

u8 = tf.keras.layers.Conv2DTranspose(32, (2, 2), strides=(2, 2), padding='same')(c7)
u8 = tf.keras.layers.concatenate([u8, c2])
c8 = tf.keras.layers.Conv2D(32, (2, 2), activation='relu', kernel_initializer='he_normal', padding='same')(u8)
c8 = tf.keras.layers.Dropout(0.1)(c8)
c8 = tf.keras.layers.Conv2D(32, (3, 3), activation='relu', kernel_initializer='he_normal', padding='same')(c8)

u9 = tf.keras.layers.Conv2DTranspose(16, (2, 2), strides=(2, 2), padding='same')(c8)
u9 = tf.keras.layers.concatenate([u9, c1], axis=3)
c9 = tf.keras.layers.Conv2D(16, (3, 3), activation='relu', kernel_initializer='he_normal', padding='same')(u9)
c9 = tf.keras.layers.Dropout(0.1)(c9)
c9 = tf.keras.layers.Conv2D(16, (3, 3), activation='relu', kernel_initializer='he_normal', padding='same')(c9)
outputs = tf.keras.layers.Conv2D(12, (1, 1), activation='softmax')(c9)  # per-pixel softmax over 12 classes

model = tf.keras.Model(inputs=[inputs], outputs=[outputs])

To weight the classes I'm using a weighted_categorical_crossentropy loss:

def weighted_categorical_crossentropy(weights):
    # weights: one entry per class, e.g. [0.9, 0.05, 0.04, 0.01]
    def wcce(y_true, y_pred):
        Kweights = K.constant(weights)
        if not K.is_tensor(y_pred):
            y_pred = K.constant(y_pred)
        y_true = K.cast(y_true, y_pred.dtype)
        # scale each pixel's cross-entropy by the weight of its true class
        return K.categorical_crossentropy(y_true, y_pred) * K.sum(y_true * Kweights, axis=-1)
    return wcce

losse = weighted_categorical_crossentropy(poids)  # poids: per-class weight vector
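For context, a minimal sketch of how a weight vector like poids could be derived from the training masks, assuming Y_train is one-hot encoded with shape (num_samples, H, W, 12); inverse class frequency is one common choice, not necessarily the weights actually used here:

import numpy as np

pixel_counts = Y_train.sum(axis=(0, 1, 2))      # total pixels per class
class_freq = pixel_counts / pixel_counts.sum()  # relative class frequency
poids = 1.0 / (class_freq + 1e-7)               # rarer classes get larger weights
poids = poids / poids.sum()                     # normalize so the weights sum to 1

losse = weighted_categorical_crossentropy(poids.astype('float32'))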

Hyperparameters and training:

cc = tf.keras.optimizers.Adam(learning_rate=0.0001, beta_1=0.9, beta_2=0.999, amsgrad=False)
model.compile(optimizer=cc, loss=losse, metrics=['categorical_accuracy'])

history = model.fit(X_train, Y_train, validation_split=0.18, batch_size=1, epochs=50)

558/558 [==============================] - 333s 597ms/sample - loss: 0.0281 - categorical_accuracy: 0.9539 - val_loss: 0.2262 - 

I'm confused by this result, because I have found papers that work with a dataset of the same size and get better results.

I tried raising the dropout rates from 0.1 to 0.4 and from 0.2 to 0.5 and got the same result.

I also tried adding regularization to the softmax layer and to the layer before it, and the result was a high loss.
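(For reference, a sketch of what adding an L2 kernel penalty to those layers can look like in Keras; the factor 1e-4 is illustrative, not necessarily what was tried:)

from tensorflow.keras import regularizers

# Illustrative L2 penalty on the last conv layer and the softmax layer
c9 = tf.keras.layers.Conv2D(16, (3, 3), activation='relu', kernel_initializer='he_normal',
                            padding='same', kernel_regularizer=regularizers.l2(1e-4))(c9)
outputs = tf.keras.layers.Conv2D(12, (1, 1), activation='softmax',
                                 kernel_regularizer=regularizers.l2(1e-4))(c9)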

Data augmentation does not work for me, since I must preserve the exact pixel values of the masks, and the Keras generator's interpolation changes the pixel range of my GT.
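One value-preserving option is flip-only augmentation with two generators synchronized by seed, since flips involve no interpolation and leave mask labels untouched; a minimal sketch, assuming X_train and Y_train are NumPy arrays:

from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Flips only: no interpolation, so GT label values are preserved exactly
aug_args = dict(horizontal_flip=True, vertical_flip=True)
image_gen = ImageDataGenerator(**aug_args)
mask_gen = ImageDataGenerator(**aug_args)

seed = 42  # identical seed keeps image and mask transforms in sync
image_iter = image_gen.flow(X_train, batch_size=1, seed=seed)
mask_iter = mask_gen.flow(Y_train, batch_size=1, seed=seed)

train_iter = zip(image_iter, mask_iter)  # yields (image_batch, mask_batch) pairs
# model.fit(train_iter, steps_per_epoch=len(X_train), epochs=50)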

So my question is: why does my model not generalize to the test data?

  • What *exactly* is your issue and your *question*? Validation loss being higher than training loss is the norm, and usually people start wondering when the opposite happens. – desertnaut Apr 20 '20 at 11:37
  • @desertnaut thanks for your answer. My question is that my model does not generalize and does not give good results on test data. – SamBn04 Apr 20 '20 at 11:39
  • This is not at all what you have asked. Please edit & update your post (including the title) to explicitly clarify this; as is, your post makes no sense at all. – desertnaut Apr 20 '20 at 11:40
  • If validation loss is higher than training loss, does that mean there is overfitting? And if there is overfitting, does the model not generalize to test data? – SamBn04 Apr 20 '20 at 11:46
  • This is not (necessarily) overfitting (which, IMHO, is a much abused term); see own answer [here](https://stackoverflow.com/questions/61043558/dropout-with-densely-connected-layer/61043883#61043883). – desertnaut Apr 20 '20 at 11:53
  • @desertnaut please can you confirm whether the modified categorical_crossentropy is correct: https://stackoverflow.com/questions/61309991/how-to-use-weighted-categorical-crossentropy-loss-function – SamBn04 Apr 20 '20 at 11:53
