
I've been working on a segmentation problem for many days, and after finally figuring out how to properly read the dataset, I ran into this problem:

ValueError: Error when checking target: expected activation_1 (Softmax) to have 3 dimensions, but got array with shape (32, 416, 608, 3)

I used the functional API, since I took the FCNN architecture from [here](https://github.com/divamgupta/image-segmentation-keras/blob/master/Models/FCN32.py).

It is slightly modified and adapted to my task (IMAGE_ORDERING = "channels_last", TensorFlow backend). Can anyone please help me? Massive thanks in advance. The architecture below is the FCNN I am trying to implement for segmentation. Here is the architecture (after calling model.summary()):

1. image (model.summary(), part 1)

2. image (model.summary(), part 2)

   1. The specific error: image

   2. "Importing the dataset" function: image

   3. fit_generator method call: image

The model-building code is:

     img_input = Input(shape=(input_height,input_width,3))
    
     #Block 1
     x = Convolution2D(64, (3, 3), activation='relu', padding='same', name='block1_conv1', data_format=IMAGE_ORDERING)(img_input) 
     x = BatchNormalization()(x)
     x = Convolution2D(64, (3, 3), activation='relu', padding='same', name='block1_conv2', data_format=IMAGE_ORDERING)(x)
     x = BatchNormalization()(x)
     x = MaxPooling2D((2, 2), strides=(2, 2), name='block1_pool', data_format=IMAGE_ORDERING)(x)
     f1 = x
     # Block 2
     x = Convolution2D(128, (3, 3), activation='relu', padding='same', name='block2_conv1', data_format=IMAGE_ORDERING)(x)
     x = BatchNormalization()(x)
     x = Convolution2D(128, (3, 3), activation='relu', padding='same', name='block2_conv2', data_format=IMAGE_ORDERING)(x)
     x = BatchNormalization()(x)
     x = MaxPooling2D((2, 2), strides=(2, 2), name='block2_pool', data_format=IMAGE_ORDERING )(x)
     f2 = x
    
     # Block 3
     x = Convolution2D(256, (3, 3), activation='relu', padding='same', name='block3_conv1', data_format=IMAGE_ORDERING)(x)
     x = BatchNormalization()(x)
     x = Convolution2D(256, (3, 3), activation='relu', padding='same', name='block3_conv2', data_format=IMAGE_ORDERING)(x)
     x = BatchNormalization()(x)
     x = Convolution2D(256, (3, 3), activation='relu', padding='same', name='block3_conv3', data_format=IMAGE_ORDERING)(x)
     x = BatchNormalization()(x)
     x = MaxPooling2D((2, 2), strides=(2, 2), name='block3_pool', data_format=IMAGE_ORDERING )(x)
     f3 = x
    
     # Block 4
     x = Convolution2D(512, (3, 3), activation='relu', padding='same', name='block4_conv1', data_format=IMAGE_ORDERING)(x)
     x = BatchNormalization()(x)
     x = Convolution2D(512, (3, 3), activation='relu', padding='same', name='block4_conv2',data_format=IMAGE_ORDERING)(x)
     x = BatchNormalization()(x)
     x = Convolution2D(512, (3, 3), activation='relu', padding='same', name='block4_conv3',data_format=IMAGE_ORDERING)(x)
     x = BatchNormalization()(x)
     x = MaxPooling2D((2, 2), strides=(2, 2), name='block4_pool', data_format=IMAGE_ORDERING)(x)
     f4 = x
    
     # Block 5
     x = Convolution2D(512, (3, 3), activation='relu', padding='same', name='block5_conv1', data_format=IMAGE_ORDERING)(x)
     x = BatchNormalization()(x)
     x = Convolution2D(512, (3, 3), activation='relu', padding='same', name='block5_conv2',data_format=IMAGE_ORDERING)(x)
     x = BatchNormalization()(x)
     x = Convolution2D(512, (3, 3), activation='relu', padding='same', name='block5_conv3', data_format=IMAGE_ORDERING)(x)
     x = BatchNormalization()(x)
     x = MaxPooling2D((2, 2), strides=(2, 2), name='block5_pool', data_format=IMAGE_ORDERING)(x)
     f5 = x
    
     x = (Convolution2D(4096,(7,7) , activation='relu' , padding='same', data_format=IMAGE_ORDERING))(x)
     x = Dropout(0.5)(x)
     x = (Convolution2D(4096,(1,1) , activation='relu' , padding='same',data_format=IMAGE_ORDERING))(x)
     x = Dropout(0.5)(x)
    
     # First parameter = number of classes + 1 (for the background)
     x = (Convolution2D(20,(1,1) ,kernel_initializer='he_normal' ,data_format=IMAGE_ORDERING))(x)
     x = Convolution2DTranspose(20,kernel_size=(64,64), strides=(32,32),use_bias=False,data_format=IMAGE_ORDERING)(x)
     o_shape = Model(img_input,x).output_shape
    
     outputHeight = o_shape[1]
     print('Output Height is:', outputHeight)
     outputWidth = o_shape[2]
     print('Output Width is:', outputWidth)
     #https://keras.io/layers/core/#reshape
     x = (Reshape((20,outputHeight*outputWidth)))(x)
     #https://keras.io/layers/core/#permute
     x = (Permute((2, 1)))(x)
     print("Output shape before softmax is", o_shape)
     x = (Activation('softmax'))(x)
     print("Output shape after softmax is", o_shape)
     model = Model(inputs = img_input,outputs = x)
     model.outputWidth = outputWidth
     model.outputHeight = outputHeight
     model.compile(optimizer = 'adam', loss = 'categorical_crossentropy', metrics =['accuracy'])
    
Timbus Calin

3 Answers


The original code in the FCNN architecture example works with an input dimension of (416, 608), whereas in your code the input dimension is (192, 192) (ignoring the channel dimension). Now, if you look carefully, this particular layer

x = MaxPooling2D((2, 2), strides=(2, 2), name='block5_pool', data_format=IMAGE_ORDERING)(x)

generates an output of spatial dimension (6, 6) (you can verify this in your model.summary()).

The next convolution layer

o = (Convolution2D(4096,(7,7) , activation='relu' , padding='same', data_format=IMAGE_ORDERING))(o)

uses convolution filters of size (7, 7), but your input has already been reduced to a size smaller than that (i.e. (6, 6)). Try fixing that first.
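As a quick sanity check (a rough sketch, not from the original post), you can compute how far the five 2x2 poolings shrink the spatial dimensions before that (7, 7) convolution:

# Each of the five MaxPooling2D layers halves the spatial dimensions
for h, w in [(192, 192), (416, 608)]:
    print((h // 2**5, w // 2**5))
# (192, 192) -> (6, 6):   already smaller than the (7, 7) kernel
# (416, 608) -> (13, 19): large enough for the (7, 7) convolution with 'same' padding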

Also, if you look at your model.summary() output, you'll notice that it does not contain the layers defined after the block5_pool layer. Among those is a transposed convolution layer (which basically upsamples your input). You may want to take a look and try to resolve that as well.
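For reference (a minimal sketch of the arithmetic, assuming Keras's default padding='valid' for the transposed convolution), the Conv2DTranspose output size follows out = (in - 1) * stride + kernel_size, which is why the upsampled map ends up slightly larger than the input:

# Spatial size produced by Conv2DTranspose with padding='valid' (the default)
def transposed_size(size, stride=32, kernel=64):
    return (size - 1) * stride + kernel

# block5_pool output for a (416, 608) input is (13, 19)
print(transposed_size(13), transposed_size(19))  # -> 448 640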

NOTE: In all my dimensions, I have ignored the channel dimension.


EDIT: Detailed answer below

First of all, this is my keras.json file. It uses the TensorFlow backend, with image_data_format set to channels_last.

{
    "floatx": "float32",
    "epsilon": 1e-07,
    "backend": "tensorflow",
    "image_data_format": "channels_last"
}

Next, I copy-paste my exact model code. Please take special note of the inline comments in the code below.

from keras.models import *
from keras.layers import *

IMAGE_ORDERING = 'channels_last'  # Consistent with the json file

def getFCN32(nb_classes = 20, input_height = 416, input_width = 608):

    img_input = Input(shape=(input_height,input_width, 3)) # Expected input will have channel in the last dimension

    #Block 1
    x = Convolution2D(64, (3, 3), activation='relu', padding='same', name='block1_conv1', data_format=IMAGE_ORDERING)(img_input) 
    x = BatchNormalization()(x)
    x = Convolution2D(64, (3, 3), activation='relu', padding='same', name='block1_conv2', data_format=IMAGE_ORDERING)(x)
    x = BatchNormalization()(x)
    x = MaxPooling2D((2, 2), strides=(2, 2), name='block1_pool', data_format=IMAGE_ORDERING)(x)
    f1 = x
    # Block 2
    x = Convolution2D(128, (3, 3), activation='relu', padding='same', name='block2_conv1', data_format=IMAGE_ORDERING)(x)
    x = BatchNormalization()(x)
    x = Convolution2D(128, (3, 3), activation='relu', padding='same', name='block2_conv2', data_format=IMAGE_ORDERING)(x)
    x = BatchNormalization()(x)
    x = MaxPooling2D((2, 2), strides=(2, 2), name='block2_pool', data_format=IMAGE_ORDERING )(x)
    f2 = x

    # Block 3
    x = Convolution2D(256, (3, 3), activation='relu', padding='same', name='block3_conv1', data_format=IMAGE_ORDERING)(x)
    x = BatchNormalization()(x)
    x = Convolution2D(256, (3, 3), activation='relu', padding='same', name='block3_conv2', data_format=IMAGE_ORDERING)(x)
    x = BatchNormalization()(x)
    x = Convolution2D(256, (3, 3), activation='relu', padding='same', name='block3_conv3', data_format=IMAGE_ORDERING)(x)
    x = BatchNormalization()(x)
    x = MaxPooling2D((2, 2), strides=(2, 2), name='block3_pool', data_format=IMAGE_ORDERING )(x)
    f3 = x

    # Block 4
    x = Convolution2D(512, (3, 3), activation='relu', padding='same', name='block4_conv1', data_format=IMAGE_ORDERING)(x)
    x = BatchNormalization()(x)
    x = Convolution2D(512, (3, 3), activation='relu', padding='same', name='block4_conv2',data_format=IMAGE_ORDERING)(x)
    x = BatchNormalization()(x)
    x = Convolution2D(512, (3, 3), activation='relu', padding='same', name='block4_conv3',data_format=IMAGE_ORDERING)(x)
    x = BatchNormalization()(x)
    x = MaxPooling2D((2, 2), strides=(2, 2), name='block4_pool', data_format=IMAGE_ORDERING)(x)
    f4 = x

    # Block 5
    x = Convolution2D(512, (3, 3), activation='relu', padding='same', name='block5_conv1', data_format=IMAGE_ORDERING)(x)
    x = BatchNormalization()(x)
    x = Convolution2D(512, (3, 3), activation='relu', padding='same', name='block5_conv2',data_format=IMAGE_ORDERING)(x)
    x = BatchNormalization()(x)
    x = Convolution2D(512, (3, 3), activation='relu', padding='same', name='block5_conv3', data_format=IMAGE_ORDERING)(x)
    x = BatchNormalization()(x)
    x = MaxPooling2D((2, 2), strides=(2, 2), name='block5_pool', data_format=IMAGE_ORDERING)(x)
    f5 = x

    x = (Convolution2D(4096,(7,7) , activation='relu' , padding='same', data_format=IMAGE_ORDERING))(x)
    x = Dropout(0.5)(x)
    x = (Convolution2D(4096,(1,1) , activation='relu' , padding='same',data_format=IMAGE_ORDERING))(x)
    x = Dropout(0.5)(x)

    x = (Convolution2D(20,(1,1) ,kernel_initializer='he_normal' ,data_format=IMAGE_ORDERING))(x)
    x = Convolution2DTranspose(20,kernel_size=(64,64), strides=(32,32),use_bias=False,data_format=IMAGE_ORDERING)(x)
    o_shape = Model(img_input, x).output_shape

    # NOTE: Since this is channel last dimension ordering, the height and width dimensions are along [1] and [2], not [2] and [3]
    outputHeight = o_shape[1]
    outputWidth = o_shape[2]

    x = (Reshape((outputHeight*outputWidth, 20)))(x) # Channels should be along the last dimension of the reshape
    # No need for the Permute layer anymore

    print("Output shape before softmax is", o_shape)
    x = (Activation('softmax'))(x)
    print("Output shape after softmax is", o_shape)
    model = Model(inputs = img_input,outputs = x)
    model.compile(optimizer = 'adam', loss = 'categorical_crossentropy', metrics =['accuracy'])

    return model

model = getFCN32(20)
print(model.summary())

Next, I will provide snippets of how my model.summary() looks. If you take a look at the last few layers, it is something like this: (image: model summary)

So this means the conv2d_transpose layer produces an output of dimension (448, 640, 20), which is then flattened by the Reshape layer before the softmax is applied; the dimension of the output is therefore (286720, 20). Your target_generator (mask_generator in your case) should also generate targets of that dimension. Similarly, your input_generator should produce input batches of shape [batch_size, input_height, input_width, 3], as specified by the img_input of your function.
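For example, here is a minimal sketch of building one such target from a single mask (label_mask is a hypothetical 2-D array with one integer class ID per pixel; it is only a placeholder, not part of the original code):

import numpy as np
from keras.utils import to_categorical

n_classes = 20
label_mask = np.random.randint(0, n_classes, size=(448, 640))  # placeholder mask

# Flatten the spatial dimensions and one-hot encode each pixel label,
# matching the (outputHeight*outputWidth, 20) model output
target = to_categorical(label_mask.ravel(), n_classes)
print(target.shape)  # (286720, 20)

Stacking a batch of such arrays along a leading axis gives the (batch_size, 286720, 20) shape the final softmax layer expects.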

Hopefully this will help you to get to the bottom of your problem and figure out a suitable solution. Please take a look at the minor variations in the code (along with the in-line comments) and how to create your input and target batches.

Koustav
  • Apparently the code line o=f5 did not function properly, so yes you were right in that regard. – Timbus Calin Apr 17 '18 at 12:30
  • I replaced all the functional calls with (x) at every step, but now I get another error – Timbus Calin Apr 17 '18 at 12:30
  • What is the new error? Also did you take care of your input dimension so that it does not reduce to a size smaller than (7, 7)? – Koustav Apr 17 '18 at 12:34
  • I believe this is due to the manner in which I import my dataset in the import function. ( flow_from_directory (batch_size =32, etc) – Timbus Calin Apr 17 '18 at 12:37
  • This line is supposed to resize your output from the convolution layer and flatten it: o = (Reshape((-1, outputHeight*outputWidth)))(o). I see you have both this and a Flatten layer. Can you please remove the Flatten layer and check if you get the expected output dimensions, as given in this link: https://github.com/divamgupta/image-segmentation-keras/blob/master/Models/FCN32.py – Koustav Apr 17 '18 at 12:39
  • Ok, so we're getting close to the good result. I deleted the Flatten layer and changed from (7,7) to (6,6), and now it gives me the error "Error when checking target: expected activation_4 to have 3 dimensions, but got array with shape (32, 416, 608, 3)". So I assume it expects the shape (416, 608, 3) instead of (32 (presumably batch size), 416, 608, 3). What now? – Timbus Calin Apr 17 '18 at 13:28
  • No, it expects your size to be (batch size, height * width, channel). Converting it from (batch size, channel, height, width) to (batch size, channel, height * width) is what the reshape layer does. Finally the permute layer converts it to (batch size, height * width, channel). – Koustav Apr 17 '18 at 14:06
  • Using "channels_first" instead of "channels_last" does not solve the issue. What do you suggest I should do? – Timbus Calin Apr 17 '18 at 14:08
  • No. If you notice your error, it says your height and width are still separate. They are not merged. Please go through your errors carefully. There's enough information in it to get it resolved. – Koustav Apr 17 '18 at 14:12
  • Thank you, but trust me that if I knew how to solve this issue I would not have posted it here in the first place. I understand what the Reshape and Permute layers do, yet I do not know what I should write/do in order to solve it now. – Timbus Calin Apr 17 '18 at 14:19
  • Change channel_last to channel_first and see if that resolves it. But from the error message you posted above, it seems reshape isn't giving the expected output. The height and width are supposed to be merged after that. – Koustav Apr 17 '18 at 14:28
  • No, it doesn't solve it. I tried it again but just like I stated in the comment above, this doesn't solve the problem. And yes, the height and width are supposed to be merged, hence my frustration :) – Timbus Calin Apr 17 '18 at 14:34
  • Print the dimension just before the reshape layer and see if there's some little bug in that. I'm not in front of my computer now, else would have debugged it and let you know. – Koustav Apr 17 '18 at 14:37
  • Output shape before reshape is (None, 448, 640, 20), Output Height is: 448, Output Width is: 640, Output shape after reshape is (?, ?, 286720), Output shape after permute is (?, 286720, ?), Output shape after softmax is (?, 286720, ?). – Timbus Calin Apr 17 '18 at 15:05
  • So you see, before reshape, shape is (height, width, channel). That is channel last. But in reshape you're asking it to do (channel, height * width). That is, channel first. Hence the source of your error. Did you get it now? – Koustav Apr 17 '18 at 15:11
  • Yes, you are right. Writing Reshape((outputWidth*outputHeight, -1)) indeed reshaped it correctly, so there is no need to permute it now. But now it has [batch_size, width*height, channel], and the error is the same. – Timbus Calin Apr 17 '18 at 15:52
  • I also left the original setting with channel_first and it still gives the same error. My opinion is that the problem is with the input shape (what I send there). But I don't understand why Keras's fit_generator simply would not work, as I took the image preprocessing step from their documentation. – Timbus Calin Apr 17 '18 at 16:36
  • Yes, I agree with you Koustav, this is also exactly the output from model.summary() in my case. My problem is that I do not know how to create the target batches with this shape, because I use the generators from Keras's documentation specifically for images & masks, which produce targets of dimension 4 (that is what the generator outputs). This is what I do not know: how to modify the dataset import (see image [4]) so that it fits correctly into my architecture (see image [5]). If you know how to do that, I would be very grateful. – Timbus Calin Apr 18 '18 at 07:56
  • Write your own generator function. Some pointers: https://github.com/keras-team/keras/issues/8078#issuecomment-334909314 and https://stackoverflow.com/questions/46493419/use-a-generator-for-keras-model-fit-generator. Since you are outputting a 20-channel image anyway, just using the native Keras image pre-processing functions may not be a good idea, as they only work well for regular (RGB or grayscale) images. You need to write your own generator function to load the target 20-channel GT image. – Koustav Apr 18 '18 at 08:10
  • Massive thanks Koustav. As soon as I manage to do it, I will tell you :D – Timbus Calin Apr 18 '18 at 08:14
  • Take a look at this generator for producing segmentation input and GT; it may be useful to you: https://github.com/divamgupta/image-segmentation-keras/blob/master/LoadBatches.py – Koustav Apr 18 '18 at 08:16

I tried using the SegNet architecture and yet again I got the exact same error. It appears it is not an architectural problem, but one caused by fit_generator and the way the masks are used.

UPDATE: The problem was solved by feeding the correct form of the input masks to the neural network.

Timbus Calin

You're probably missing color_mode='grayscale' in the flow_from_directory() call for the mask. RGB is the default value for color_mode.

flow_args = dict(
    batch_size=batch_size,
    target_size=target_size,
    class_mode=None,  # yield only the images/masks themselves, no class labels
    seed=seed)        # the shared seed keeps image and mask batches in sync

image_generator = image_datagen.flow_from_directory(
    image_dir, subset='training', **flow_args)

mask_generator = mask_datagen.flow_from_directory(
    mask_dir, subset='training', color_mode='grayscale', **flow_args)
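To feed these two generators to fit_generator, one option is to combine them and convert each mask batch into the flattened one-hot targets discussed in the accepted answer. This is only a rough sketch, assuming each grayscale mask pixel stores an integer class ID in 0-19; combine_generators and n_classes are illustrative names, not part of the original answer:

import numpy as np
from keras.utils import to_categorical

def combine_generators(image_gen, mask_gen, n_classes=20):
    # Yield (image_batch, target_batch) pairs indefinitely for fit_generator
    while True:
        images = next(image_gen)
        masks = next(mask_gen)  # (batch, H, W, 1) because color_mode='grayscale'
        batch_size = masks.shape[0]
        # Drop the channel axis, flatten H*W, then one-hot encode every pixel label
        labels = masks[..., 0].astype('int').reshape(batch_size, -1)
        targets = np.array([to_categorical(m, n_classes) for m in labels])
        yield images, targets  # targets: (batch, H*W, n_classes)

train_generator = combine_generators(image_generator, mask_generator)
# model.fit_generator(train_generator, steps_per_epoch=..., epochs=...)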
Jay Borseth