Keras Multi-Input CNN for Random-Dot Stereograms

Question

I'm currently developing a CNN to detect whether depth is present in random-dot stereograms. The CNN takes two inputs, the left and right images of the random-dot stereogram, and outputs whether or not they contain depth. Here is an example of a random-dot stereogram:

Example of a random-dot stereogram comprising a left and a right image that, when viewed with stereo vision, can contain perceived depth. Source of image: https://en.wikipedia.org/wiki/Random_dot_stereogram

This is my current Keras functional model architecture:

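# Imports (assuming the Keras API bundled with TensorFlow 2.x)
from tensorflow.keras.layers import Input, Conv2D, MaxPool2D, Concatenate, Flatten, Dense
from tensorflow.keras.models import Model

# Defining image dimensions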
height = 100
width = height
channels = 3

# Define left and right inputs
left_input = Input(shape=(height, width, channels), name='left_RDS')
right_input = Input(shape=(height, width, channels), name='right_RDS')

# Convolutional Layers for Left Image
left_conv = Conv2D(filters=16, kernel_size=(3, 3), activation='relu', padding='same')(left_input)
# strides = number of pixels it moves at a time
left_pool = MaxPool2D(pool_size=(2, 2), strides=2)(left_conv)
left_conv = Conv2D(filters=32, kernel_size=(3, 3), activation='relu', padding='same')(left_pool)
left_pool = MaxPool2D(pool_size=(2, 2), strides=2)(left_conv)

# Convolutional Layers for Right Image
right_conv = Conv2D(filters=16, kernel_size=(3, 3), activation='relu', padding='same')(right_input)
right_pool = MaxPool2D(pool_size=(2, 2), strides=2)(right_conv)
right_conv = Conv2D(filters=32, kernel_size=(3, 3), activation='relu', padding='same')(right_pool)
right_pool = MaxPool2D(pool_size=(2, 2), strides=2)(right_conv)

# Merge the two streams of convolutional layers -> axis=-1 concatenate inputs along the last dimension
merged = Concatenate(axis=-1)([left_pool, right_pool])

# Flatten the merged features
flatten = Flatten()(merged)

# Dense Layers
dense = Dense(units=64, activation='relu')(flatten)
output = Dense(units=1, activation='sigmoid')(dense)

# Define the model with two inputs and one output
model = Model(inputs=[left_input, right_input], outputs=output, name='Simple_RDS_CNN')
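
As a sanity check on the architecture above, here is a rough sketch of the tensor shapes I expect at each stage. It assumes MaxPool2D's default 'valid' padding, and the helper names (`conv_same`, `pool_valid`, `stream_shape`) are purely illustrative and not part of Keras.

```python
def conv_same(size):
    # A stride-1 convolution with padding='same' keeps the spatial size unchanged
    return size

def pool_valid(size, pool=2, stride=2):
    # Output size of pooling with 'valid' padding (assumed default)
    return (size - pool) // stride + 1

height = width = 100

def stream_shape(h, w):
    # Conv(16) -> Pool -> Conv(32) -> Pool, as in each image stream
    h, w = pool_valid(conv_same(h)), pool_valid(conv_same(w))
    h, w = pool_valid(conv_same(h)), pool_valid(conv_same(w))
    return h, w, 32

left_shape = stream_shape(height, width)
right_shape = stream_shape(height, width)

# Concatenating along the last axis adds the channel counts
merged_shape = (left_shape[0], left_shape[1], left_shape[2] + right_shape[2])
flat_features = merged_shape[0] * merged_shape[1] * merged_shape[2]

# Weights plus biases of the Dense(64) layer that follows Flatten
dense_params = flat_features * 64 + 64

print(merged_shape, flat_features, dense_params)
```

If this arithmetic is right, nearly all of the model's parameters sit in the first dense layer.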

What I'm currently having trouble with is creating a directory iterator for a dual input model. My current file directory is as follows, whereby the left and right images are the two inputs to the CNN:

train/
    left/
        depth/
            left_depth_img1.png
            left_depth_img2.png
            ...
        no_depth/
            left_no_depth_img1.png
            left_no_depth_img2.png
            ...
    right/
        depth/
            right_depth_img1.png
            right_depth_img2.png
            ...
        no_depth/
            right_no_depth_img1.png
            right_no_depth_img2.png
            ...
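
Since the two inputs must stay aligned, it may help to confirm that every left image has a right counterpart. This is a minimal sketch, assuming the files follow the `left_...` / `right_...` naming shown in the layout above. `paired_files` is a hypothetical helper, not a Keras function.

```python
from pathlib import Path

def paired_files(root, class_name):
    """Return sorted (left_path, right_path) pairs for one class folder.

    Assumes left files start with 'left_' and right files with 'right_',
    and that the remainder of the name identifies the stereogram.
    """
    root = Path(root)
    left = {p.name[len("left_"):]: p
            for p in (root / "left" / class_name).glob("left_*.png")}
    right = {p.name[len("right_"):]: p
             for p in (root / "right" / class_name).glob("right_*.png")}
    # Any key present on only one side means an image has no partner
    unmatched = set(left) ^ set(right)
    if unmatched:
        raise ValueError(f"Unpaired images in '{class_name}': {sorted(unmatched)}")
    return [(left[key], right[key]) for key in sorted(left)]
```

For example, `paired_files('RDS/train', 'depth')` would return the matched pairs for the depth class, or raise an error listing any image without a partner.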

I have managed to find a bit of code here: https://github.com/keras-team/keras/issues/8130 which has enabled me to run the model for testing. However, the issue is that this code only passes a single label for each batch, which is affecting my performance and accuracy. The model also doesn't seem to be learning anything. I would rather each batch contained equal amounts of depth and no_depth images. Here is the code I'm using:

def generate_two_input(generator, dir1, dir2, batch_size, img_height, img_width):
    # Top input
    genX1 = generator.flow_from_directory(dir1,
                                          target_size=(img_height,img_width),
                                          classes = ['depth', 'no_depth'],
                                          batch_size=batch_size,
                                          shuffle=True,
                                          seed=seed)
    
    # Bottom input
    genX2 = generator.flow_from_directory(dir2,
                                          target_size=(img_height,img_width),
                                          classes = ['depth', 'no_depth'],
                                          batch_size=batch_size,
                                          shuffle=True,
                                          seed=seed)
    
    # Yield batches of training data for the CNN: two input image arrays and one label array per batch
    # This loop defines the iterator that keeps yielding images and labels in batches
    while True:
        # next() retrieves the next batch of (images, labels) from each generator
        X1i = next(genX1)
        X2i = next(genX2)

        # [X1i[0], X2i[0]] = the two input image batches (arrays of pixels)
        # X2i[1] = shared label array, taken from the second generator
        yield [X1i[0], X2i[0]], X2i[1]

train_left_dir = 'RDS\\train\\left'
train_right_dir = 'RDS\\train\\right'
valid_left_dir = 'RDS\\valid\\left'
valid_right_dir = 'RDS\\valid\\right'
                     
traingenerator = generate_two_input(generator=test_imgen,
                                    dir1=train_left_dir,
                                    dir2=train_right_dir,
                                    batch_size=batch_size,
                                    img_height=height,
                                    img_width=width)

validgenerator = generate_two_input(generator=valid_imgen,
                                    dir1=valid_left_dir,
                                    dir2=valid_right_dir,
                                    batch_size=batch_size,
                                    img_height=height,
                                    img_width=width)


model.fit(traingenerator,
          # step counts must be whole numbers, so use integer division
          steps_per_epoch=RDS_train // batch_size,
          epochs=epochs,
          validation_data=validgenerator,
          validation_steps=RDS_valid // batch_size,
          use_multiprocessing=False,
          shuffle=False,
          verbose=2)

Would anyone be able to help me either refactor my current code to ensure every batch has a mix of depth and no_depth images, or help me figure out a new way to pass both label types within each batch? NOTE: I'm new to Keras/CNNs, so any help would be appreciated, and I'm open to any suggestions for changing the model, file directory or anything else I have provided. Thank you :)
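
To make the request concrete, this is roughly the batching behaviour I'm after, written as a plain-Python sketch over (left, right) file-path pairs rather than a working Keras iterator. The function name `balanced_batches`, its arguments, and the label convention (1 = depth, 0 = no_depth) are my own assumptions, and image loading is deliberately left out.

```python
import random

def balanced_batches(depth_pairs, no_depth_pairs, batch_size, seed=0):
    """Endlessly yield (left_items, right_items, labels) with a 50/50 class mix.

    depth_pairs / no_depth_pairs: lists of (left, right) items, e.g. file paths.
    Labels follow an assumed convention: 1 = depth, 0 = no_depth.
    Each class list must contain at least batch_size // 2 pairs.
    """
    if batch_size % 2:
        raise ValueError("batch_size must be even for an equal split")
    rng = random.Random(seed)
    half = batch_size // 2
    while True:
        # Sample each class separately so both are always present
        chosen = [(pair, 1) for pair in rng.sample(depth_pairs, half)]
        chosen += [(pair, 0) for pair in rng.sample(no_depth_pairs, half)]
        # Shuffle so the classes are interleaved within the batch
        rng.shuffle(chosen)
        lefts = [pair[0] for pair, _ in chosen]
        rights = [pair[1] for pair, _ in chosen]
        labels = [label for _, label in chosen]
        yield lefts, rights, labels
```

Each batch is sampled independently here, so one pass of `steps_per_epoch` batches does not guarantee that every image is seen exactly once. The left and right items always come from the same pair, which keeps the two inputs and the label aligned.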

As noted, I have tried the code from here: https://github.com/keras-team/keras/issues/8130 which isn't quite achieving what I want. I have also looked at these questions: "Deep Learning Multi-Input CNN" and "Keras Sequential model with multiple inputs", but neither has helped much.
