I am currently attempting to create a CNN which utilises both numerical and image data. The CNN currently only processes images, and I need to add the numerical data on top.

For each image I also have an uncategorised CSV representing numerical data about that image, e.g. subject-1.jpg has an equivalent subject-1.csv. As stated, I currently have a CNN which builds a model using only the images, and I am wondering how I can incorporate the numerical data to improve the accuracy of the CNN.

If anyone could assist me or point me in the right direction it would be much appreciated.

My current process looks like this:

Create datasets for image testing and image validation, numerical testing and numerical validation, utilising this function body:

dataset = tf.keras.preprocessing.image_dataset_from_directory(
    path,
    validation_split=0.2,
    subset=setType,
    seed=123,
    image_size=(IMG_SIZE, IMG_SIZE),
    batch_size=8)

Configuration:

AUTOTUNE = tf.data.AUTOTUNE

train_image_data = train_image_data.cache().shuffle(1000).prefetch(buffer_size=AUTOTUNE)
test_image_data = test_image_data.cache().prefetch(buffer_size=AUTOTUNE)

train_num_data = train_num_data.cache().shuffle(1000).prefetch(buffer_size=AUTOTUNE)
test_num_data = test_num_data.cache().prefetch(buffer_size=AUTOTUNE)

Normalise:

normalization_layer = layers.experimental.preprocessing.Rescaling(1./255)
norm_img_ds = train_image_data.map(lambda x, y: (normalization_layer(x), y))
norm_num_ds = train_num_data.map(lambda x, y: (normalization_layer(x), y))

Then creating the model:

num_classes = len(CATAGORIES)

model = Sequential([
  layers.experimental.preprocessing.Rescaling(1./255, input_shape=(IMG_SIZE, IMG_SIZE, 3)),
  layers.Conv2D(16, 3, padding='same', activation='relu'),
  layers.MaxPooling2D(),
  layers.Conv2D(32, 3, padding='same', activation='relu'),
  layers.MaxPooling2D(),
  layers.Conv2D(64, 3, padding='same', activation='relu'),
  layers.MaxPooling2D(),
  layers.Flatten(),
  layers.Dense(128, activation='relu'),
  layers.Dense(num_classes)
])

model.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])
Fin M.
  • First of all, you should use the Functional API instead of a Sequential model, since Sequential is limited to 1 input and 1 output. The Functional API lets you create a model with 2 inputs; somewhere in the model you can concatenate their outputs and then add a classification layer. If you give the shapes of your data, I can suggest some code to do it. – Kaveh Aug 15 '21 at 17:15
  • @Kaveh Thank you for your input, I didn't know about this. The shapes of the data are (480, 480, 3) for the image data and (480, 640) for the CSV. – Fin M. Aug 16 '21 at 12:56

1 Answer

First of all, you should use the Functional API rather than a Sequential model, since Sequential models are limited to 1 input and 1 output. The Functional API lets you create a model with 2 inputs; somewhere in the model you can concatenate their outputs and then add 1 classification layer as the output.

Here is some example code:

import tensorflow as tf

IMG_SIZE = 480
img_data_shape = (IMG_SIZE, IMG_SIZE, 3)
csv_data_shape = (480, 640)
num_classes = 2

# define two input layers
img_input = tf.keras.layers.Input(shape=img_data_shape, name="image")
csv_input = tf.keras.layers.Input(shape=csv_data_shape, name="csv")

# define layers for image data 
x1 = tf.keras.layers.Rescaling(1./255)(img_input)
x1 = tf.keras.layers.Conv2D(16, 3, padding='same', activation='relu', name="conv1_img")(x1)
x1 = tf.keras.layers.MaxPooling2D(name="mxp1_img")(x1)
x1 = tf.keras.layers.Conv2D(32, 3, padding='same', activation='relu', name="conv2_img")(x1)
x1 = tf.keras.layers.MaxPooling2D(name="mxp2_img")(x1)
x1 = tf.keras.layers.Conv2D(64, 3, padding='same', activation='relu', name="conv3_img")(x1)
x1 = tf.keras.layers.MaxPooling2D(name="mxp3_img")(x1)
x1 = tf.keras.layers.Flatten(name="flatten_img")(x1)

# define layers for csv data
x2 = tf.keras.layers.Flatten(name="flatten_csv")(csv_input)
x2 = tf.keras.layers.Dense(16, activation='relu', name="dense1_csv")(x2)
x2 = tf.keras.layers.Dense(32, activation='relu', name="dense2_csv")(x2)
x2 = tf.keras.layers.Dense(64, activation='relu', name="dense3_csv")(x2)

# merge layers
x = tf.keras.layers.concatenate([x1,x2], name="concat_csv_img")
x = tf.keras.layers.Dense(128, activation='relu', name="dense1_csv_img")(x)
output = tf.keras.layers.Dense(num_classes, name="classify")(x)

# make model with 2 inputs and 1 output
model = tf.keras.models.Model(inputs=[img_input, csv_input], outputs=output)

model.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])

And here is the architecture: [image: model_arch]
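
A quick way to check that a two-input model is wired up correctly is to push one dummy batch through it. The sketch below is only an illustration: `build_model` and `model_small` are hypothetical names, and the shapes are deliberately much smaller than (480, 480, 3) and (480, 640) so that it runs quickly. Because the last Dense layer has no activation and the loss uses `from_logits=True`, the model returns logits, so a softmax is applied to get class probabilities.

```python
import tensorflow as tf

def build_model(img_shape, csv_shape, num_classes):
    # Hypothetical, cut-down version of the two-branch model above.
    img_in = tf.keras.layers.Input(shape=img_shape, name="image")
    csv_in = tf.keras.layers.Input(shape=csv_shape, name="csv")
    # image branch
    x1 = tf.keras.layers.Conv2D(8, 3, padding="same", activation="relu")(img_in)
    x1 = tf.keras.layers.MaxPooling2D()(x1)
    x1 = tf.keras.layers.Flatten()(x1)
    # csv branch
    x2 = tf.keras.layers.Flatten()(csv_in)
    x2 = tf.keras.layers.Dense(16, activation="relu")(x2)
    # merge both branches and classify
    x = tf.keras.layers.concatenate([x1, x2])
    out = tf.keras.layers.Dense(num_classes)(x)
    return tf.keras.Model(inputs=[img_in, csv_in], outputs=out)

# Small stand-in shapes (assumption, chosen only to keep the check fast).
model_small = build_model((32, 32, 3), (48, 64), num_classes=2)

# One dummy batch of 4 samples per input, passed in the same order as `inputs`.
img_batch = tf.random.uniform((4, 32, 32, 3))
csv_batch = tf.random.uniform((4, 48, 64))
logits = model_small([img_batch, csv_batch])

# Convert logits to per-class probabilities.
probs = tf.nn.softmax(logits, axis=-1)
print(logits.shape)
```

The output should have one row per sample and one column per class; if the call fails here, the problem is in the model definition rather than in the data pipeline.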


UPDATE: For feeding your 2 inputs and 1 label, based on your code, it may be something like this:

Create sample dataset:

BATCH_SIZE = 8
train_image_data = tf.keras.utils.image_dataset_from_directory(dir_path,
                                       validation_split=0.2, subset="training",
                                       seed=123, label_mode=None,
                                       image_size=(IMG_SIZE, IMG_SIZE), batch_size=BATCH_SIZE)
test_image_data = tf.keras.utils.image_dataset_from_directory(dir_path,
                                       validation_split=0.2, subset="validation",
                                       seed=123, label_mode=None,
                                       image_size=(IMG_SIZE, IMG_SIZE), batch_size=BATCH_SIZE)
'''
Found 327 files belonging to 2 classes.
Using 262 files for training.
Found 327 files belonging to 2 classes.
Using 65 files for validation.
'''
# generate random csv data
# number of samples should be equal to images
# in other words for each image you should have 1 corresponding csv entry
train_num_data = tf.random.uniform((262,480,640))
test_num_data = tf.random.uniform((65,480,640))

# create csv dataset
train_num_data = tf.data.Dataset.from_tensor_slices(train_num_data).batch(BATCH_SIZE)
test_num_data = tf.data.Dataset.from_tensor_slices(test_num_data).batch(BATCH_SIZE)

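# generate random integer class labels
# (SparseCategoricalCrossentropy expects class indices, not floats)
y_train = tf.data.Dataset.from_tensor_slices(tf.random.uniform((262,), maxval=num_classes, dtype=tf.int32)).batch(BATCH_SIZE)
y_test = tf.data.Dataset.from_tensor_slices(tf.random.uniform((65,), maxval=num_classes, dtype=tf.int32)).batch(BATCH_SIZE)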
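
With real data, the random tensors above would be replaced by the contents of the CSV files. A minimal sketch of that step follows, assuming each CSV is purely numeric, comma-separated, has no header row, and has the same shape as all the others. The helper name `load_csv_for_images` is hypothetical; it matches each image to its CSV by file stem (subject-1.jpg to subject-1.csv), as described in the question.

```python
from pathlib import Path

import numpy as np

def load_csv_for_images(image_paths, csv_dir):
    """Load one CSV per image, matched by file stem, in the order of image_paths."""
    arrays = []
    for image_path in image_paths:
        # subject-1.jpg -> <csv_dir>/subject-1.csv
        csv_path = Path(csv_dir) / (Path(image_path).stem + ".csv")
        # Assumes numeric, comma-separated values with no header row.
        arrays.append(np.loadtxt(csv_path, delimiter=",", dtype=np.float32, ndmin=2))
    # Stack into one array of shape (num_images, rows, cols).
    return np.stack(arrays)
```

The resulting array can be passed to `tf.data.Dataset.from_tensor_slices` in place of the random tensor. The CSV order has to match the image order, so the image dataset would probably need to be created with `shuffle=False`; its `file_paths` attribute may then serve as `image_paths`.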

Define a generator:

def my_gen(subset):
    # pick the matching image, csv and label datasets for this subset
    if subset == "training":
        datasets = (train_image_data, train_num_data, y_train)
    else:
        datasets = (test_image_data, test_num_data, y_test)

    while True:
        # walk the three datasets in lockstep, one batch at a time
        for img_batch, csv_batch, labels_batch in zip(*datasets):
            yield ((img_batch, csv_batch), labels_batch)

gen_train = my_gen("training")
gen_valid = my_gen("validation")

Then train the model:

model.fit(gen_train, epochs=2, steps_per_epoch=3, validation_data=gen_valid, validation_steps=1)
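
As an alternative to a Python generator, the three datasets can be combined with `tf.data.Dataset.zip`, so that every element already has the `((image, csv), label)` structure a two-input model expects. This is a sketch under assumptions: the datasets here are random stand-ins with small shapes, and `N`, `img_ds`, `csv_ds` and `label_ds` are illustrative names. With real data, the three datasets must be in the same sample order and use the same batch size, which probably means creating the image dataset with `shuffle=False`.

```python
import tensorflow as tf

BATCH_SIZE = 8
N = 20  # hypothetical sample count, small to keep the sketch quick

# Random stand-ins for the three already-batched datasets.
img_ds = tf.data.Dataset.from_tensor_slices(
    tf.random.uniform((N, 32, 32, 3))).batch(BATCH_SIZE)
csv_ds = tf.data.Dataset.from_tensor_slices(
    tf.random.uniform((N, 48, 64))).batch(BATCH_SIZE)
label_ds = tf.data.Dataset.from_tensor_slices(
    tf.random.uniform((N,), maxval=2, dtype=tf.int32)).batch(BATCH_SIZE)

# Nest the two inputs in one tuple so each element is ((image, csv), label).
train_ds = tf.data.Dataset.zip(((img_ds, csv_ds), label_ds))
train_ds = train_ds.prefetch(tf.data.AUTOTUNE)

# Inspect the first element to confirm the structure and shapes.
(first_img, first_csv), first_label = next(iter(train_ds))
print(first_img.shape, first_csv.shape, first_label.shape)
```

A dataset built this way can be passed directly as the first argument of `model.fit`, with a second one built the same way as `validation_data`; `steps_per_epoch` is then not needed because the dataset is finite.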
Kaveh
  • Thank you for this, it is very informative! I am facing an issue, however: when I use the functional API and then train the model with `histroy = model.fit(x=[TRAINING_DATA, TRAINING_DATA_DEPTH], validation_data=([TESTING_DATA, TESTING_DATA_DEPTH]), epochs=5, batch_size=8)`, I receive an error saying `Failed to find data adapter that can handle input: ( containing values of types {""}), `, which I believe is caused by the dataset processing function. How should I process the image data for the functional API? – Fin M. Aug 16 '21 at 19:50
  • @FinnMckinn For feeding your data (2 inputs and 1 output), you need to write a custom generator to do it for you; then you can simply pass that generator. Take a look at this, which implements what you need: https://stackoverflow.com/a/67162437/2423278 – Kaveh Aug 16 '21 at 21:29
  • @FinnMckinn I also added some sample code, for your convenience, to explain what feeding your model should look like. – Kaveh Aug 16 '21 at 22:51
  • Again, thank you very much for your input, this is really assisting me. However, when model.fit() is called, the command seems to hang for a while and then gives this error: `ValueError: Layer model_1 expects 2 input(s), but it received 4 input tensors. Inputs received: [, , , ]`. I'm trying to fix this, to no avail. – Fin M. Aug 17 '21 at 14:19