
I have a multi-label classification problem that I am trying to solve with a neural network using TensorFlow 2.

The problem: I am trying to predict a cause and its corresponding severity. There can be n causes, and each cause can have m possible severities.

For simplicity, let's say:

  • number of causes = 2
  • number of possible severities per cause = 2
  • so we essentially have 4 possible outputs
  • we also have 4 input features

I wrote the code below:

import tensorflow as tf
from tensorflow.keras.layers import Dense, Flatten, Dropout
from tensorflow.keras import Model
from tensorflow.keras.callbacks import ModelCheckpoint

def get_model_multilabel(n_inputs, n_outputs):
    # `learning_rate` replaces the deprecated `lr` argument in TF 2.x
    opt = tf.keras.optimizers.SGD(learning_rate=0.01, momentum=0.9)
    model = tf.keras.models.Sequential([
        # first hidden layer (input_dim sets the input size)
        Dense(10, input_dim=n_inputs, kernel_initializer='he_uniform', activation='relu'),
        # two more hidden layers, each followed by dropout
        Dense(10, kernel_initializer='he_uniform', activation='relu'),
        Dropout(0.2),
        Dense(5, kernel_initializer='he_uniform', activation='relu'),
        Dropout(0.2),
        ## output layer
        Dense(n_outputs, activation='sigmoid')
    ])
    model.compile(loss='binary_crossentropy', optimizer=opt, metrics=['accuracy'])
    return model

n_inputs = 4 # because we have 4 features
n_outputs = 4 # because we have 4 labels

mlmodel = get_model_multilabel(n_inputs, n_outputs)

## train the model
mlmodel.fit(X_train,y_train, epochs=50, batch_size=32, validation_split = 0.2, callbacks=callbacks_list)

X_train.shape is (1144, 4) and y_train.shape is (1144,)

Note the sigmoid activation in the last layer and the binary_crossentropy loss function, which I chose because I am modelling a multi-label classification problem. Reference: How do I implement multilabel classification neural network with keras

When I train this, it throws the following error:

ValueError: logits and labels must have the same shape ((None, 4) vs (None, 1))

I am not sure what I am missing here. Please suggest.

nad

1 Answer


Your y_train has the wrong shape. It should be (1144, n_outputs), but it is (1144,), which is treated as (1144, 1) once reshaped. Because the number of samples per batch is not known in advance, that is reported as (None, 1). The labels must match the model's output shape, which is (None, 4). You have loaded the data incorrectly.
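
As a minimal sketch of what a correctly shaped label array could look like, the snippet below builds a multi-hot matrix with one column per (cause, severity) pair. It is only an illustration: the question does not show how the labels are stored, so the `samples` layout (a list of (cause, severity) index pairs per row) and the helper name `encode_multi_hot` are assumptions.

```python
import numpy as np

def encode_multi_hot(samples, n_causes, n_severities):
    """Hypothetical helper: build a (n_samples, n_causes * n_severities) label matrix.

    `samples` is assumed to be a list in which each element is a list of
    (cause_index, severity_index) pairs that apply to that sample.
    """
    y = np.zeros((len(samples), n_causes * n_severities), dtype=np.float32)
    for row, pairs in enumerate(samples):
        for cause, severity in pairs:
            # Each (cause, severity) pair gets its own output column.
            y[row, cause * n_severities + severity] = 1.0
    return y

# Made-up example data: 3 samples, 2 causes, 2 severities per cause.
samples = [[(0, 1)], [(0, 0), (1, 1)], []]
y_train = encode_multi_hot(samples, n_causes=2, n_severities=2)

# The second dimension now equals n_outputs, matching the sigmoid output layer.
print(y_train.shape)
```

With labels in this form, y_train.shape[1] equals n_outputs, so binary_crossentropy compares tensors of the same shape.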

desertnaut