
I am trying to build a CNN to classify text input into 4 classes, but the model always predicts the same class. I have used other, simpler machine learning algorithms with the same preprocessing I am using with this CNN and achieved satisfactory results, so I am confident my preprocessing is fine.

My model is defined as follows:

from tensorflow import keras
from tensorflow.keras import regularizers

model = keras.Sequential(
    [
        keras.layers.Input(shape=[vect_len], name="wide_input"),
        keras.layers.Embedding(vect_len, 100, input_length=vect_len),
        keras.layers.Conv1D(95, 4, activation='relu', kernel_regularizer=regularizers.l1_l2(l1=0.001, l2=0.01), bias_regularizer=regularizers.l2(0.04)),
        keras.layers.GlobalMaxPooling1D(data_format='channels_first'),
        keras.layers.Dense(4, activation='sigmoid')
    ]
)

And trained as follows:

model.summary()
model.compile(optimizer=keras.optimizers.Adam(learning_rate=0.0006), loss='categorical_crossentropy', metrics=['accuracy'])
batch_size = 16
epochs = 50
history = model.fit(input_train, target_train, epochs=epochs, batch_size=batch_size, validation_split=0.2, shuffle=True)

And after training my resulting confusion matrix is as follows.

[Confusion matrix image]

Following the advice in this post (https://stackoverflow.com/a/41493375/6543460), I reduced my dataset to only the data belonging to a single class i, to see whether the model could predict that class; I tried this for each of the 4 classes.
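For clarity, the single-class subsets can be selected by masking on the one-hot targets. This is a hypothetical sketch with stand-in data, since the real `input_train`/`target_train` are not shown:

```python
import numpy as np

# Stand-in data: 8 samples, 6 features, one-hot labels over 4 classes.
rng = np.random.default_rng(0)
input_train = rng.random((8, 6))
target_train = np.eye(4)[rng.integers(0, 4, size=8)]

i = 2  # class to isolate
mask = target_train.argmax(axis=1) == i
input_i, target_i = input_train[mask], target_train[mask]

# Every retained sample belongs to class i.
assert (target_i.argmax(axis=1) == i).all()
```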

I then created a 10-sample dataset with 5 copies each of 2 distinct datapoints to see if the model could classify them into their correct classes, but it couldn't. I also tried 15 datapoints from 3 classes, as follows:

import numpy as np

target_train = np.array([
    [0., 0., 0., 1.],
    [0., 0., 0., 1.],
    [0., 0., 0., 1.],
    [0., 0., 0., 1.],
    [0., 0., 0., 1.],
    [1., 0., 0., 0.],
    [1., 0., 0., 0.],
    [1., 0., 0., 0.],
    [1., 0., 0., 0.],
    [1., 0., 0., 0.],
    [0., 0., 1., 0.],
    [0., 0., 1., 0.],
    [0., 0., 1., 0.],
    [0., 0., 1., 0.],
    [0., 0., 1., 0.]
])

input_train = np.array([
    np.copy(a),
    np.copy(a),
    np.copy(a),
    np.copy(a),
    np.copy(a),
    np.copy(b),
    np.copy(b),
    np.copy(b),
    np.copy(b),
    np.copy(b),
    np.copy(c),
    np.copy(c),
    np.copy(c),
    np.copy(c),
    np.copy(c)
])
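For reference, an equivalent, more compact construction of this 15-sample set (a sketch; `a`, `b`, `c` here are stand-in feature vectors, since their real definitions are not shown):

```python
import numpy as np

# Stand-in datapoints; the real a, b, c come from the preprocessed text.
a, b, c = np.ones(8), np.zeros(8), np.full(8, 0.5)

# Rows of the identity matrix give the three one-hot labels used above
# (class 3, class 0, class 2), each repeated 5 times.
target_train = np.repeat(np.eye(4)[[3, 0, 2]], 5, axis=0)
input_train = np.repeat(np.stack([a, b, c]), 5, axis=0)
```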

[Confusion matrix image]

which also didn't work. So, using this same 15-sample dataset, I simplified my model until it could predict the classes accurately. Using just the dense layer I get what I need:

model = keras.Sequential(
    [
        keras.layers.Input(shape=[vect_len], name="wide_input"),
        # keras.layers.Embedding(vect_len, 100, input_length=vect_len),
        # keras.layers.Conv1D(95, 4, activation='relu', kernel_regularizer=regularizers.l1_l2(l1=0.001, l2=0.01), bias_regularizer=regularizers.l2(0.04)),
        # keras.layers.GlobalMaxPooling1D(data_format='channels_first'),
        keras.layers.Dense(4, activation='sigmoid')
    ]
)

[Confusion matrix image]

But with just the embedding and pooling layers (excluding the convolution layer) it doesn't work:

model = keras.Sequential(
    [
        keras.layers.Input(shape=[vect_len], name="wide_input"),
        keras.layers.Embedding(vect_len, 100, input_length=vect_len),
        # keras.layers.Conv1D(95, 4, activation='relu', kernel_regularizer=regularizers.l1_l2(l1=0.001, l2=0.01), bias_regularizer=regularizers.l2(0.04)),
        keras.layers.GlobalMaxPooling1D(data_format='channels_first'),
        keras.layers.Dense(4, activation='sigmoid')
    ]
)

[Confusion matrix image]

I think information is being lost somewhere in these steps, but I can't figure out what specifically is happening or how to fix it.
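One place information may be lost is the pooling axis. If I understand the Keras docs correctly, Embedding outputs (batch, steps, embedding_dim), and GlobalMaxPooling1D with data_format='channels_first' treats the last axis as the steps axis, so it would take the max over the embedding dimensions instead of over the sequence positions. A NumPy sketch of the two behaviors (hypothetical shapes):

```python
import numpy as np

batch, steps, dim = 2, 6, 100  # hypothetical sizes; dim = embedding size
x = np.random.default_rng(0).random((batch, steps, dim))  # Embedding output

# data_format='channels_last' (the default): max over the sequence steps,
# yielding one value per embedding dimension -> shape (batch, dim).
pooled_last = x.max(axis=1)

# data_format='channels_first': the last axis is assumed to be steps, so the
# max is taken over the embedding dimensions instead -> shape (batch, steps).
pooled_first = x.max(axis=2)

assert pooled_last.shape == (batch, dim)
assert pooled_first.shape == (batch, steps)
```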
