Keras model is only predicting one label

Question

I have developed a paraphrase detection model (Yes/No) that takes two phrases as input and is supposed to return whether one is a paraphrase of the other.

Based on suggestions here, I ensured that there is no class imbalance in the training dataset:

This is my model:

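from tensorflow.keras.layers import Input, Embedding, LSTM, Dense, concatenate
from tensorflow.keras.models import Model

left_input = Input(shape=(120, ))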
right_input = Input(shape=(120, ))
left_embedding = Embedding(vocab_size, 120, input_length=max_length)(left_input)
right_embedding = Embedding(vocab_size, 120, input_length=max_length)(right_input)
left_lstm = LSTM(120, input_shape=(1, 120))(left_embedding)
right_lstm = LSTM(120, input_shape=(1, 120))(right_embedding)
concat = concatenate([left_lstm, right_lstm], name='Concatenate')
model_output = Dense(1, activation='softmax')(concat)

model = Model(inputs=[left_input, right_input], outputs=model_output, name='Final_output')
model.compile(optimizer='adam', loss='binary_crossentropy')

model.summary()

These are the predictions my model made:

Can anybody point out what is the problem here?

Update 1: By replacing "softmax" with "sigmoid", I get the following values (all the same):

Which values can `Dense(1, activation='softmax')` output? It is only `1`, your network only outputs `1` in the end. — Frightera, Apr 09 '22 at 11:15
@Frightera I used one because it will be either 0 or 1. Is that incorrect? — Ashar, Apr 09 '22 at 11:21
Btw I did try using 2 but then I get the error in model.fit "ValueError: `logits` and `labels` must have the same shape, received ((None, 2) vs (None, 1))." — Ashar, Apr 09 '22 at 11:22
Your labels should be one-hot encoded for this, or you need to use `sigmoid` instead of `softmax`. — Frightera, Apr 09 '22 at 11:22
So I used sigmoid, but effectively it does the same thing. Instead of all 1's, it is predicting all values to be 0.472. Please see the updated post. — Ashar, Apr 09 '22 at 11:35
@Frightera The same happens if I one-hot encode the labels: it just predicts the same two values for all the labels. — Ashar, Apr 09 '22 at 11:42
Model configuration is correct for this task (BCE + Sigmoid + 1 neuron), there might be an issue with data or model architecture (units etc.) — Frightera, Apr 09 '22 at 16:58
@Frightera The data is (guaranteed) correct because it is provided by an external party. Can you please suggest what could be a problem with model architecture? I have a fairly simple model in place currently. — Ashar, Apr 10 '22 at 09:57
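
To illustrate the point made in the comments about `Dense(1, activation='softmax')`: softmax normalizes each output by the sum over all units, so with a single unit the value is divided by itself and the result is always 1, whatever the input. The sketch below uses plain Python rather than Keras so it runs without extra dependencies. The helper names `softmax`, `sigmoid`, and `one_hot` are hypothetical and written only for this illustration, and the logit values are made up. It shows only why the softmax version predicts all 1's; it does not explain the constant 0.472 seen after switching to sigmoid.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits (illustrative helper)."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sigmoid(z):
    """Logistic sigmoid: maps one logit to a value in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def one_hot(label, num_classes=2):
    """Turn an integer label (0 or 1) into a one-hot vector for a 2-unit softmax."""
    return [1.0 if i == label else 0.0 for i in range(num_classes)]

# Made-up logits, standing in for whatever the last Dense layer computes.
logits = (-5.0, 0.0, 3.2)

# One unit + softmax: each value is normalized by itself, so the output is always 1.0.
single_unit_softmax = [softmax([z])[0] for z in logits]

# One unit + sigmoid: the output varies with the logit, which suits binary labels of shape (None, 1).
single_unit_sigmoid = [sigmoid(z) for z in logits]

# Two units + softmax: also valid, but labels must then be one-hot, shape (None, 2).
two_unit_softmax = softmax([0.3, 1.1])
labels_one_hot = [one_hot(y) for y in (0, 1, 1)]

print("softmax, 1 unit :", single_unit_softmax)
print("sigmoid, 1 unit :", single_unit_sigmoid)
print("softmax, 2 units:", two_unit_softmax)
print("one-hot labels  :", labels_one_hot)
```

This matches the two options mentioned above: either one sigmoid unit with 0/1 labels and `binary_crossentropy`, or two softmax units with one-hot labels (the shape mismatch error in `model.fit` comes from pairing two output units with labels of shape `(None, 1)`).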
