I have developed a paraphrase detection model (Yes/No) that takes two phrases as input and is supposed to return whether one is a paraphrase of the other.

Based on suggestions here, I ensured that there is no class imbalance in the training dataset:

Train labels
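For readers who want to run the same kind of balance check, here is a minimal sketch in plain Python. The helper `class_balance` and the label list are made up for illustration; they are not the actual training labels from this question.

```python
from collections import Counter


def class_balance(labels):
    """Return the fraction of samples belonging to each class."""
    counts = Counter(labels)
    total = len(labels)
    return {label: count / total for label, count in counts.items()}


# Hypothetical labels: 1 = paraphrase, 0 = not a paraphrase.
example_labels = [0, 1, 1, 0, 1, 0]
balance = class_balance(example_labels)
print(balance)
```

Fractions close to 0.5 for both classes suggest the binary dataset is balanced.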

This is my model:

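# Imports assumed to come from tf.keras; vocab_size and max_length are defined elsewhere.
from tensorflow.keras.layers import Input, Embedding, LSTM, Dense, concatenate
from tensorflow.keras.models import Model

left_input = Input(shape=(120, ))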
right_input = Input(shape=(120, ))
left_embedding = Embedding(vocab_size, 120, input_length=max_length)(left_input)
right_embedding = Embedding(vocab_size, 120, input_length=max_length)(right_input)
left_lstm = LSTM(120, input_shape=(1, 120))(left_embedding)
right_lstm = LSTM(120, input_shape=(1, 120))(right_embedding)
concat = concatenate([left_lstm, right_lstm], name='Concatenate')
model_output = Dense(1, activation='softmax')(concat)

model = Model(inputs=[left_input, right_input], outputs=model_output, name='Final_output')
model.compile(optimizer='adam', loss='binary_crossentropy')

model.summary()

These are the predictions my model made:

Predictions

Confusion matrix

Can anybody point out what is the problem here?

Update-1: After replacing "softmax" with "sigmoid", I get the following values (all the same):

[Image: predictions after switching to sigmoid]

Ashar
  • Which values can `Dense(1, activation='softmax')` output? It is only `1`, your network only outputs `1` in the end. – Frightera Apr 09 '22 at 11:15
  • @Frightera I used one because it will be either 0 or 1. Is that incorrect? – Ashar Apr 09 '22 at 11:21
  • Btw I did try using 2 but then I get the error in model.fit "ValueError: `logits` and `labels` must have the same shape, received ((None, 2) vs (None, 1))." – Ashar Apr 09 '22 at 11:22
  • Your labels should be one-hot encoded for this, or you need to use `sigmoid` instead of `softmax`. – Frightera Apr 09 '22 at 11:22
  • So I used sigmoid, but effectively it does the same thing. Instead of all 1's, it is predicting all values to be 0.472. Please see the updated post. – Ashar Apr 09 '22 at 11:35
  • @Frightera The same happens if I one-hot encode the labels; it just predicts the same two values for all the labels. – Ashar Apr 09 '22 at 11:42
  • Model configuration is correct for this task (BCE + Sigmoid + 1 neuron), there might be an issue with data or model architecture (units etc.) – Frightera Apr 09 '22 at 16:58
  • @Frightera The data is (guaranteed) correct because it is provided by an external party. Can you please suggest what could be a problem with model architecture? I have a fairly simple model in place currently. – Ashar Apr 10 '22 at 09:57
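
The point made in the comments about `Dense(1, activation='softmax')` can be checked without Keras. The sketch below is plain Python, with hand-written `softmax`, `sigmoid`, and `one_hot` helpers that are illustrative stand-ins rather than the Keras implementations. It shows that softmax over a single unit always yields 1, that sigmoid on the same logits yields different values, and what one-hot labels for a two-unit softmax head might look like.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(z):
    """Logistic function mapping a single logit to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def one_hot(label, num_classes=2):
    """Encode an integer class label as a one-hot list."""
    return [1.0 if i == label else 0.0 for i in range(num_classes)]


# Arbitrary logits standing in for the output of the final Dense layer.
logits = [-3.0, 0.0, 2.5]

# Softmax normalizes across units; with one unit it divides a value by itself.
single_unit_softmax = [softmax([z])[0] for z in logits]

# Sigmoid treats the single logit as the score for the positive class.
single_unit_sigmoid = [sigmoid(z) for z in logits]

# A two-unit softmax head needs labels of shape (2,), for example one-hot.
two_unit_softmax = softmax([0.2, 1.3])
label_for_two_units = one_hot(1)

print(single_unit_softmax)
print(single_unit_sigmoid)
print(two_unit_softmax, label_for_two_units)
```

This only explains the constant output of 1 with softmax. It does not explain why the sigmoid version predicts the same value for every input, which, as noted in the comments, may point to the data or the rest of the architecture.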
