
I am trying to design a bi-directional LSTM using word2vec features for a binary classification problem.

```python
from keras.models import Sequential
from keras.layers import Embedding, Bidirectional, LSTM, Dense

my_model = Sequential()
my_model.add(Embedding(words, 10, input_length=trainDataVecs.shape[1], weights=[embedding_matrix], trainable=True))
# `init` was renamed `kernel_initializer` in Keras 2
my_model.add(Bidirectional(LSTM(20, activation='tanh', kernel_initializer='glorot_uniform', recurrent_dropout=0.2, dropout=0.2)))
```

Since there are two classes,

```python
my_model.add(Dense(2, activation='softmax'))
```

```python
import tensorflow as tf
from keras import backend as K

def auc(y_true, y_pred):
    auc_value = tf.metrics.auc(y_true, y_pred)[1]
    K.get_session().run(tf.local_variables_initializer())
    return auc_value
```

I am using AUC as the metric:

```python
from keras.optimizers import RMSprop

print("Compiling...")
optimizer = RMSprop(lr=0.0001, rho=0.9, epsilon=1e-08)
my_model.compile(optimizer=optimizer,
                 loss='categorical_crossentropy',
                 metrics=[auc])
```

```python
my_model.fit(trainDataVecs, Y_train, shuffle=True, batch_size=10, epochs=20)
```

But I am getting:

```
ValueError: Error when checking target: expected dense_2 to have shape (2,) but got array with shape (1,)
```

because `Y_train` is 1-D. I could use sigmoid instead of softmax, but then `predict` returns probabilities. My labels are 0 or 1, and I want to compute the F-score between the predicted and real values.
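One way to get hard 0/1 labels back from a sigmoid output (so the F-score can still be computed) is to threshold the predicted probabilities; a minimal sketch with made-up probabilities, assuming the conventional 0.5 cutoff:

```python
import numpy as np

# Hypothetical probabilities, as returned by model.predict() with a sigmoid output
probs = np.array([0.1, 0.8, 0.55, 0.3])

# Threshold at 0.5 to recover hard 0/1 labels for F-score computation
preds = (probs >= 0.5).astype(int)

print(preds.tolist())  # [0, 1, 1, 0]
```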

    You need to convert your label to categorical. You can use the to_categorical method defined in [Keras Utils](https://keras.io/utils/) to do this. Here your num_classes would be 2. – kvish Oct 08 '18 at 19:50
  • @kvish: Thanks. Should I change the loss? I know that I should use `categorical_crossentropy` but I am not sure if I should do that in Binary classification problem. – amy Oct 11 '18 at 16:45
  • [this answer](https://stackoverflow.com/a/46038271/10111931) might help you understand a bit more about that. Essentially, since you are using your own metric, it's fine. – kvish Oct 11 '18 at 16:57
  • @kvish I saw that post and that's why I asked you. People mention that with `to_categorical`, we need to use `categorical_crossentropy`. But I just want to make sure what is a good way for a binary classification problem. Also, when I used sigmoid as the activation for the last layer, it gave me probabilities, so I am not sure which method is better. – amy Oct 11 '18 at 17:00
  • If you want to go that way, you could define one node, use sigmoid activation, and do your job. I think using categorical should be pretty fine, unless you have reasons for having a sigmoid activation instead of a softmax layer. You get the added benefit of scaling your model to more classes by just changing the number of `Dense` nodes. – kvish Oct 11 '18 at 17:27
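The `to_categorical` conversion suggested in the comments turns the 1-D label vector into an `(n, 2)` one-hot matrix that matches the `Dense(2, activation='softmax')` output. A numpy sketch of the same transformation on a made-up label vector (the real code would call `keras.utils.to_categorical(Y_train, num_classes=2)`):

```python
import numpy as np

# Hypothetical binary labels, shape (5,)
y_train = np.array([0, 1, 1, 0, 1])

# One-hot encode to shape (5, 2); equivalent to to_categorical(y_train, num_classes=2)
y_train_cat = np.eye(2)[y_train]

print(y_train_cat.shape)  # (5, 2)
```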

0 Answers