
When training my network, I have a multi-label classification problem in which I convert the class labels into a one-hot encoding.

After training the model and generating predictions, Keras simply outputs an array of values without specifying the class labels.

What is the best practice to merge these, so my API can return meaningful results to the consumer?

Example

y = pd.get_dummies(df_merged.eventId)
y

    2CBC9h3uple1SXxEVy8W  GiiFxmfrUwBNMGgFuoHo  e06onPbpyCucAGXw01mM
12                     1                     0                     0
13                     1                     0                     0
14                     1                     0                     0

prediction = model.predict(pred_test_input)
prediction
array([[0.5002058 , 0.49697363, 0.50251794]], dtype=float32)

Desired outcome: {"results": {"2CBC9h3uple1SXxEVy8W": 0.5002058, ...}}

EDIT: Adding model as per comment - but this is just a toy model.

model = Sequential()
model.add(
  Embedding(
    input_dim=embeddings_index.shape[0],
    output_dim=embeddings_index.shape[1],
    weights=[embeddings_index],
    input_length=MAX_SEQ_LENGTH,
    trainable=False,
  )
)
model.add(LSTM(300))
model.add(Dense(units=len(y.columns), activation='sigmoid'))
model.compile(loss='categorical_crossentropy', optimizer='sgd', metrics=['accuracy'])

EDIT 2 - adding y.

So my y is in the following format:

eventId
123
123
234
...

I then use y = pd.get_dummies(df_merged.eventId) to convert this into something the model can consume, and I would like to map the eventIds back onto the predictions.
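
For illustration, the mapping I am after would look roughly like the sketch below (assuming the order of y.columns matches the order of the model's output units, since the final Dense layer was built with len(y.columns) units):

# Rough sketch of the desired mapping: pair each dummy column (eventId)
# with the corresponding score from the prediction array
results = dict(zip(y.columns, prediction[0].tolist()))
# -> {'2CBC9h3uple1SXxEVy8W': 0.5002058, ...}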


1 Answer


First of all, if you are doing multi-label classification, then you should use the binary_crossentropy loss:

model.compile(loss='binary_crossentropy', optimizer='sgd', metrics=['accuracy'])

Then it is important to note that Keras' accuracy does not account for multi-label classification, so it will be a misleading metric. More appropriate metrics are per-class precision and recall.
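
For example, a minimal sketch of per-class precision/recall using scikit-learn (x_val and y_val are hypothetical held-out data; the fixed 0.5 threshold is just a starting point):

from sklearn.metrics import classification_report

# Threshold the sigmoid outputs at 0.5 to get hard 0/1 labels per class
y_pred = (model.predict(x_val) > 0.5).astype(int)

# Per-class precision/recall; assumes y_val is the one-hot (multi-label) target
print(classification_report(y_val, y_pred, target_names=y.columns.tolist()))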

To get class predictions, you have to threshold each class's predicted probability, and that threshold is something you have to tune (it does not have to be the same for each class). For example:

class_names = y.columns.tolist()
pred_classes = {}
preds = model.predict(pred_test_input)[0]  # predict returns shape (1, num_classes)

thresh = 0.5
for i, class_name in enumerate(class_names):
    if preds[i] > thresh:
        pred_classes[class_name] = float(preds[i])

This builds the pred_classes dictionary containing only the classes whose score is above the threshold, along with their confidence scores.
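
To return this in the shape asked for in the question, a small sketch (using pred_classes from the loop above; the scores were already cast to plain floats so they are JSON serializable):

import json

# Wrap the thresholded predictions in the {"results": {...}} shape from the question
response = {"results": pred_classes}
print(json.dumps(response))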

  • Appreciate the reply, however where does `class_name` come from the `preds` list does not have any reference to the labels. – dendog Jun 25 '19 at 10:40
  • @dendog That is external to Keras, it's just a mapping from class index to class name, it's the inverse operation from when you encoded your labels into a numeric form. – Dr. Snoopy Jun 25 '19 at 11:09
  • This is exactly the part I wanted help with - to ensure they are mapped correctly. Do you have an example of a helper function here to create that mapping? I am going to add my code which creates dummies. – dendog Jun 25 '19 at 11:11
  • Also @Matias Valdenegro, please check your comment regarding binary cross-entropy; most of the resources I see state it is for two classes only. – dendog Jun 25 '19 at 14:46
  • @dendog There is nothing to check, you are doing multi-label classification, which is N binary classification problems, where N is the number of classes. – Dr. Snoopy Jun 25 '19 at 15:28
  • @dendog So you can use class_names = y.columns.tolist(), it should give you the array mapping. – Dr. Snoopy Jun 25 '19 at 18:41
  • take a look at this - sorry to keep harping on but just trying to understand https://stackoverflow.com/questions/42081257/keras-binary-crossentropy-vs-categorical-crossentropy-performance – dendog Jun 27 '19 at 14:08
  • @dendog What should I look at there? It is a bit rude to just throw links at people. – Dr. Snoopy Jun 27 '19 at 14:10
  • Sorry @Matias Valdenegro - was not trying to be rude, it was in relation to the binary vs categorical loss discussion – dendog Jun 27 '19 at 14:22
  • @dendog You told us that you are doing multi-label classification, see this then: https://stackoverflow.com/a/49175655/349130 – Dr. Snoopy Jun 27 '19 at 14:27
  • Ok thanks! Seems you are correct - I did not expect the distinction between class/label – dendog Jun 27 '19 at 14:32