I have an LSTM sequence tagger in Keras that I use on highly unbalanced data. I'd therefore like to use the (multiclass) F1-score as the model's main metric. I have two questions:
1) I use zero-padding in the data (and thus `mask_zero=True` in my embedding layer), and the loss is computed over the masked data automatically. However, I suppose that masking has to be done manually when computing a custom metric? Is there an efficient vectorized solution for that? (A rough sketch of what I'm imagining is below.)
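For concreteness, here is an untested sketch of the kind of vectorized masked metric I have in mind. It assumes one-hot targets, that the padding positions carry label index 0 (so the mask can be recovered from `y_true` itself), and a hypothetical `NUM_CLASSES` constant:

```python
from keras import backend as K

NUM_CLASSES = 10  # hypothetical number of tag classes; class 0 = padding

def masked_macro_f1(y_true, y_pred):
    # Macro-averaged F1 over classes 1..NUM_CLASSES-1, ignoring timesteps
    # whose true label is the padding class 0.
    y_true_ids = K.argmax(y_true, axis=-1)
    y_pred_ids = K.argmax(y_pred, axis=-1)
    mask = K.cast(K.not_equal(y_true_ids, 0), K.floatx())
    f1s = []
    for c in range(1, NUM_CLASSES):  # loop over classes at graph-build time
        true_c = K.cast(K.equal(y_true_ids, c), K.floatx()) * mask
        pred_c = K.cast(K.equal(y_pred_ids, c), K.floatx()) * mask
        tp = K.sum(true_c * pred_c)
        precision = tp / (K.sum(pred_c) + K.epsilon())
        recall = tp / (K.sum(true_c) + K.epsilon())
        f1s.append(2 * precision * recall / (precision + recall + K.epsilon()))
    return K.mean(K.stack(f1s))
```

One thing I'm aware of: Keras averages metric values over batches, so this would give a mean of per-batch F1 scores rather than the corpus-level F1.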
2) Is it possible to pass sklearn's `f1_score` implementation into the model's `compile` (maybe after wrapping it in some way)? Right off the bat, it didn't work, apparently because a placeholder was passed into it rather than a numpy array (I use the TensorFlow backend). (A sketch of the one wrapping idea I've had is below.)
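The only wrapping idea I've come up with is `tf.py_func`, which defers the call until actual numpy arrays are available at run time. An untested sketch, again assuming one-hot targets and padding label index 0:

```python
import numpy as np
import tensorflow as tf
from sklearn.metrics import f1_score

def sklearn_f1(y_true, y_pred):
    # tf.py_func runs the wrapped numpy function at session-run time,
    # when the placeholders have been fed with real arrays.
    def _f1(y_true_np, y_pred_np):
        true_ids = y_true_np.argmax(axis=-1).ravel()
        pred_ids = y_pred_np.argmax(axis=-1).ravel()
        keep = true_ids != 0  # drop padded positions (label 0 assumed)
        return np.float32(f1_score(true_ids[keep], pred_ids[keep],
                                   average='macro'))
    return tf.py_func(_f1, [y_true, y_pred], tf.float32)

# model.compile(optimizer='adam', loss='categorical_crossentropy',
#               metrics=[sklearn_f1])
```

I realize this ties the model to the TensorFlow backend and to a single Python process (the op can't be serialized with the model), and again it only gives per-batch values, so I'm not sure it's the right approach.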
[UPD] Given my implementation, there's now a further question: I'm not sure whether it's possible to have the output of the model masked as well. Since we don't care about the model's output at the 'pad' input positions (they don't contribute to the loss anyway), there may well be random garbage in the output there, which will affect the F1 metric. Ideally those positions would contain zeros as well.
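The only way I can think of to get zeros there is a small custom layer, placed after the final `TimeDistributed(Dense(...))`, that multiplies the incoming mask into the output and stops the mask from propagating further. An untested sketch:

```python
from keras import backend as K
from keras.engine import Layer

class ZeroMaskedOutput(Layer):
    """Zeroes out the timesteps the incoming mask marks as padding, and
    consumes the mask so downstream code sees plain zeros instead."""

    def __init__(self, **kwargs):
        super(ZeroMaskedOutput, self).__init__(**kwargs)
        self.supports_masking = True

    def compute_mask(self, inputs, mask=None):
        return None  # consume the mask here

    def call(self, inputs, mask=None):
        if mask is None:
            return inputs
        # mask: (batch, timesteps) booleans; broadcast over the class axis
        return inputs * K.cast(K.expand_dims(mask, -1), K.floatx())

    def compute_output_shape(self, input_shape):
        return input_shape
```

Would something like this be the idiomatic way to do it, or is there a built-in mechanism I'm missing?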