I am working on a model based on this paper, and I am getting an exception because the `GlobalMaxPooling1D` layer does not support masking.

I have an `Embedding` layer with the `mask_zero` argument set to `True`. However, since the subsequent `GlobalMaxPooling1D` layer does not support masking, an exception is raised. The exception is expected: the documentation of the `Embedding` layer states that any layer following an `Embedding` layer with `mask_zero=True` must support masking.
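Here is a minimal example that reproduces the exception (a sketch against Keras 2.x; the sizes are placeholders, not the ones from my model):

```python
from keras.models import Sequential
from keras.layers import Embedding, GlobalMaxPooling1D

model = Sequential()
# mask_zero=True makes the Embedding layer emit a timestep mask.
model.add(Embedding(input_dim=1000, output_dim=64, input_length=50,
                    mask_zero=True))
# Raises TypeError: the layer does not support masking,
# but was passed an input_mask by the Embedding layer.
model.add(GlobalMaxPooling1D())
```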
However, as I am processing sentences with a variable number of words in them, I do need the masking in the `Embedding` layer to handle the varying input lengths. My question is: how should I alter my model so that masking remains part of the model and does not cause a problem at the `GlobalMaxPooling1D` layer?
Below is the code for the model.
```python
from keras.models import Sequential
from keras.layers import (Embedding, TimeDistributed, Bidirectional, LSTM,
                          Dropout, GlobalMaxPooling1D)
from keras import regularizers
from keras_contrib.layers import CRF

# dictionary_size, num_word_dimensions, embedding_weights, max_conversation_length,
# timesteps, m, h, num_tags, and optimizer are defined earlier in my script.
model = Sequential()
embedding_layer = Embedding(dictionary_size, num_word_dimensions,
                            weights=[embedding_weights], mask_zero=True,
                            embeddings_regularizer=regularizers.l2(0.0001))
model.add(TimeDistributed(embedding_layer,
                          input_shape=(max_conversation_length, timesteps)))
model.add(TimeDistributed(Bidirectional(LSTM(m // 2, return_sequences=True,
                                             kernel_regularizer=regularizers.l2(0.0001)))))
model.add(TimeDistributed(Dropout(0.2)))
# This is the layer that triggers the exception, as it does not support masking.
model.add(TimeDistributed(GlobalMaxPooling1D()))
model.add(Bidirectional(LSTM(h // 2, return_sequences=True,
                             kernel_regularizer=regularizers.l2(0.0001)),
                        merge_mode='concat'))
model.add(Dropout(0.2))
crf = CRF(num_tags, sparse_target=False,
          kernel_regularizer=regularizers.l2(0.0001))
model.add(crf)
model.compile(optimizer, loss=crf.loss_function, metrics=[crf.accuracy])
```
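One direction I have considered is replacing `GlobalMaxPooling1D` with a custom masking-aware pooling layer along these lines (just a sketch I put together against Keras 2.x; `MaskedGlobalMaxPooling1D` is my own name, and I have not verified how it interacts with `TimeDistributed`):

```python
from keras import backend as K
from keras.layers import Layer

class MaskedGlobalMaxPooling1D(Layer):
    """Global max pooling over the time axis that ignores masked timesteps."""
    def __init__(self, **kwargs):
        super(MaskedGlobalMaxPooling1D, self).__init__(**kwargs)
        self.supports_masking = True

    def compute_mask(self, inputs, mask=None):
        # The time axis is reduced away, so do not propagate the mask further.
        return None

    def call(self, inputs, mask=None):
        if mask is not None:
            # Push masked (padded) timesteps to a large negative value
            # so they can never win the max.
            mask = K.cast(mask, K.floatx())               # (batch, timesteps)
            inputs = inputs - (1.0 - K.expand_dims(mask, axis=-1)) * 1e9
        return K.max(inputs, axis=1)                      # (batch, features)

    def compute_output_shape(self, input_shape):
        return (input_shape[0], input_shape[2])
```

Is this the right way to keep the masking, or is there a better way to restructure the model?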