What are the exact input parameters for tensorflow CTC-loss function (tf.nn.ctc_loss)?

Question

class CTCLoss(keras.losses.Loss):

    def __init__(self, logits_time_major=False, blank_index=-1,
                 reduction=keras.losses.Reduction.AUTO, name='ctc_loss'):
        super().__init__(reduction=reduction, name=name)
        self.logits_time_major = logits_time_major
        self.blank_index = blank_index

    def call(self, y_true, y_pred):
        y_true = tf.cast(y_true, tf.int32)
        y_true = tf.reshape(y_true, [batch_size, max_label_seq_length])
        y_pred = tf.reshape(y_pred, [frames, batch_size, num_labels])
        loss = tf.nn.ctc_loss(
            labels=y_true,
            logits=y_pred,
            label_length=4480,
            logit_length=4480)
        return tf.reduce_mean(loss)

model = Sequential()
model.add(Bidirectional(LSTM(35, input_shape=X_train.shape, return_sequences=True)))
# didn't add the hidden layers in this code snippet. 
model.add(Flatten())
model.add(Dense(4480, activation='softmax'))

model.compile(optimizer='adam',
              loss=CTCLoss(),
              metrics=['accuracy'])

I am working on an online handwriting recognition problem and want to use the CTC loss function for it. I tried using the class above as my CTC loss, but it throws an error about the dimensions. Can someone please explain what each of the parameters of tf.nn.ctc_loss is, especially what 'frames' means in [frames, batch_size, num_labels]? Please let me know where I am going wrong in this code. My X_train has the shape (1311, 919, 3). Thanks.
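For reference, here is a minimal sketch of how the arguments of tf.nn.ctc_loss are shaped, as far as the documented signature goes. All sizes below (batch_size, frames, num_labels, max_label_len) are hypothetical toy values and not taken from the model above. 'frames' is the number of time steps in the network output; with X_train of shape (1311, 919, 3) it would be 919 if the model kept the time axis.

```python
import numpy as np
import tensorflow as tf

# Hypothetical toy sizes, chosen only for illustration.
batch_size = 2
frames = 50          # time steps per sample in the network output
num_labels = 28      # number of output classes, including the blank
max_label_len = 10   # labels are padded to this length

rng = np.random.default_rng(0)

# logits: unnormalized scores (no softmax applied),
# shape [batch_size, frames, num_labels] because logits_time_major=False.
logits = tf.constant(
    rng.normal(size=(batch_size, frames, num_labels)), dtype=tf.float32)

# labels: dense int32 tensor, shape [batch_size, max_label_len].
# Class 0 is reserved for the blank here, so real labels start at 1.
labels = tf.constant(
    rng.integers(1, num_labels, size=(batch_size, max_label_len)),
    dtype=tf.int32)

# label_length: true (unpadded) length of each label sequence, shape [batch_size].
label_length = tf.constant([10, 7], dtype=tf.int32)

# logit_length: number of valid frames per sample, shape [batch_size].
logit_length = tf.fill([batch_size], frames)

loss = tf.nn.ctc_loss(
    labels=labels,
    logits=logits,
    label_length=label_length,
    logit_length=logit_length,
    logits_time_major=False,
    blank_index=0)

# One loss value (negative log-likelihood) per sample in the batch.
print(loss.shape)
```

Under this reading, label_length and logit_length are per-sample length vectors of shape [batch_size] rather than a single scalar such as 4480, and the logits still carry a time axis, which a Flatten layer would remove.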
