I am building a handwriting recognition model that currently reaches 88% validation accuracy. I came across this GitHub page, which can help the model make more accurate predictions by using a dictionary.

The problem is that I don't know how to implement this in my current model. This is my current CTC layer, which is copied from a Keras tutorial. How can I modify it to add a dictionary?

import tensorflow as tf
from tensorflow import keras

class CTCLayer(keras.layers.Layer):
    def __init__(self, name=None):
        super().__init__(name=name)
        self.loss_fn = keras.backend.ctc_batch_cost

    def call(self, y_true, y_pred):
        batch_len = tf.cast(tf.shape(y_true)[0], dtype="int64")
        input_length = tf.cast(tf.shape(y_pred)[1], dtype="int64")
        label_length = tf.cast(tf.shape(y_true)[1], dtype="int64")

        input_length = input_length * tf.ones(shape=(batch_len, 1), dtype="int64")
        label_length = label_length * tf.ones(shape=(batch_len, 1), dtype="int64")
        loss = self.loss_fn(y_true, y_pred, input_length, label_length)
        self.add_loss(loss)

        # At test time, just return the computed predictions.
        return y_pred

This is the implementation of word beam search on the original GitHub page. To be specific, my main problem is getting the loss from this function and then returning that loss to the CTC layer.


from word_beam_search import WordBeamSearch

# Excerpt from inside a class method, so `self` is the enclosing object.
chars = ''.join(self.char_list)
with open('../model/wordCharList.txt') as f:
    word_chars = f.read().splitlines()[0]
with open('../data/corpus.txt') as f:
    corpus = f.read()

# decode using the "Words" mode of word beam search
self.decoder = WordBeamSearch(50, 'Words', 0.0, corpus.encode('utf8'), chars.encode('utf8'), word_chars.encode('utf8'))

I tried looking at GitHub projects that implement this, but they seem to use TensorFlow v1, which is confusing for me since I am a beginner in this field. Any response would be appreciated. Thank you.

1 Answer

Word beam search is only a decoder, not a loss function. For the loss, you still use the "standard" CTC loss that ships with Keras. This means that in your training code you don't have to think about word beam search at all.

Word beam search is applied at inference only. All you have to do is convert the tensors to NumPy arrays; for details, see the documentation.
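As a rough sketch of that inference step: the helper names below (`to_wbs_input`, `labels_to_text`, `decode_batch`) are hypothetical and not part of either library. It assumes the Keras prediction model outputs softmax scores shaped (batch, time, classes), that the decoder's `compute` expects (time, batch, classes) with the CTC blank as the last class, and that it returns one list of character indices per batch element, as described in the word beam search documentation. Check these assumptions against your own model before relying on it.

```python
import numpy as np


def to_wbs_input(y_pred):
    """Reorder Keras output (batch, time, classes) to (time, batch, classes).

    Assumption: the last class index is the CTC blank and the number of
    classes equals len(chars) + 1.
    """
    mat = np.asarray(y_pred)
    if mat.ndim != 3:
        raise ValueError("expected a 3-D array shaped (batch, time, classes)")
    return np.transpose(mat, (1, 0, 2))


def labels_to_text(label_lists, chars):
    """Map each list of character indices back to a string."""
    return ["".join(chars[i] for i in labels) for labels in label_lists]


def decode_batch(decoder, y_pred, chars):
    """Decode one batch with any object that exposes compute(mat)."""
    label_lists = decoder.compute(to_wbs_input(y_pred))
    return labels_to_text(label_lists, chars)
```

Here `decoder` would be the `WordBeamSearch` instance built as in the question, `y_pred` the output of the prediction model (the model without the CTC layer) for a batch of images, and `chars` the same character string that was passed to the decoder. Training stays unchanged and keeps using the CTC layer.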

Harry
  • Thanks @Harry for clarifying that word beam search is not a loss function. – Mendrix Manlangit Apr 14 '22 at 11:40
  • Follow-up question. According to this thread (https://git.io/J0eXP), Keras CTC requires len(chars) + 2 to account for the CTC blank characters, whereas word beam search requires len(chars) + 1. I tried adding characters to chars to match the size, but the result is less accurate than greedy search. How can I fix this? – Mendrix Manlangit Apr 15 '22 at 16:24
  • Update: I tried using +1 for the length of chars and it works now. Thanks @Harry for the input. – Mendrix Manlangit Apr 18 '22 at 13:47
  • @Harry any chance you can help out here https://stackoverflow.com/questions/75120076/practically-implementing-ctcloss – John Stud Jan 14 '23 at 22:11