I am trying to use the CTC loss function in my network, but I don't quite understand when to feed the 'blank' label as a label.
I use it for gesture recognition as described by Molchanov, but what confuses me is that there is also a 'no gesture' class.
The TensorFlow docs state:
The inputs Tensor's innermost dimension size, num_classes, represents num_labels + 1 classes, where num_labels is the number of true labels, and the largest value (num_classes - 1) is reserved for the blank label.
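To make sure I read that correctly, here is a minimal sketch of the index layout as I understand it. The gesture names and the count of three are made up for illustration; only the num_labels + 1 rule comes from the docs.

```python
# Hypothetical example: 3 real gesture classes (names are made up).
gesture_names = ["swipe_left", "swipe_right", "tap"]

num_labels = len(gesture_names)   # number of true labels
num_classes = num_labels + 1      # innermost dimension of the inputs tensor
blank_index = num_classes - 1     # largest index, reserved for the blank label

# As I understand the quoted docs, target indices run from 0 to num_labels - 1.
valid_target_indices = list(range(num_labels))

print(num_classes, blank_index, valid_target_indices)
```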
If I now use the 'blank' label to indicate that there is no gesture, I am limited in my training because of this error:
Saw a non-null label (index >= num_classes - 1) following a null label
I am assuming that the null label is the same as the blank label.
The problem is that when I feed data that starts with no gesture (mapped to the null label) and is then followed by a gesture, I get exactly this error. I can avoid it by adding two more labels next to my existing ones: one for 'no gesture' and one for the 'blank/null' label. I then only ever feed the 'no gesture' label and never the 'blank' label, but this doesn't seem quite right.
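For concreteness, this is roughly what my workaround looks like. It is only a sketch: the gesture names and the encode_targets helper are hypothetical, not taken from my actual code or from TensorFlow.

```python
# Hypothetical gesture names, for illustration only.
gesture_names = ["swipe_left", "swipe_right", "tap"]

# Workaround: 'no_gesture' becomes an ordinary label, and one extra class is
# added on top so that the largest index stays reserved for the blank label.
label_names = gesture_names + ["no_gesture"]
label_to_index = {name: i for i, name in enumerate(label_names)}

num_labels = len(label_names)   # true labels, including 'no_gesture'
num_classes = num_labels + 1    # network outputs per time step
blank_index = num_classes - 1   # never written into the targets


def encode_targets(names):
    """Map a sequence of label names to integer targets (hypothetical helper)."""
    return [label_to_index[name] for name in names]


# A sequence that starts with no gesture and then contains a gesture.
targets = encode_targets(["no_gesture", "swipe_left"])
print(targets)
```

With this layout the targets never contain the index num_classes - 1, so the error above no longer occurs.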
So my question is, what should I use the 'blank/null' label for?
I can imagine that in language processing you would usually use the sentence-ending period as the 'null' label? But in my case there is no ending gesture, since the input is one continuous stream.
Thank you