
Trying to run a beam search in a Keras model, I get confusing (and conflicting?) error messages. My model has inputs such as

inputs = Input(name='spectrograms',
               shape=(None, hparams["n_spectrogram"]))
input_length = Input(name='len_spectrograms',
                     shape=[1], dtype='int64')

and the CTC loss function requires shape [1] for the input and label lengths. As far as I understand, the output should be obtained with something like

# Stick connectionist temporal classification on the end of the core model
paths = K.function(
    [inputs, input_length],
    K.ctc_decode(output, input_length, greedy=False, top_paths=4)[0])

but as-is, that leads to a complaint about the shape of input_length

ValueError: Shape must be rank 1 but is rank 2 for 'CTCBeamSearchDecoder' (op: 'CTCBeamSearchDecoder') with input shapes: [?,?,44], [?,1].

but if I chop off that dimension

    K.ctc_decode(output, input_length[..., 0], greedy=False, top_paths=4)[0])

the model definition runs, but when I call y = paths([x, numpy.array([[30], [30]])]) with x.shape == (2, 30, 513) I suddenly get

tensorflow.python.framework.errors_impl.InvalidArgumentError: sequence_length is not a vector
     [[{{node CTCBeamSearchDecoder}} = CTCBeamSearchDecoder[beam_width=100, merge_repeated=true, top_paths=4, _device="/job:localhost/replica:0/task:0/device:CPU:0"](Log, ToInt32)]]

What am I doing wrong?
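For reference, here is the shape situation sketched in plain numpy (not the model itself): the decoder apparently wants sequence_length as a rank-1 vector of shape (batch,), while my length input carries an extra trailing dimension to satisfy the CTC loss.

```python
import numpy as np

# What I currently feed for len_spectrograms: shape (batch, 1),
# matching the Input(shape=[1]) declaration above
lengths = np.array([[30], [30]])

# CTCBeamSearchDecoder complains unless sequence_length is a
# rank-1 vector of shape (batch,); flattening would give that
flat = lengths.ravel()
```

So the loss and the decoder seem to disagree about whether the lengths carry that trailing 1 — is flattening at one of the two places the intended fix, or am I declaring the Input wrongly in the first place?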

Anaphory
