Feeding LSTMCell with whole sentences using embeddings gives dimensionality error

Question

So currently i'm sitting on a text-classification problem, but i can't even set up my model in Tensorflow. I have a batch of sentences of length 70 (using padding) and i'm using a embedding_lookup with an embedding size of 300. Here the code for the embedding:

embedding = tf.constant(embedding_matrix, name="embedding")
inputs = tf.nn.embedding_lookup(embedding, input_.input_data)

So now inputs should be of shape [batch_size, sentence_length, embedding_size] which is not surprising. Now sadly i'm getting a ValueError for my LSTMCell since it is expecting ndim=2 and obviously inputs is of ndim=3. I have not found a way to change the expected input shape of the LSTM Layer. Here is the code for my LSTMCell init:

for i in range(num_layers):
    cells.append(LSTMCell(num_units, forget_bias, state_is_tuple, reuse=reuse, name='lstm_{}'.format(i))
cell = tf.contrib.rnn.MultiRNNCell(cells, state_is_tuple=True)

The error is triggered in the call function of the cell, that looks like this:

for step in range(self.num_steps):
    if step > 0: tf.get_variable_scope().reuse_variables()
    (cell_output, state) = cell(inputs[:, step, :], state)

Similar question but not helping: Understanding Tensorflow LSTM Input shape

score 0 · Accepted Answer · answered Jan 06 '19 at 13:19

I could solve the problem myself. As it seems, the LSTMCell implementation is more hands on and basic in relation to how a LSTM actually works. The Keras LSTM Layers took care of stuff i need to consider when i'm using TensorFlow. The example i'm using is from the following official TensorFlow example:

https://github.com/tensorflow/models/tree/master/tutorials/rnn/ptb

As we want to feed our LSTM Layers with a sequence, we need to feed the cells each word after another. As the call of the Cell creates two outputs (cell output and cell state), we use a loop for all words in all sentences to feed the cell and reuse our cell states. This way we create the output for our layers, which we can then use for further operations. The code for this looks like this:

self._initial_state = cell.zero_state(config.batch_size, data_type())
state = self._initial_state
outputs = []
with tf.variable_scope("RNN"):
  for time_step in range(self.num_steps):
    if time_step > 0: tf.get_variable_scope().reuse_variables()
    (cell_output, state) = cell(inputs[:, time_step, :], state)
    outputs.append(cell_output)
output = tf.reshape(tf.concat(outputs, 1), [-1, config.hidden_size])

num_steps represents the amount of words in our sentence, that we are going to use.

Feeding LSTMCell with whole sentences using embeddings gives dimensionality error

1 Answers1