Code:

import tensorflow as tf
from tensorflow.python.ops import rnn_cell

cell = rnn_cell.LSTMCell(64, state_is_tuple=True)
multi_layer_cell = tf.nn.rnn_cell.MultiRNNCell([cell for i in range(2)])
x = tf.placeholder("float", [None, 10, 1])
output, state = tf.nn.dynamic_rnn(multi_layer_cell, x, dtype=tf.float32)

Error:

ValueError: Trying to share variable rnn/multi_rnn_cell/cell_0/lstm_cell/kernel, but specified shape (128, 256) and found shape (65, 256).

Versions: TensorFlow 1.2.1, Python 3.5.4

The variants suggested for the similar error (ValueError: Trying to share variable rnn/multi_rnn_cell/cell_0/basic_lstm_cell/kernel) don't seem to work here either.

peractio

1 Answer

The problem is that you are building a list that contains the same cell object twice with

multi_layer_cell = tf.nn.rnn_cell.MultiRNNCell([cell for i in range(2)])

cell doesn't merely specify the parameters for a multi-layer cell; the object itself is used directly for both layers. In your network, however, the first cell must map inputs of size 1 to 64, and the second cell must map 64 to 64, so the two layers cannot share one kernel.
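To see the aliasing concretely, here is a minimal sketch (plain Python, no TensorFlow needed) showing that the list comprehension yields two references to one object:

cell = object()  # stand-in for the single LSTMCell instance
cells = [cell for i in range(2)]
print(cells[0] is cells[1])  # True: both "layers" are the very same object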

The shapes in the error message are the kernel sizes each cell expects. An LSTM kernel can be viewed as an (n + m) x 4m matrix, where n is the input size and m is the state size. The factor of 4 comes from the four gates, each of which needs its own weight matrix; the n + m comes from stacking the input-to-gate weights on top of the state-to-gate weights. In your first cell, n = 1 and m = 64, which gives the shape (65, 256); that obviously won't work for your second cell, which requires a kernel of shape (128, 256), because 64 + 64 = 128, not 65.
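As a sanity check on the shape arithmetic, here is a minimal sketch (plain Python; the helper name lstm_kernel_shape is made up for illustration) that reproduces both shapes from the error message:

def lstm_kernel_shape(n, m):
    # (n + m) x 4m: input->gate weights stacked on state->gate weights, 4 gates
    return (n + m, 4 * m)

print(lstm_kernel_shape(1, 64))   # (65, 256): what the first cell built
print(lstm_kernel_shape(64, 64))  # (128, 256): what the second layer needs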

To fix this, simply make two different cell objects:

cell_1 = rnn_cell.LSTMCell(64, state_is_tuple=True)
cell_2 = rnn_cell.LSTMCell(64, state_is_tuple=True)
multi_layer_cell = tf.nn.rnn_cell.MultiRNNCell([cell_1, cell_2])
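
Equivalently, if you prefer the comprehension style from the question, construct a fresh cell inside the comprehension so each iteration creates a new object (a sketch against the same TF 1.2 API used above):

multi_layer_cell = tf.nn.rnn_cell.MultiRNNCell(
    [rnn_cell.LSTMCell(64, state_is_tuple=True) for _ in range(2)])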
Cory Nezin