I would like to create a Seq2Seq model to forecast time series data. I am using the InferenceHelper and I am struggling with the sample_fn parameter. I would like to pass the decoder output of each cell through a dense layer in order to generate a single output at each time step, so I am providing a function that does this via the sample_fn parameter.
Later on I would like to concatenate the RNN cell outputs with other, non-time-series features and build more dense layers on top of them (roughly as sketched below).
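For context, that downstream part would look roughly like this. It is only a sketch: static_features, decoder_outputs, the tiling and the layer sizes are names I made up for illustration, not code I have yet.

# Sketch of the planned downstream network (hypothetical; not in my current code).
# decoder_outputs stands for the rnn_output of the decoder, static_features for
# the non-time-series inputs.
static_features = tf.placeholder(shape=(None, 10), dtype=tf.float32, name='static_features')
# Repeat the static features over the time axis and concatenate with the decoder outputs.
tiled_static = tf.tile(tf.expand_dims(static_features, 1), [1, tf.shape(decoder_outputs)[1], 1])
combined = tf.concat([decoder_outputs, tiled_static], axis=-1)
hidden = tf.layers.dense(combined, 32, activation=tf.nn.relu, name='combined_dense_0')
final_predictions = tf.layers.dense(hidden, 1, activation=None, name='combined_dense_1')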
The network does fine at training time but not during inference. I think this is caused by the fact that I'm not sharing the same dense layer between training and inference time.
I tried setting the reuse parameter and wrapping the calls in a with tf.variable_scope() block. However, sample_fn is already called within a specific scope inside dynamic_decode, so I fail to reuse the same scope that I used during training.
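Simplified, that attempt looked like this (a sketch of what I tried, not working code; the scope name is arbitrary and train_outputs is the training decoder output defined further below):

# What I tried: put both dense calls under the same variable scope with reuse.
with tf.variable_scope('shared_output', reuse=tf.AUTO_REUSE):
    train_predictions = tf.layers.dense(train_outputs.rnn_output, 1,
                                        activation=None, name='output_dense_layer')

def sample_fn(outputs):
    # Called by dynamic_decode inside its own scope, so this does not end up
    # in the same variable scope as the training-time call above.
    with tf.variable_scope('shared_output', reuse=tf.AUTO_REUSE):
        return tf.layers.dense(outputs, 1, activation=None, name='output_dense_layer')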
The relevant part of my code looks as follows:
The placeholders:
inputs = tf.placeholder(shape=(None, 100, 1), dtype=tf.float32, name='inputs')
input_lengths = tf.placeholder(shape=(None,), dtype=tf.int32, name='input_lengths')
targets = tf.placeholder(shape=(None, 100), dtype=tf.float32, name='targets')
target_lengths = tf.placeholder(shape=(None,), dtype=tf.int32, name='target_lengths')
The encoder:
encoder_cell = tf.nn.rnn_cell.MultiRNNCell([tf.contrib.rnn.GRUCell(num_units=16, name='encoder_cell_0')])
decoder_cell = tf.nn.rnn_cell.MultiRNNCell([tf.contrib.rnn.GRUCell(num_units=16, name='decoder_cell_0')])
_, final_encoder_states = tf.nn.dynamic_rnn(cell=encoder_cell, inputs=inputs,
                                            sequence_length=input_lengths, dtype=tf.float32)
The decoder (training):
start_tokens = tf.fill([tf.shape(inputs)[0]], start_token)
start_tokens = tf.cast(tf.expand_dims(start_tokens, 1), dtype=tf.float32)
targets_as_inputs = tf.concat([start_tokens, targets], axis=1)
targets_as_inputs = tf.reshape(targets_as_inputs, (-1, targets_as_inputs.shape[1], 1))
training_helper = tf.contrib.seq2seq.TrainingHelper(inputs=targets_as_inputs, sequence_length=target_lengths, name='training_helper')
training_decoder = tf.contrib.seq2seq.BasicDecoder(cell=decoder_cell, helper=training_helper, initial_state=final_encoder_states)
train_outputs, _, _ = tf.contrib.seq2seq.dynamic_decode(decoder=training_decoder, maximum_iterations=max_target_sequence_length, impute_finished=True)
train_predictions = train_outputs.rnn_output
train_predictions = tf.layers.dense(train_predictions, 1, activation=None, name='output_dense_layer')
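The loss on top of this is not important for the question; it is something along the lines of a plain squared error, sketched here with masking omitted:

# Rough sketch of the training objective (assumes the decoded length equals the
# target length; masking of padded time steps omitted).
loss = tf.losses.mean_squared_error(labels=targets,
                                    predictions=tf.squeeze(train_predictions, axis=-1))
train_op = tf.train.AdamOptimizer(learning_rate=1e-3).minimize(loss)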
The decoder (inference). The incorrect part:
def sample_fn(outputs):
    return tf.layers.dense(outputs, 1, activation=None,
                           name='output_dense_layer', reuse=tf.AUTO_REUSE)
infer_helper = tf.contrib.seq2seq.InferenceHelper(sample_fn=sample_fn, sample_shape=(1),
                                                  sample_dtype=tf.float32, start_inputs=start_tokens,
                                                  end_fn=lambda sample_ids: False, next_inputs_fn=None)
infer_decoder = tf.contrib.seq2seq.BasicDecoder(cell=decoder_cell, helper=infer_helper, initial_state=final_encoder_states)
infer_outputs, _, _ = tf.contrib.seq2seq.dynamic_decode(decoder=infer_decoder, maximum_iterations=max_target_sequence_length, impute_finished=True)
infer_predictions = infer_outputs.rnn_output
infer_predictions = sample_fn(infer_predictions)
There is a similar question: How to use tensorflow seq2seq without embeddings? The author there uses sample_fn=lambda outputs: outputs, but this raises a ValueError in my case because the dimensions don't match. How could they, with multiple cells? sample_fn should return a single value.
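Conceptually, what I think I need is a single layer object shared by both branches, something like the sketch below, but I don't see how to make sample_fn pick up exactly these weights given the scope that dynamic_decode introduces:

# Sketch of what I am after (one Dense object, used by both branches); not my current code.
output_layer = tf.layers.Dense(1, activation=None, name='output_dense_layer')

train_predictions = output_layer(train_outputs.rnn_output)  # training branch

def sample_fn(outputs):
    return output_layer(outputs)  # inference branch, intended to reuse the same weights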