The code below contains a number of tensor operations and calculations. I'd like to see the results of some of them so I can understand them better. Specifically, I'd like to see what h looks like during graph execution, using print(sess.run(h)). However, h depends on the placeholder X, so to evaluate it I need to supply a feed dictionary.
I have read the SO question "How to feed a placeholder?" and several others, but I still don't know what I should be feeding into this placeholder.
To see the value of h, what exactly am I supposed to put in the feed dictionary when printing it?
import tensorflow as tf

def expand_tile(value, size):
    """Add a new axis of given size."""
    value = tf.convert_to_tensor(value, name='value')
    ndims = value.shape.ndims
    return tf.tile(tf.expand_dims(value, axis=0), [size] + [1]*ndims)

def positions_for(tokens, past_length):
    batch_size = tf.shape(tokens)[0]
    nsteps = tf.shape(tokens)[1]
    return expand_tile(past_length + tf.range(nsteps), batch_size)

def model(hparams, X, past=None, scope='model', reuse=tf.AUTO_REUSE):
    with tf.variable_scope(scope, reuse=reuse):
        results = {}
        batch_size = 1
        # This rebinds the X argument to a new placeholder of shape [1, None].
        X = tf.placeholder(tf.int32, [batch_size, None])
        # shape_list is defined elsewhere (not shown here).
        batch, sequence = shape_list(X)
        wpe = tf.get_variable('wpe', [1024, 768],
                              initializer=tf.random_normal_initializer(stddev=0.01))
        wte = tf.get_variable('wte', [50256, 768],
                              initializer=tf.random_normal_initializer(stddev=0.02))
        past_length = 0 if past is None else tf.shape(past)[-2]
        h = tf.gather(wte, X) + tf.gather(wpe, positions_for(X, past_length))
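
To make the question concrete, here is my current understanding of what h computes, sketched in NumPy rather than TensorFlow so it runs on its own. The table sizes, the random seed, and the token ids in `tokens` are made-up stand-ins (the real tables above are 50256 x 768 and 1024 x 768). I am assuming the placeholder wants an int32 array of shape [batch_size, nsteps] holding token ids that are valid row indices into wte.

```python
import numpy as np

# Small stand-in sizes (assumptions); the code above uses 50256 x 768 and 1024 x 768.
vocab_size, n_positions, n_embd = 10, 8, 4

rng = np.random.default_rng(0)
wte = rng.normal(scale=0.02, size=(vocab_size, n_embd))   # token embedding table
wpe = rng.normal(scale=0.01, size=(n_positions, n_embd))  # position embedding table

# The kind of value I think X expects: int32, shape [batch_size, nsteps].
# These token ids are arbitrary examples.
tokens = np.array([[3, 1, 4, 1, 5]], dtype=np.int32)

past_length = 0
batch_size, nsteps = tokens.shape

# NumPy counterpart of positions_for: one row of positions per batch element.
positions = np.tile(past_length + np.arange(nsteps), (batch_size, 1))

# NumPy counterpart of tf.gather(wte, X) + tf.gather(wpe, positions).
h = wte[tokens] + wpe[positions]
print(h.shape)  # (1, 5, 4)
```

If that reading is right, the TensorFlow version would be something along the lines of sess.run(h, feed_dict={X: tokens}) after running tf.global_variables_initializer(). That assumes model() is changed to return X and h so the caller can reference them, and I have not verified that part.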