
I created two tensors (one depending on the other) as follows:

weights = tf.random_normal(shape=(3, 3, 1, 64))
filters = get_filters(weights)  # get_filters does some operation on weights

After the above operations, weights and filters look like this:

<tf.Tensor 'random_normal_1:0' shape=(3, 3, 1, 64) dtype=float32>
<tf.Tensor 'filters_1/weights:0' shape=(5, 3, 3, 1, 64) dtype=float32>

Now I pass these tensors to the following function:

def get_alphas(weights, filters, no_filters=5,
               epochs=500, name=None):
    with tf.name_scope(name, default_name="alpha_scope"):
        weights = tf.reshape(weights, [-1], name="reshaped_weights")
        filters = tf.reshape(filters, [no_filters, -1], name="reshaped_binary_filters")
        alphas = tf.Variable(tf.zeros(shape=(no_filters, 1)), name="alphas")
        weighted_sum = tf.reduce_sum(tf.multiply(alphas, filters), axis=0, name="weighted_sum")
        error = tf.square(weights - weighted_sum, name="error")
        loss = tf.reduce_mean(tf.reshape(error, [-1]), name="loss")

        # Optimizer
        optimizer = tf.train.AdamOptimizer()
        training_op = optimizer.minimize(loss, name="training_op")
        print(tf.global_variables())
        init = tf.variables_initializer([alphas])
        with tf.Session() as sess:
            init.run()
            epoch = 0
            while epoch < epochs:
                _, loss_train = sess.run([training_op, loss])  # <-- this is where the error is generated

                print("\rIteration: {}/{} ({:.1f}%)  Loss: {:.5f}".format(
                      epoch+1, epochs,
                      epoch * 100 / epochs,
                      loss_train),
                  end="")
                epoch += 1
            return tf.convert_to_tensor(sess.run(alphas))

On calling get_alphas(weights, filters), I get the following error:

FailedPreconditionError (see above for traceback): Attempting to use uninitialized value alpha_scope/beta1_power
     [[Node: alpha_scope/beta1_power/read = Identity[T=DT_FLOAT, _class=["loc:@alpha_scope/alphas"], _device="/job:localhost/replica:0/task:0/device:GPU:0"](alpha_scope/beta1_power)]]
     [[Node: alpha_scope/loss/_1 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_115_alpha_scope/loss", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]

So I printed all the TensorFlow variables using tf.global_variables(), and there are some variables (beta1_power, beta2_power) that I never defined; these are what is causing the error:

[<tf.Variable 'alpha_scope/alphas:0' shape=(5, 1) dtype=float32_ref>,
<tf.Variable 'alpha_scope/beta1_power:0' shape=() dtype=float32_ref>,
<tf.Variable 'alpha_scope/beta2_power:0' shape=() dtype=float32_ref>,
<tf.Variable 'alpha_scope/alphas/Adam:0' shape=(5, 1) dtype=float32_ref>,
<tf.Variable 'alpha_scope/alphas/Adam_1:0' shape=(5, 1) dtype=float32_ref>]

Any ideas how these variables are being created, and how to initialize them? I cannot use tf.global_variables_initializer(), since it would reset other variables that are already holding trained state.


1 Answer


These variables are created internally by tf.train.AdamOptimizer (see this question): beta1_power and beta2_power track the running beta decay terms, while alphas/Adam and alphas/Adam_1 are the optimizer's slot variables holding the first and second moment estimates for alphas. Since you did

init = tf.variables_initializer([alphas])
...
init.run()

... you've initialized only alphas, not the variables created by AdamOptimizer. If you can't use tf.global_variables_initializer(), you'll have to collect those variables yourself and initialize all of them.
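
One way to do that, as a minimal sketch (assuming the graph from the question, where both alphas and the optimizer live under the alpha_scope name scope): collect every global variable whose name carries that scope prefix and build an initializer over just those, leaving variables outside the scope untouched.

# Sketch: initialize alphas plus Adam's internal variables, nothing else.
# Assumes they were all created under the "alpha_scope" name scope,
# as in the question's code, so their names share that prefix.
scope_vars = tf.get_collection(tf.GraphKeys.GLOBAL_VARIABLES, scope="alpha_scope")
# scope_vars: alphas, beta1_power, beta2_power, alphas/Adam, alphas/Adam_1
init = tf.variables_initializer(scope_vars)

with tf.Session() as sess:
    init.run()  # initializes only the alpha_scope variables

Newer TF 1.x releases also expose optimizer.variables(), which returns the optimizer's internal variables directly, so something like tf.variables_initializer([alphas] + optimizer.variables()) should achieve the same without relying on scope names.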

  • I changed the optimizer to `GradientDescentOptimizer` and now it works; I will try initializing the variables of `AdamOptimizer` as well. As a side question, why won't `tf.local_variables_initializer` help? – layog Jan 16 '18 at 03:32
  • And manually initializing the `AdamOptimizer` variables works. Thanks! – layog Jan 16 '18 at 03:52
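
Regarding the side question in the comments: tf.local_variables_initializer() doesn't help because Adam's internal variables are registered in the GLOBAL_VARIABLES collection, not in LOCAL_VARIABLES. A quick check, sketched against the graph from the question:

# Nothing in LOCAL_VARIABLES for the local initializer to act on
print(tf.get_collection(tf.GraphKeys.LOCAL_VARIABLES))
# []
print([v.name for v in tf.get_collection(tf.GraphKeys.GLOBAL_VARIABLES)
       if "beta" in v.name or "Adam" in v.name])
# ['alpha_scope/beta1_power:0', 'alpha_scope/beta2_power:0',
#  'alpha_scope/alphas/Adam:0', 'alpha_scope/alphas/Adam_1:0']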