
I'm trying to build an RBM with SSU hidden units, and I have to update the standard deviation.

I define my variable like this:

    def _build_model(self):
        with tf.device('/gpu:0'):
            with self.graph.as_default():
                ...
                with tf.variable_scope("visible_layer"):
                    self.v_clamp = tf.placeholder(name = "v_in", dtype = tf.float32, shape=[self.batch_size, self.n_visibles])
                    self.bv = tf.get_variable(name = "b_v", dtype = tf.float32, shape=[self.n_visibles], initializer=tf.random_uniform_initializer(maxval=0.01,minval=-0.01))

                self.stddev = tf.get_variable(name = "stddev", dtype = tf.float32, shape = [1], initializer = tf.constant_initializer(float(self.stddev_)))

                ...

                with tf.variable_scope("update_weights"):
                    self.optimizer = self.update_weights()

                ...

where stddev_ has the initial value.

My update function is like this:

    def update_weights(self):
        with self.graph.as_default():
            with tf.device('/gpu:0'):
                ...
                with tf.variable_scope("calc_deltas"):
                    ...
                    ##UPDATE STDDEV
                    delta_stddev = tf.multiply((2)/(self.stddev**3),
                                               tf.subtract(tf.reduce_sum(tf.pow(tf.subtract(self.v_clamp,self.bv),2)),
                                                           tf.reduce_sum(tf.pow(tf.subtract(v_free,self.bv),2))))

                #self.stddev.assign_add(delta_stddev)
                self.stddev.assign_add(tf.constant(0.1,shape=[1]))

                return self.stddev

The commented-out lines are things I have already tried.

And I train it like this:

    def train_model(self):
        with tf.Session(graph=self.graph) as session:
            session.run(tf.global_variables_initializer())#Now all variables should be initialized.
            print("Uninitialized variables: ", session.run(tf.report_uninitialized_variables())) #Just to check, should print nothing

            print("Training for ", self.n_steps)
            for step in range(self.n_steps):

                feed_train = self._create_feed_dict(self.X_train,step)
                feed_test = self._create_feed_dict(self.X_test,step)

                print(session.run(self.optimizer, feed_dict = {self.v_clamp: feed_train}))

The thing is that the other variables, which are vectors (like self.bv), are updated correctly, but this one (stddev) always stays at its initial value.

I don't know what I am doing wrong.


1 Answer


This is because you never actually ran the assign_add operation: calling tf.assign_add (or Variable.assign_add) only defines the op in the TensorFlow graph; it does not execute it.

    import tensorflow as tf

    v = tf.get_variable('t', shape=[], initializer=tf.constant_initializer(0.))

    op = tf.assign_add(v, 1)

    with tf.Session() as session:
        session.run(tf.global_variables_initializer())
        print(session.run(v))  # prints 0.
        print(session.run(op)) # prints 1. as you just ran the `assign_add` operation
        print(session.run(v))  # prints 1. as `v` has been incremented

Edit:

In your case, what you could do is:

    def update_weights(self):
        ...
        return self.stddev.assign_add(delta_stddev)

This way, your method will return the op that actually updates your self.stddev variable.
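
For completeness, here is a minimal, self-contained sketch of the pattern (TensorFlow 1.x; the variable names mirror the question, but the model itself is hypothetical). Grouping the assign ops with tf.group means a single session.run call executes every update:

    import tensorflow as tf

    graph = tf.Graph()
    with graph.as_default():
        # Hypothetical stand-ins for the question's variables
        stddev = tf.get_variable("stddev", shape=[1],
                                 initializer=tf.constant_initializer(1.0))
        bv = tf.get_variable("b_v", shape=[3],
                             initializer=tf.zeros_initializer())

        # Each assign_add only *defines* an op; it runs when passed to session.run
        update_stddev = stddev.assign_add(tf.constant(0.1, shape=[1]))
        update_bv = bv.assign_add(tf.ones([3]))

        # Bundle all updates so one session.run executes them together
        optimizer = tf.group(update_stddev, update_bv)

    with tf.Session(graph=graph) as session:
        session.run(tf.global_variables_initializer())
        for step in range(3):
            session.run(optimizer)            # runs BOTH assign ops
            print(session.run([stddev, bv]))  # stddev now changes each step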

  • Thanks for the answer! But I don't understand: am I not running it when I call the optimizer? – Isaac Feb 12 '18 at 17:28
  • Actually no: you only defined the op in the TensorFlow graph when you called `self.stddev.assign_add(tf.constant(0.1,shape=[1]))`, but you never explicitly ran it. What you could do is to have `update_weights` return `self.stddev.assign_add(delta_stddev)`. – pfm Feb 12 '18 at 17:48
  • This is TensorFlow 2.0, but I am seeing something similar there: the `assign_add()` does not seem to have any effect. See https://stackoverflow.com/questions/68121006/why-are-these-gradient-accumulation-implementations-not-working – Stefan Falk Jun 25 '21 at 07:55
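
For reference, in eager TensorFlow 2.x an assign_add executes as soon as it is called, so the missing-session.run pitfall above does not arise in the same form; issues like the linked gradient-accumulation question usually involve tf.function/graph subtleties instead. A quick sketch:

    import tensorflow as tf  # TensorFlow 2.x, eager execution

    v = tf.Variable(0.0)
    v.assign_add(1.0)   # executes immediately; no session needed
    print(v.numpy())    # 1.0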