I want to implement an autoencoder (to be exact, a stacked convolutional autoencoder),

where I'd like to pretrain each layer first and then fine-tune the whole network.

So I created a variable for the weights of each layer,

e.g. W_1 = tf.Variable(initial_value, name=name, trainable=True, ...) for the first layer,

and I pretrained W_1 of the first layer.

Now I want to pretrain the weights of the second layer (W_2).

Here I should use W_1 to compute the input of the second layer.

However, W_1 is trainable, so if I use W_1 directly, TensorFlow may train W_1 as well.

So I should create W_1_out, which keeps the value of W_1 but is not trainable.
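To illustrate what I mean, here is a minimal sketch of the two-layer setup (all shapes and names are made up for illustration; TF 1.x-style API via tf.compat.v1 assumed). tf.stop_gradient is one way to let the second layer use W_1's output without training W_1:

```python
import numpy as np
import tensorflow.compat.v1 as tf
tf.disable_v2_behavior()

x = tf.placeholder(tf.float32, shape=[None, 4])
W_1 = tf.Variable(tf.random_normal([4, 3]), name="W_1")  # pretrained first layer
W_2 = tf.Variable(tf.random_normal([3, 2]), name="W_2")  # second layer to pretrain

# stop_gradient passes W_1's output through unchanged but blocks backprop,
# so the optimizer below only ever updates W_2
h1 = tf.stop_gradient(tf.nn.sigmoid(tf.matmul(x, W_1)))
h2 = tf.matmul(h1, W_2)
loss = tf.reduce_mean(tf.square(h2))
train_op = tf.train.GradientDescentOptimizer(0.1).minimize(loss)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    w1_before, w2_before = sess.run([W_1, W_2])
    sess.run(train_op, feed_dict={x: np.ones((2, 4), np.float32)})
    w1_after, w2_after = sess.run([W_1, W_2])

print(np.allclose(w1_before, w1_after))  # True: W_1 stays frozen
```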

To be honest, I tried to modify the code from this site:

https://github.com/cmgreen210/TensorFlowDeepAutoencoder/blob/master/code/ae/autoencoder.py

At line 102 it creates a variable with the following code:

self[name_w + "_fixed"] = tf.Variable(tf.identity(self[name_w]),
                                      name=name_w + "_fixed",
                                      trainable=False)

However, this raises an error because it uses an uninitialized value.
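From what I can tell, the error occurs because the _fixed variable's initializer reads the source variable before it has been initialized. A minimal sketch of a workaround, assuming TF 1.x's initialized_value() API (tf.compat.v1 here), which makes the copy's initializer wait for W_1:

```python
import tensorflow.compat.v1 as tf
tf.disable_v2_behavior()

W_1 = tf.Variable(tf.random_normal([3, 3]), name="W_1")
# initialized_value() makes the copy's initializer depend on W_1 having been
# initialized first, which avoids the "uninitialized value" error
W_1_fixed = tf.Variable(W_1.initialized_value(), name="W_1_fixed",
                        trainable=False)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    same = sess.run(tf.reduce_all(tf.equal(W_1, W_1_fixed)))

print(same)  # True: the copy holds W_1's initial value
```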

How can I copy a variable's value into a non-trainable variable, so that I can pretrain the next layers?

Jaeyoon Yoo
  • See [this related question](https://stackoverflow.com/questions/37326002/tensorflow-get-variable-change-shared-variable-trainable-to-false) – educob Jun 06 '16 at 12:43

1 Answer


Not sure if this is still relevant, but I'll try anyway.

Generally, what I do in a situation like that is the following:

  • Populate the (default) graph according to the model you are building, e.g. for the first training step just create the first convolutional layer W1 you mention. Once training of the first layer is finished, you can save the model, then reload it and add the ops required for the second layer W2. Alternatively, you can simply rebuild the whole graph for W1 from scratch in code and then add the ops for W2.

  • If you use the restore mechanism provided by TensorFlow, you get the advantage that the weights of W1 are already the pretrained ones. If you don't use the restore mechanism, you will have to set the W1 weights manually, e.g. with something like the snippet shown further below.

  • Then, when you set up the training op, you can pass a list of variables as var_list to the optimizer, which explicitly tells the optimizer which parameters to update in order to minimize the loss. If it is set to None (the default), the optimizer uses whatever it finds in tf.trainable_variables(), which in turn is a collection of all tf.Variables that are trainable. Maybe check this answer too, which basically says the same thing.
  • When using the var_list argument, graph collections come in handy. E.g. you could create a separate graph collection for every layer you want to train. Each collection would contain the trainable variables of its layer, and you could then simply retrieve the required collection and pass it as the var_list argument (see the example below and/or the remark in the documentation linked above).

How to override the value of a variable: name is the name of the variable to be overridden, value is an array of the appropriate size and type, and sess is the session:

# look up the variable by name, then assign the new value in the session
variable = tf.get_default_graph().get_tensor_by_name(name)
sess.run(tf.assign(variable, value))

Note that the name needs an additional :0 at the end; e.g. if the weights of your layer are called 'weights1', the name in the example should be 'weights1:0'.
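For completeness, here is the lookup-and-assign pattern above as a runnable sketch (ref-variable behavior via tf.compat.v1 with disable_v2_behavior assumed; the variable name is illustrative):

```python
import numpy as np
import tensorflow.compat.v1 as tf
tf.disable_v2_behavior()

weights1 = tf.Variable(tf.zeros([2, 2]), name="weights1")

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    # note the ':0' suffix when looking the variable up by name
    variable = tf.get_default_graph().get_tensor_by_name("weights1:0")
    sess.run(tf.assign(variable, np.ones((2, 2), np.float32)))
    total = sess.run(weights1).sum()

print(total)  # 4.0: all four entries were overridden with 1.0
```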

To add a tensor to a custom collection: Use something along the following lines:

tf.add_to_collection('layer1_tensors', weights1)
tf.add_to_collection('layer1_tensors', some_other_trainable_variable)

Note that the first call creates the collection, because it does not exist yet, and the second call appends the given tensor to the existing collection.

How to use the custom collection: Now you can do something like this:

# loss = some tensorflow op computing the loss
var_list = tf.get_collection_ref('layer1_tensors')
optim = tf.train.AdamOptimizer().minimize(loss=loss, var_list=var_list)

You could also use tf.get_collection('layer1_tensors'), which returns a copy of the collection.

Of course, if you don't want to do any of this, you could simply pass trainable=False for all variables you don't want to be trainable when creating the graph, as you hinted at in your question. However, I don't like that option much, because it requires you to pass booleans into the functions that populate your graph, which is easily overlooked and therefore error-prone. Also, even if you decide to do it that way, you would still have to restore the non-trainable variables manually.
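Putting the collection and var_list pieces together, here is a minimal end-to-end sketch (dense layers for brevity and a single toy loss; in a real stacked autoencoder you would use a per-layer reconstruction loss):

```python
import numpy as np
import tensorflow.compat.v1 as tf
tf.disable_v2_behavior()

x = tf.placeholder(tf.float32, [None, 8])
W1 = tf.Variable(tf.random_normal([8, 4]), name="weights1")
W2 = tf.Variable(tf.random_normal([4, 2]), name="weights2")

# one collection per layer, as described above
tf.add_to_collection('layer1_tensors', W1)
tf.add_to_collection('layer2_tensors', W2)

h1 = tf.nn.sigmoid(tf.matmul(x, W1))
loss = tf.reduce_mean(tf.square(tf.matmul(h1, W2)))

# each training op only updates the variables in "its" collection
opt = tf.train.AdamOptimizer()
train_l1 = opt.minimize(loss, var_list=tf.get_collection('layer1_tensors'))
train_l2 = opt.minimize(loss, var_list=tf.get_collection('layer2_tensors'))

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    feed = {x: np.ones((5, 8), np.float32)}
    w2_start = sess.run(W2)
    sess.run(train_l1, feed_dict=feed)   # pretrain layer 1: W2 untouched
    w2_mid = sess.run(W2)
    sess.run(train_l2, feed_dict=feed)   # pretrain layer 2: now W2 moves
    w2_end = sess.run(W2)

print(np.allclose(w2_start, w2_mid))  # True: phase 1 never touched W2
```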

kafman