
I am trying to implement a deep neural network, where I want to experiment with the number of hidden layers. In order to avoid error-prone code repetition, I have placed the creation of the layers in a for-loop, as follows:

import tensorflow as tf

def neural_network_model(data, layer_sizes):
    num_layers = len(layer_sizes) - 1  # hidden and output layers
    layers = []  # hidden and output layers

    # initialise the weights and biases for each layer
    for i in range(num_layers):
        layers.append({
            'weights': tf.get_variable("W" + str(i+1),
                       [layer_sizes[i], layer_sizes[i+1]],
                       initializer=tf.contrib.layers.xavier_initializer()),
            'biases': tf.get_variable("b" + str(i+1), [layer_sizes[i+1]],
                       initializer=tf.zeros_initializer())
        })
        ...

The list layer_sizes given as input looks something like this:

layer_sizes = [num_inputs, num_hl_1, num_hl_2, ..., num_hl_n, num_outputs]

When I ran this code for the first time I had no problems. However, when I changed layer_sizes to have a different number of layers, I got an error:

ValueError: Variable W1 already exists, disallowed. Did you mean to set reuse=True or reuse=tf.AUTO_REUSE in VarScope

I understand that this is because of the naming of the variables (which I don't even care about). How can I work around this so that the variables are simply re-created when I rerun the code? I've done some googling and the solution seems to lie in the use of with tf.variable_scope(), but I can't figure out exactly how.

EDIT - Just to be clear: I do not want to reuse any names or variables. I just want to (re-)initialise the weights and biases every time neural_network_model is called.

rdv
  • Needs more context. Why does (e.g.) W1 already exist? The code you posted should only create each Wn once. Are you calling this function multiple times in one program? – xdurch0 Mar 01 '18 at 21:10
  • @xdurch0 Yes, the idea is to call this function repeatedly, each time giving it a different input. As I wrote, the problem arises not on the first call, but on the subsequent call(s). – rdv Mar 01 '18 at 21:25
  • Ok, now I get the issue. I edited and restored my answer (which missed the point initially). – xdurch0 Mar 01 '18 at 21:38

3 Answers


When creating multiple distinct models, you need to make sure they all receive unique variable names. The most straightforward way I can see here would be something like this:

def neural_network_model(data, layer_sizes, name):
    num_layers = len(layer_sizes) - 1  # hidden and output layers
    layers = []  # hidden and output layers

    # initialise the weights inside a model-specific variable scope
    with tf.variable_scope(name):
        for i in range(num_layers):
            layers.append({
                'weights': tf.get_variable("W" + str(i+1),
                           [layer_sizes[i], layer_sizes[i+1]],
                           initializer=tf.contrib.layers.xavier_initializer()),
                'biases': tf.get_variable("b" + str(i+1), [layer_sizes[i+1]],
                           initializer=tf.zeros_initializer())
            })
    ...

Note how there is an additional name argument to name your models. Then you could create multiple ones like

model1 = neural_network_model(data, some_layers, "model1")
model2 = neural_network_model(data, other_layers, "model2")

etc. The models will then have variable names such as "model1/W1". Note that you can also use variable_scope to name the parameters of the different layers: instead of names such as "W" + str(i+1), you could wrap a tf.variable_scope("layer" + str(i)) around each get_variable call. This would give you names such as "model1/layer0/W". Scopes can be nested arbitrarily.
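
A minimal sketch of that nested-scope variant (TF 1.x, reusing num_layers and layer_sizes from the question):

with tf.variable_scope(name):
    for i in range(num_layers):
        # one sub-scope per layer -> variables named "model1/layer0/W" etc.
        with tf.variable_scope("layer" + str(i)):
            weights = tf.get_variable("W", [layer_sizes[i], layer_sizes[i+1]],
                                      initializer=tf.contrib.layers.xavier_initializer())
            biases = tf.get_variable("b", [layer_sizes[i+1]],
                                     initializer=tf.zeros_initializer())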

You might want to read the TF Programmer's Guide on variables.

xdurch0
  • This works, but only if I use each name only once. Once I repeat them, I get that same ValueError. This is quite annoying - I don't even care about that name, I never use it. And cluttering the code with extra variables is not something I like either. – rdv Mar 01 '18 at 22:56
  • In the Programmer's Guide I read "tf.get_variable also allows you to reuse a previously created variable of the same name, making it easy to define models which reuse layers." - but it doesn't say how (or I am somehow missing that). – rdv Mar 01 '18 at 23:11

If you do not wish to reuse the variables, then you should not be using tf.get_variable. A simple tf.Variable should work and not have the conflict you are seeing.

You can see this page in the TensorFlow documentation for more: its first example shows that an entirely new set of variables is created each time the example function is called. It then explains how to avoid this, but in your case that behaviour is exactly what you want.
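
For example, the loop from the question could be rewritten like this (a minimal sketch, TF 1.x; initializer objects are callable with a shape, which is how Xavier initialisation can still be used with a plain tf.Variable):

init = tf.contrib.layers.xavier_initializer()
for i in range(num_layers):
    layers.append({
        # tf.Variable uniquifies duplicate names (W1, W1_1, ...) instead
        # of raising a ValueError, so calling the function again is fine
        'weights': tf.Variable(init([layer_sizes[i], layer_sizes[i+1]]),
                               name="W" + str(i+1)),
        'biases': tf.Variable(tf.zeros([layer_sizes[i+1]]),
                              name="b" + str(i+1))
    })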

mbrig
  • `tf.Variable` is what I used first, but I want to use Xavier initialization and I can't figure out how to do that there. – rdv Mar 01 '18 at 23:55
  • @rdv Take a look at [this](https://stackoverflow.com/a/45380994/5116726) – mbrig Mar 02 '18 at 15:24

I think I found the simplest solution. It turns out that I had this problem because I was using a Jupyter notebook, in which all variables stay alive as long as the kernel is not restarted.

My solution is simply to reset the default graph (which discards all previously created variables) before building anything:

tf.reset_default_graph()
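
A notebook cell would then look something like this (a sketch, assuming data is a placeholder; everything, including placeholders, has to be re-created after the reset because the whole graph is discarded):

tf.reset_default_graph()  # discard all previously created variables and ops
data = tf.placeholder(tf.float32, [None, layer_sizes[0]])
prediction = neural_network_model(data, layer_sizes)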
rdv