
In my problem I need to run gradient descent with a single example from the data on each training step. It's a known problem that session.run() has overhead, so training the model this way takes too long. To avoid the overhead I tried to use while_loop and train the model on all the data with a single run() call. But this approach doesn't work: train_op isn't executed even once. Below is a simple example of what I'm doing:

import tensorflow as tf

data = [k*1. for k in range(10)]
tf.reset_default_graph()

i = tf.Variable(0, name='loop_i')
q_x = tf.FIFOQueue(100000, tf.float32)
q_y = tf.FIFOQueue(100000, tf.float32)

x = q_x.dequeue()
y = q_y.dequeue()
w = tf.Variable(0.)
b = tf.Variable(0.)
loss = (tf.add(tf.mul(x, w), b) - y)**2

gs = tf.Variable(0)

train_op = tf.train.GradientDescentOptimizer(0.05).minimize(loss, global_step=gs)

s = tf.Session()
s.run(tf.initialize_all_variables())

def cond(i):
    return i < 10

def body(i):
    return tf.tuple([tf.add(i, 1)], control_inputs=[train_op])


loop = tf.while_loop(cond, body, [i])

for _ in range(1):
    s.run(q_x.enqueue_many((data, )))
    s.run(q_y.enqueue_many((data, )))

s.run(loop)
s.close()

What am I doing wrong? Or is there another solution to this problem of overly expensive per-step overhead?

Thanks!

Andrey Atanov

1 Answer


The reason the model does not appear to train is because the input reading, gradient calculation, and the minimize() call are all defined outside (and hence, in dataflow terms, before) the body of the tf.while_loop(). This means that all of these parts of the model run only once, before the loop executes, and the loop itself has no effect.

A slight refactoring—to move the dequeue() operations, gradient calculation, and minimize() call inside the loop—fixes the problem and allows your program to train:

optimizer = tf.train.GradientDescentOptimizer(0.05)

def cond(i):
    return i < 10

def body(i):
    # Dequeue a new example each iteration.
    x = q_x.dequeue()
    y = q_y.dequeue()

    # Compute the loss and gradient update based on the current example.
    loss = (tf.add(tf.mul(x, w), b) - y)**2
    train_op = optimizer.minimize(loss, global_step=gs)

    # Ensure that the update is applied before continuing.
    return tf.tuple([tf.add(i, 1)], control_inputs=[train_op])

loop = tf.while_loop(cond, body, [i])

UPDATE: Here's a complete program that executes the while loop, based on the code in your question:

import tensorflow as tf

# Define a single queue with two components to store the input data.
q_data = tf.FIFOQueue(100000, [tf.float32, tf.float32])

# We will use these placeholders to enqueue input data.
placeholder_x = tf.placeholder(tf.float32, shape=[None])
placeholder_y = tf.placeholder(tf.float32, shape=[None])
enqueue_data_op = q_data.enqueue_many([placeholder_x, placeholder_y])

gs = tf.Variable(0)
w = tf.Variable(0.)
b = tf.Variable(0.)
optimizer = tf.train.GradientDescentOptimizer(0.05)

# Construct the while loop.
def cond(i):
    return i < 10

def body(i):
    # Dequeue a single new example each iteration.
    x, y = q_data.dequeue()
    # Compute the loss and gradient update based on the current example.
    loss = (tf.add(tf.multiply(x, w), b) - y) ** 2
    train_op = optimizer.minimize(loss, global_step=gs)
    # Ensure that the update is applied before continuing.
    with tf.control_dependencies([train_op]):
        return i + 1

loop = tf.while_loop(cond, body, [tf.constant(0)])

data = [k * 1. for k in range(10)]

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for _ in range(1):
        # NOTE: Constructing the enqueue op ahead of time avoids adding
        # (potentially many) copies of `data` to the graph.
        sess.run(enqueue_data_op,
                 feed_dict={placeholder_x: data, placeholder_y: data})
    print(sess.run([gs, w, b]))  # Prints before-loop values.
    sess.run(loop)
    print(sess.run([gs, w, b]))  # Prints after-loop values.
mrry
  • Should I define **w** and **b** outside? I was trying something like that (and now I've tried exactly what you suggest), but I get the error *All inputs to node while/GradientDescent/update_while/w/ApplyGradientDescent must be from the same frame.* – Andrey Atanov Aug 17 '16 at 14:34
  • I added the full program that I ran with TensorFlow 0.10rc0. (You might need to upgrade; there have been various bugs in the `tf.while_loop()` implementation that were fixed over the last few releases.) – mrry Aug 17 '16 at 15:45
  • Yes, I was running it on 0.9; thank you, after updating it works! One more question about your solution: it looks like a new optimizer is created on every step. What if I want to use the Ftrl optimizer (which has slots that get updated)? Will it behave like a single optimizer throughout training? – Andrey Atanov Aug 18 '16 at 08:47
  • Glad to hear it! In fact, the optimizer (and its slots) is only created once, but that's not obvious from looking at the program (and is in some sense an implementation detail: at present, if you create a `tf.Variable` (such as an optimizer slot) inside a loop, it behaves the same as creating the variable outside the loop and referring to it inside). It works the same whether you construct the optimizer inside of or before the loop, and I updated the answer to construct it before the loop, which looks more intuitively correct; see the sketch after these comments. – mrry Aug 18 '16 at 16:52
  • Even with some delay: thank you for the explanation, it's much clearer now! – Andrey Atanov Aug 22 '16 at 09:03
  • Great example, I couldn't find anything like it elsewhere! This should be in the TensorFlow docs. – egpbos Oct 19 '16 at 12:10
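As a follow-up to the comment thread above about slot-carrying optimizers: below is a minimal sketch (not part of the original answer, and not verified on the old TensorFlow versions discussed) of the same complete program with `tf.train.FtrlOptimizer` constructed once before the loop, as the comments recommend. The rest mirrors the program in the answer.

import tensorflow as tf

# Same input pipeline as the complete program above.
q_data = tf.FIFOQueue(100000, [tf.float32, tf.float32])
placeholder_x = tf.placeholder(tf.float32, shape=[None])
placeholder_y = tf.placeholder(tf.float32, shape=[None])
enqueue_data_op = q_data.enqueue_many([placeholder_x, placeholder_y])

gs = tf.Variable(0)
w = tf.Variable(0.)
b = tf.Variable(0.)

# Constructed once, before the loop. Per the comments, the optimizer's
# accumulator slots are also created only once, even though minimize()
# is called inside the loop body.
optimizer = tf.train.FtrlOptimizer(0.05)

def cond(i):
    return i < 10

def body(i):
    # Dequeue a single new example each iteration.
    x, y = q_data.dequeue()
    # Compute the loss and gradient update based on the current example.
    loss = (tf.add(tf.multiply(x, w), b) - y) ** 2
    train_op = optimizer.minimize(loss, global_step=gs)
    # Ensure that the update is applied before continuing.
    with tf.control_dependencies([train_op]):
        return i + 1

loop = tf.while_loop(cond, body, [tf.constant(0)])

data = [k * 1. for k in range(10)]
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    sess.run(enqueue_data_op,
             feed_dict={placeholder_x: data, placeholder_y: data})
    sess.run(loop)
    print(sess.run([gs, w, b]))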