
TF-Slim ships with its own training loop. However, I would like to use a TF-Slim model (ResNet-50) while still using my own TensorFlow training loop. The TF-Slim model simply outputs the predictions, and I calculate my own total loss. I am able to train the model without errors and the training error seems to converge. I am asking because I ran into issues with batch normalization during evaluation (the evaluation error is very high compared to the training error). I found out that this might be due to an insufficient number of training steps, but I want to make sure that I am not using TF-Slim incorrectly.
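
For reference, I build the model roughly like this (a sketch only: the input shape and `num_classes` are placeholders, not my actual values):

import tensorflow as tf
from tensorflow.contrib.slim.nets import resnet_v1

slim = tf.contrib.slim

images = tf.placeholder(tf.float32, [None, 224, 224, 3])

# is_training=True makes batch norm use batch statistics and queue
# moving-average updates in tf.GraphKeys.UPDATE_OPS; at evaluation time
# (is_training=False) the accumulated moving averages are used instead.
with slim.arg_scope(resnet_v1.resnet_arg_scope()):
  logits, end_points = resnet_v1.resnet_v1_50(
      images, num_classes=10, is_training=True)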

The TF-Slim training procedure looks like this:

# create_train_op ensures that each time we ask for the loss, the
# update_ops are run and the gradients being computed are applied too.

train_op = slim.learning.create_train_op(total_loss, optimizer)
logdir = ... # Where checkpoints are stored.

slim.learning.train(
    train_op,
    logdir,
    number_of_steps=1000,
    save_summaries_secs=300,
    save_interval_secs=600)

I don't want to use the train_op, but something like this:

def update_gradients(update_ops, optimizer, total_loss, variables_to_train,
                     global_step, summaries):
  # Compute the gradients of the total loss w.r.t. the trainable variables.
  gradients = optimizer.compute_gradients(total_loss,
                                          var_list=variables_to_train)
  for grad, var in gradients:
    if grad is not None:
      summaries.add(tf.summary.histogram(var.op.name + '/gradients', grad))

  # Apply the gradients and treat the resulting op as one more update op,
  # alongside e.g. the batch-norm moving-average updates.
  grad_updates = optimizer.apply_gradients(gradients,
                                           global_step=global_step)
  update_ops.append(grad_updates)

  # Returning the loss with a control dependency on all update ops ensures
  # that fetching the loss also runs every update.
  update_op = tf.group(*update_ops)
  with tf.control_dependencies([update_op]):
    train_tensor = tf.identity(total_loss, name='train_op')
  return train_tensor

and then call sess.run(train_tensor) in my own training loop.
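
Put together, the wiring would look roughly like this (a sketch only, continuing from the model snippet above; the optimizer, hyperparameters, and data feeding are placeholders for my actual setup):

# Depending on the TF version, resnet_v1_50 may return [N, 1, 1, C];
# flatten to [N, C] before computing the loss.
logits = tf.reshape(logits, [-1, 10])

labels = tf.placeholder(tf.int64, [None])
total_loss = tf.losses.sparse_softmax_cross_entropy(labels=labels,
                                                    logits=logits)

# The UPDATE_OPS collection holds the batch-norm moving-average updates
# created by the slim model; if they never run, evaluation uses stale
# statistics and the evaluation error stays high.
update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS)

global_step = tf.train.get_or_create_global_step()
optimizer = tf.train.MomentumOptimizer(learning_rate=0.01, momentum=0.9)
summaries = set(tf.get_collection(tf.GraphKeys.SUMMARIES))
variables_to_train = tf.trainable_variables()

train_tensor = update_gradients(update_ops, optimizer, total_loss,
                                variables_to_train, global_step, summaries)

with tf.Session() as sess:
  sess.run(tf.global_variables_initializer())
  for step in range(1000):
    # batch_images / batch_labels come from my input pipeline.
    loss_value = sess.run(train_tensor,
                          feed_dict={images: batch_images,
                                     labels: batch_labels})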

Does this cause any issues internally? I read in the GitHub issues that one should use the train_op.

Or is it simply not allowed to pass, for example, train_tensor into the slim.learning.train() function directly?

user3142067
  • You can try reading into the source code of `slim.learning.train` and dissect the function for whatever parts you need. It should technically be the same. – kwotsin Jun 09 '17 at 06:03
  • There are several examples of what you are trying to achieve in this walkthrough, hope it helps you: https://github.com/tensorflow/models/blob/master/slim/slim_walkthrough.ipynb – blacatus Aug 01 '17 at 15:38

1 Answer


I think you can override the `train_step_fn` parameter of `slim.learning.train()` to achieve this.
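
For example, something along these lines (a sketch against the TF 1.x contrib.slim API; `train_op` and `logdir` are the ones from the question):

import tensorflow as tf

slim = tf.contrib.slim

def my_train_step(sess, train_op, global_step, train_step_kwargs):
  # Delegate to slim's default step function, which runs train_op and
  # handles logging and the should_stop flag, then hook in custom logic.
  total_loss, should_stop = slim.learning.train_step(
      sess, train_op, global_step, train_step_kwargs)
  # ... custom per-step code goes here ...
  return total_loss, should_stop

slim.learning.train(
    train_op,
    logdir,
    number_of_steps=1000,
    train_step_fn=my_train_step)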

yx luo