
I am training an Inception-like model using TensorFlow r1.0 on an Nvidia Titan X GPU.

I added some summary operations to visualize the training procedure, using the following code:

    import tensorflow as tf

    def variable_summaries(var):
        """Attach a lot of summaries to a Tensor (for TensorBoard visualization)."""
        with tf.name_scope('summaries'):
            mean = tf.reduce_mean(var)
            tf.summary.scalar('mean', mean)
            with tf.name_scope('stddev'):
                stddev = tf.sqrt(tf.reduce_mean(tf.square(var - mean)))
            tf.summary.scalar('stddev', stddev)
            tf.summary.scalar('max', tf.reduce_max(var))
            tf.summary.scalar('min', tf.reduce_min(var))
            tf.summary.histogram('histogram', var)

When I run these operations, training one epoch takes about 400 seconds. But when I turn them off, one epoch takes just 90 seconds.

How can I optimize the graph to minimize the time cost of the summary operations?

  • maybe compute summaries less often? Also, TF 1.0 refactors things to make them more efficient -- when using hooks, summaries are computed at the same time as other tensors, so all the intermediate quantities are reused (see the sketch after these comments) – Yaroslav Bulatov Feb 23 '17 at 04:12
  • I am using TF 1.0. Could you please make it more clear how to use hooks? I tried to use CPU to compute summaries, but it did not help much. I guess it is because of the data transfer between GPU and CPU. @YaroslavBulatov – Da Tong Feb 23 '17 at 05:30
  • before moving to hooks, can you just reduce the number of times you compute summaries? – Yaroslav Bulatov Feb 23 '17 at 17:05
  • Oh, yes, of course I can. But actually, I just compute the summaries every epoch, not every batch. If I reduce the summaries to every 10 epochs, I am afraid that I will lose some information about the training procedure. – Da Tong Mar 02 '17 at 01:28
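
For reference, here is a minimal sketch of the hooks approach mentioned in the comments above, assuming a recent TF 1.x release (the tiny stand-in graph, the save_steps value, and the log directory are illustrative placeholders, not from the original question). tf.train.SummarySaverHook fetches the summary op in the same session.run call as the training op, so intermediate tensors are reused rather than recomputed:

    import tensorflow as tf

    # Sketch of the hooks approach: SummarySaverHook fetches the merged summary
    # op in the SAME session.run call as the training op, so intermediate
    # tensors are computed once instead of in a separate run.
    global_step = tf.train.get_or_create_global_step()
    x = tf.Variable(0.0)
    tf.summary.scalar('x', x)
    # Stand-in for a real training op; it must also advance the global step.
    train_op = tf.group(tf.assign_add(x, 1.0), tf.assign_add(global_step, 1))

    summary_hook = tf.train.SummarySaverHook(
        save_steps=100,                      # write summaries every 100 steps
        output_dir='/tmp/train_logs',
        summary_op=tf.summary.merge_all())

    with tf.train.MonitoredTrainingSession(hooks=[summary_hook]) as sess:
        for _ in range(1000):
            sess.run(train_op)               # summaries piggyback on this call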

1 Answer


Summaries of course slow down the training process, because you do more operations and you need to write them to disk. Also, histogram summaries slow training down even more, because histograms require much more data to be copied from GPU to CPU than scalar values do. So I would try to use histogram logging less often than the rest; that could make some difference.
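
One hedged sketch of that idea, assuming TF 1.x: register scalar and histogram summaries in separate, arbitrarily named collections so each group gets its own merged op that can be fetched at its own frequency (the collection names and example variable below are placeholders, not from the original post):

    import tensorflow as tf

    # Sketch: register scalar and histogram summaries in separate collections
    # so each group can be merged and fetched at its own frequency.
    # (The collection names are arbitrary placeholders, not TF built-ins.)
    def variable_summaries_split(var):
        with tf.name_scope('summaries'):
            mean = tf.reduce_mean(var)
            tf.summary.scalar('mean', mean, collections=['scalar_summaries'])
            stddev = tf.sqrt(tf.reduce_mean(tf.square(var - mean)))
            tf.summary.scalar('stddev', stddev, collections=['scalar_summaries'])
            tf.summary.histogram('histogram', var, collections=['histogram_summaries'])

    variable_summaries_split(tf.Variable(tf.zeros([10])))  # example variable

    # Fetch scalar_summary_op every epoch, histogram_summary_op every N epochs.
    scalar_summary_op = tf.summary.merge(tf.get_collection('scalar_summaries'))
    histogram_summary_op = tf.summary.merge(tf.get_collection('histogram_summaries'))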

The usual solution is to compute summaries only every X batches. Since you compute summaries only once per epoch, not every batch, it might be worth logging summaries even less often.
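
For illustration, a minimal self-contained training loop that pays the summary cost only on every 100th step, assuming TF 1.x (the tiny stand-in graph, step counts, and log directory are placeholders):

    import tensorflow as tf

    # Minimal self-contained loop: fetch summaries only every 100th step.
    x = tf.Variable(0.0)
    train_op = tf.assign_add(x, 1.0)         # stand-in for a real training op
    tf.summary.scalar('x', x)
    merged = tf.summary.merge_all()

    with tf.Session() as sess:
        writer = tf.summary.FileWriter('/tmp/train_logs', sess.graph)
        sess.run(tf.global_variables_initializer())
        for step in range(1000):
            if step % 100 == 0:
                # Summary step: pays the extra GPU->CPU copy and disk write.
                _, summ = sess.run([train_op, merged])
                writer.add_summary(summ, step)
            else:
                sess.run(train_op)           # cheap step: no summaries fetched
        writer.close()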

It depends on how many batches you have in your dataset, but usually you don't lose much information by gathering logs a bit less frequently.

Matěj Račinský
  • Is there a way of keeping histograms on the GPU and only copying them back for logging every few epochs, while still keeping the full logging data? – Gulzar Feb 13 '21 at 08:06