I am trying to use the official ResNet model benchmarks from https://github.com/tensorflow/models/blob/master/official/resnet/estimator_benchmark.py#L191 to experiment with the automatic mixed precision (AMP) support included in tensorflow-gpu==1.14.0rc0. I'm running on an RTX 2080 Ti, driver 410.78, CUDA 10, Ubuntu.

I have made the following changes to help make sure the comparisons are quick and apples-to-apples:

  • Reduced epochs to 10.
  • Removed the 2x larger batch size for the tweaked runs so that everything is training on the same number of samples.
  • Set the checkpointing to only happen once, after training is finished.
  • Switched training to use CIFAR-10, since I have that downloaded on local disk.

I see this in the logs, which implies to me that AMP is active:

2019-06-03 16:08:40.976829: I tensorflow/core/grappler/optimizers/auto_mixed_precision.cc:1767] Running auto_mixed_precision graph optimizer
2019-06-03 16:08:40.977057: I tensorflow/core/grappler/optimizers/auto_mixed_precision.cc:1241] No whitelist ops found, nothing to do
2019-06-03 16:08:40.985402: I tensorflow/core/grappler/optimizers/auto_mixed_precision.cc:1241] No whitelist ops found, nothing to do
2019-06-03 16:08:40.986858: I tensorflow/core/grappler/optimizers/auto_mixed_precision.cc:1241] No whitelist ops found, nothing to do
2019-06-03 16:08:40.987745: I tensorflow/core/grappler/optimizers/auto_mixed_precision.cc:1241] No whitelist ops found, nothing to do
2019-06-03 16:08:40.996781: I tensorflow/core/grappler/optimizers/auto_mixed_precision.cc:1241] No whitelist ops found, nothing to do
2019-06-03 16:08:41.001948: I tensorflow/core/grappler/optimizers/auto_mixed_precision.cc:1241] No whitelist ops found, nothing to do
2019-06-03 16:08:41.003208: I tensorflow/core/grappler/optimizers/auto_mixed_precision.cc:1241] No whitelist ops found, nothing to do
2019-06-03 16:08:41.004589: I tensorflow/core/grappler/optimizers/auto_mixed_precision.cc:1241] No whitelist ops found, nothing to do
2019-06-03 16:08:41.005981: I tensorflow/core/grappler/optimizers/auto_mixed_precision.cc:1241] No whitelist ops found, nothing to do
2019-06-03 16:08:41.511761: I tensorflow/core/grappler/optimizers/auto_mixed_precision.cc:1767] Running auto_mixed_precision graph optimizer
2019-06-03 16:08:41.527751: I tensorflow/core/grappler/optimizers/auto_mixed_precision.cc:1723] Converted 529/2910 nodes to float16 precision using 3 cast(s) to float16 (excluding Const and Variable casts)
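
To put a number on how much of the graph was actually converted, here is a minimal sketch that parses log lines like the ones above. The helper name `summarize_amp_log` and the regular expression are my own and only assume the message format shown in this log; they are not part of TensorFlow.

```python
import re

# Two lines copied from the log above; a real run would read the full log file.
LOG = """\
2019-06-03 16:08:41.005981: I tensorflow/core/grappler/optimizers/auto_mixed_precision.cc:1241] No whitelist ops found, nothing to do
2019-06-03 16:08:41.527751: I tensorflow/core/grappler/optimizers/auto_mixed_precision.cc:1723] Converted 529/2910 nodes to float16 precision using 3 cast(s) to float16 (excluding Const and Variable casts)
"""

# Assumed message format, taken from the log line above.
CONVERTED = re.compile(
    r"Converted (\d+)/(\d+) nodes to float16 precision using (\d+) cast\(s\)"
)


def summarize_amp_log(text):
    """Return (number of skipped graphs, list of (converted, total, casts))."""
    skipped = 0
    conversions = []
    for line in text.splitlines():
        if "No whitelist ops found" in line:
            # The optimizer ran on this graph but changed nothing.
            skipped += 1
            continue
        match = CONVERTED.search(line)
        if match:
            converted, total, casts = (int(g) for g in match.groups())
            conversions.append((converted, total, casts))
    return skipped, conversions


skipped, conversions = summarize_amp_log(LOG)
print(f"graphs left untouched: {skipped}")
for converted, total, casts in conversions:
    share = 100 * converted / total
    print(f"{converted}/{total} nodes converted to float16 ({share:.1f}%), {casts} cast(s)")
```

For the conversion line in my log this works out to roughly 18% of the nodes being converted to float16.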

But the actual runtime is slower:

[Plot: the fp32 run (cyan) takes less time than all of the fp16 runs.]

What can I do to see a performance improvement?

Eli Stevens