I want to measure the time it takes my session to run an inference operation.
I measured it with Python's time module, like so:
import time

start = time.clock()
sess.run([ops['pred']], feed_dict=feed_dict)  # single inference pass
end = time.clock()
elapsed_time_batch = end - start
I do this multiple times on the same dataset and average the results. The problem is that I get very different average time measurements (1.7 ms vs. 1.2 ms). Even though it's "only" a 0.5 ms difference, it's a big relative difference (0.5 out of 1.7 is about 30%).
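Roughly, the measurement loop looks like the sketch below (simplified; num_runs here is just a placeholder for however many repetitions I run, not my exact code):

import time

times = []
for _ in range(num_runs):  # num_runs: placeholder for the number of repetitions
    start = time.clock()
    sess.run([ops['pred']], feed_dict=feed_dict)
    end = time.clock()
    times.append(end - start)
elapsed_time_batch = sum(times) / len(times)  # average time per batch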
I tried setting some GPU options for the session, like so:
import os
import tensorflow as tf

config = tf.ConfigProto()
config.gpu_options.per_process_gpu_memory_fraction = 0.95  # pre-allocate most of the GPU memory
config.gpu_options.allow_growth = False
config.allow_soft_placement = True
config.log_device_placement = False

if limit_gpu:
    # restrict the process to the requested GPU
    gpu_idx = str(gpu_idx)
    os.environ["CUDA_VISIBLE_DEVICES"] = gpu_idx
else:
    os.environ["CUDA_VISIBLE_DEVICES"] = "0,1"

sess = tf.Session(config=config)
But this didn't solve it. What could be causing this variation, and how can I stabilize it to get a more reliable timing?
I am running on a Linux server with 4 GPUs (for this test I limited it to a single GPU, a Titan Xp).