I am computing the mean and standard deviation of a large array in numpy. To speed this up, I tried the same in TensorFlow, but TensorFlow was at least ~10x slower. I tried two approaches in TensorFlow (code below). The first approach uses tf.nn.moments(), which has a bug that can cause it to return a negative value for variance. In the second approach I compute the variance from other TensorFlow ops.
I tried CPU-only and GPU; numpy is always faster.
I used time.time() rather than time.clock() so that wall-clock time is measured when running on the GPU.
Why is TensorFlow slower? I thought it might be due to transferring data to the GPU, but TF is slower even for very small datasets (where transfer time should be negligible) and when running on the CPU only. Is this due to the overhead required to initialize TF?
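As background on the timing choice above, here is a minimal stdlib-only sketch of the difference between wall-clock and CPU time; it uses time.process_time(), the Python 3 replacement for time.clock(), and time.sleep() to stand in for work that happens off the CPU (such as a GPU kernel):

```python
import time

# time.time() measures wall-clock time and advances during a sleep;
# time.process_time() only counts CPU time used by this process,
# so it barely moves while the process is blocked off-CPU.
wall_start = time.time()
cpu_start = time.process_time()
time.sleep(0.2)  # stands in for off-CPU work, e.g. a GPU kernel
wall_elapsed = time.time() - wall_start
cpu_elapsed = time.process_time() - cpu_start
print('wall:', wall_elapsed)  # roughly 0.2 s
print('cpu: ', cpu_elapsed)   # close to 0
```

This is why a CPU-time clock would under-report GPU runs, and why time.time() is the appropriate choice here.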
import tensorflow as tf
import numpy
import time
import math
class Timer:
    def __enter__(self):
        self.start = time.time()
        return self

    def __exit__(self, *args):
        self.end = time.time()
        self.interval = self.end - self.start
inData = numpy.random.uniform(low=-1, high=1, size=(40000000,))
with Timer() as t:
    mean = numpy.mean(inData)
print 'python mean', mean, 'time', t.interval

with Timer() as t:
    stdev = numpy.std(inData)
print 'python stdev', stdev, 'time', t.interval
# Approach 1 (Note tf.nn.moments() has a bug)
with Timer() as t:
    with tf.Graph().as_default():
        meanTF, varianceTF = tf.nn.moments(tf.constant(inData), axes=[0])
        init_op = tf.global_variables_initializer()
        with tf.Session() as sess:
            sess.run(init_op)
            mean, variance = sess.run([meanTF, varianceTF])
print 'variance', variance
stdev = math.sqrt(variance)
print 'tensorflow mean', mean, 'stdev', stdev, 'time', t.interval
# Approach 2
with Timer() as t:
    with tf.Graph().as_default():
        inputVector = tf.constant(inData)
        meanTF = tf.reduce_mean(inputVector)
        length = tf.size(inputVector)
        # use the graph's meanTF here, not the numpy mean computed earlier
        varianceTF = tf.divide(tf.reduce_sum(tf.squared_difference(inputVector, meanTF)),
                               tf.to_double(length))
        init_op = tf.global_variables_initializer()
        with tf.Session() as sess:
            sess.run(init_op)
            mean, variance = sess.run([meanTF, varianceTF])
print 'variance', variance
stdev = math.sqrt(variance)
print 'tensorflow mean', mean, 'stdev', stdev, 'time', t.interval
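For reference, the formula used in Approach 2 is the population variance, sum((x - mean)^2) / N, which is what numpy.var and numpy.std compute by default (ddof=0). A small numpy-only sanity check of that equivalence (on a smaller array, so it runs quickly):

```python
import numpy as np

# Population variance as computed in Approach 2, numpy-only:
# variance = sum((x - mean)^2) / N, matching np.var/np.std with ddof=0.
x = np.random.uniform(low=-1, high=1, size=(100000,))
mean = np.mean(x)
variance = np.sum((x - mean) ** 2) / x.size
print('variance matches numpy.var:', np.isclose(variance, np.var(x)))
print('sqrt(variance) matches numpy.std:', np.isclose(np.sqrt(variance), np.std(x)))
```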