Inspired by this question, I tried to measure the number of FLOPs TensorFlow reports for a matrix-matrix multiplication.
For two matrices A and B of sizes (m x p) and (p x n), respectively, the resulting matrix C = AB of size (m x n) has m*n entries. Each entry requires p multiplications and (p-1) additions, so the total number of operations is m*n*(2p-1).
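As a sanity check of this count, here is a naive triple-loop multiplication that tallies its own scalar operations (a sketch of my own; naive_matmul_with_count is not part of TensorFlow):

import numpy as np

def naive_matmul_with_count(A, B):
    # Naive triple loop that counts every scalar multiply and add.
    m, p = A.shape
    p2, n = B.shape
    assert p == p2
    C = np.zeros((m, n))
    ops = 0
    for i in range(m):
        for j in range(n):
            acc = A[i, 0] * B[0, j]
            ops += 1                      # first multiplication
            for k in range(1, p):
                acc += A[i, k] * B[k, j]
                ops += 2                  # one multiply plus one add
            C[i, j] = acc
    return C, ops

C, ops = naive_matmul_with_count(np.random.rand(13, 9), np.random.rand(9, 7))
print(ops)  # 1547 == 13*7*(2*9-1)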
With the code from the linked question/answer, however, TensorFlow reports 2*m*n*p; see the code below.
Why is this approximation returned and not the theoretical value? In the worst case, p=1, the approximation is a factor of 2 too large: the exact count is m*n*(2*1-1) = m*n, while 2*m*n*p gives 2*m*n.
import numpy as np
import tensorflow as tf

g = tf.Graph()
run_meta = tf.RunMetadata()
with g.as_default():
    A = tf.convert_to_tensor(np.random.rand(13, 9))
    B = tf.convert_to_tensor(np.random.rand(9, 7))
    C = tf.matmul(A, B)  # shape=[13, 7]

    # Count the float operations registered for this graph.
    opts = tf.profiler.ProfileOptionBuilder.float_operation()
    flops = tf.profiler.profile(g, run_meta=run_meta, cmd='op', options=opts)

    if flops is not None:
        print('Flops should be', 13*7*(2*9-1))
        print('Approximation 2*13*7*9 =', 2*13*7*9)
        print('TF stats gives', flops.total_float_ops)
# Output:
# Flops should be 1547
# Approximation 2*13*7*9 = 1638
# TF stats gives 1638
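For reference, here are the two counting rules side by side (the helper names are mine; the 2*m*n*p reading, i.e. counting each multiply-add pair as two operations, is my interpretation of the profiler output, not documented behavior):

def exact_matmul_flops(m, n, p):
    # p multiplications and (p - 1) additions per output entry
    return m * n * (2 * p - 1)

def profiler_matmul_flops(m, n, p):
    # Assumption: the profiler seems to count every multiply-add
    # pair as two operations, giving 2*m*n*p in total.
    return 2 * m * n * p

print(exact_matmul_flops(13, 7, 9))     # 1547
print(profiler_matmul_flops(13, 7, 9))  # 1638, matches flops.total_float_ops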