I have a TensorFlow network that I ported to the Caffe format. All the weights and the algorithm are identical in Caffe and TF; I checked this several times. But when I run both frameworks and compare their outputs layer by layer, small differences start to appear.
For example, if the result of the BatchNormalization layer at one point in Caffe is 0.940091, then for TF at the same point it is 0.939559, so the difference is about 0.000532. Very small, but it gets amplified the deeper into the network you go, and at the end the values are very different.
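To convince myself that a discrepancy this tiny really can blow up, I simulated it with NumPy. This is a toy sketch, not my actual network: `W` is a random stand-in weight matrix, and the injected 5e-4 offset stands in for the BatchNormalization difference above.

```python
import numpy as np

rng = np.random.default_rng(0)
# Toy "layer" weights; scale 0.3 makes the layer mildly expansive.
W = (rng.standard_normal((64, 64)) * 0.3).astype(np.float32)

a = rng.standard_normal(64).astype(np.float32)
b = a.copy()
b[0] += np.float32(5e-4)  # inject a BatchNorm-sized discrepancy

for _ in range(30):              # 30 identical "layers"
    a = np.maximum(W @ a, 0.0)   # matmul + ReLU
    b = np.maximum(W @ b, 0.0)

# The gap between the two runs typically ends up orders of
# magnitude larger than the injected 5e-4.
print(np.max(np.abs(a - b)))
```

So even if every layer is mathematically identical, any per-layer rounding difference compounds layer after layer.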
For Convolution the difference is present as well (-2.7540846 in Caffe vs. -2.7540843 in TF), but it gets more noticeable after the first BatchNormalization.
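For what it's worth, that convolution difference is on the order of a single float32 ulp (unit in the last place) near 2.75, so it looks like pure rounding rather than a real bug. A quick check, assuming the outputs are float32:

```python
import numpy as np

# Spacing between adjacent float32 values near 2.75, i.e. one ulp.
ulp = np.spacing(np.float32(2.7540846))
print(ulp)  # 2**-22, about 2.4e-07

# The observed Caffe/TF convolution difference is only 1-2 ulps.
diff = abs(np.float32(-2.7540846) - np.float32(-2.7540843))
print(diff)
```

A 1-2 ulp difference is exactly what you get when the same reduction (here, the convolution's sum of products) is accumulated in a different order.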
I suppose it might be some internal difference in precision, e.g. casting between FP16, FP32, and FP64. I have no idea how TF handles the conversion from the Python graph to the C++ graph. Has anyone encountered the same problem?
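For reference, this is the kind of tolerance-based check I use when comparing the two frameworks' outputs (the arrays below are stand-ins for the real layer outputs; float32 only carries about 7 significant decimal digits, so exact equality is never expected):

```python
import numpy as np

# Stand-ins for the real per-layer outputs from the two frameworks.
caffe_out = np.array([0.940091], dtype=np.float32)
tf_out    = np.array([0.939559], dtype=np.float32)

# Loose tolerance: the outputs agree to ~3 decimal places.
print(np.allclose(caffe_out, tf_out, rtol=1e-3, atol=1e-4))  # True

# Tight tolerance near float32 precision: they do not agree.
print(np.allclose(caffe_out, tf_out, rtol=1e-5, atol=1e-6))  # False
```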