
I've been trying to add some intermediate calculations used in deriving my loss function to the eval_metric_ops dictionary of my evaluation EstimatorSpec, in the hope that they will be useful. I have wrapped each of them in a call to tf.metrics.mean, which seemed to fit my needs.

The return type of this function is a tuple of (mean, update_op), where mean is ostensibly the current mean and update_op is an operation that computes the new mean and returns it.
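
For context, this is how I read the intended usage (a minimal sketch with a made-up placeholder and batch data, just for illustration):

    import tensorflow

    # Placeholder standing in for per-batch values fed during evaluation.
    batch_values = tensorflow.placeholder(tensorflow.float32, shape=[None])
    mean_value, mean_update_op = tensorflow.metrics.mean(batch_values)

    with tensorflow.Session() as sess:
        # The metric's accumulators are local variables.
        sess.run(tensorflow.local_variables_initializer())
        # Accumulate the running mean batch by batch via the update op ...
        for batch in ([1.0, 2.0, 3.0], [4.0, 5.0, 6.0]):
            sess.run(mean_update_op, feed_dict={batch_values: batch})
        # ... then read the accumulated mean from the value tensor.
        print(sess.run(mean_value))  # expected: 3.5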

However, when I evaluate it, the value and update_op parts of the tuple give different results, and as far as I can see the documentation doesn't explain why.

For example, take the following snippet of code:

    import tensorflow

    test_tensor = tensorflow.constant([[1, 2, 3], [4, 5, 6]])
    # tf.metrics.mean returns a (value, update_op) pair.
    test_mean = tensorflow.metrics.mean(test_tensor)

    sess = tensorflow.Session()
    sess.run(tensorflow.global_variables_initializer())
    sess.run(tensorflow.local_variables_initializer())

    # Fetch the whole (value, update_op) tuple several times.
    print(sess.run(test_mean))
    print(sess.run(test_mean))
    print(sess.run(test_mean))
    print(sess.run(test_mean))

    # Fetch only the value tensor, then fetch the pair and take its second element.
    print(sess.run(test_mean[0]))
    print(sess.run(test_mean)[1])

This returns the following:

    (0.0, 3.5)
    (1.75, 3.5)
    (2.3333333, 3.5)
    (2.625, 3.5)
    3.5
    3.5

The second value of the tuple is clearly the overall mean of the input, but the first values seem to be asymptotically approaching 3.5. Meanwhile, taking the zero index of test_mean and evaluating it yields 3.5 directly, as opposed to the value I get by evaluating the whole tuple and then taking the index.

What is happening here?

quant

1 Answer


Yes, it seems there is currently a bug in the running metrics in tf.metrics. A bug has been filed here, and the discussion that triggered it is there.
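
In the meantime, one way to sidestep the ambiguity (this workaround is not from the linked bug report, just a common pattern) is to never fetch the value tensor and the update op in the same sess.run call, so the read cannot interleave with the update:

    import tensorflow

    test_tensor = tensorflow.constant([[1, 2, 3], [4, 5, 6]])
    mean_value, mean_update_op = tensorflow.metrics.mean(test_tensor)

    with tensorflow.Session() as sess:
        # The metric's internal total/count live in local variables.
        sess.run(tensorflow.local_variables_initializer())
        for _ in range(4):
            sess.run(mean_update_op)     # accumulate this "batch" on its own
            print(sess.run(mean_value))  # read afterwards: 3.5 each time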

P-Gn
  • Thanks, that's helpful to know. What is the behaviour that you would expect? (should all values be 3.5?) – quant May 20 '18 at 11:30
  • Yes, that's how I interpret the docs. At the very least the left hand side should be a running mean (either before the update or after), but it currently returns values that are neither and that are simply incorrect. – P-Gn May 20 '18 at 11:33