
I am working with small numbers in tensorflow, which sometimes results in numerical instability.

I would like to increase the precision of my results, or at the very least determine bounds on my result.

The following code shows a specific example of numerical errors (it outputs nan instead of 0.0, because float64 cannot represent 1 + eps/2: the sum rounds to 0.0, so the square root receives the small negative argument -eps/2):

import numpy as np
import tensorflow as tf

# setup
eps=np.finfo(np.float64).eps
v=eps/2
x_init=np.array([v,1.0,-1.0],dtype=np.float64)

x=tf.get_variable("x", initializer=tf.constant(x_init))
square=tf.reduce_sum(x)
root=tf.sqrt(square-v)

# run
with tf.Session() as session:
    init = tf.global_variables_initializer()
    session.run(init)

    ret=session.run(root)
    print(ret)
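As a cross-check outside TensorFlow, a minimal sketch using only plain floats and the standard library shows the same cancellation, and that `math.fsum` (an exactly-rounded sum) avoids it:

```python
import math

import numpy as np

eps = np.finfo(np.float64).eps
v = eps / 2

# naive left-to-right sum: v + 1.0 is exactly halfway between 1.0 and
# 1.0 + eps, so round-half-to-even picks 1.0 and the tiny term is lost
naive = v + 1.0 - 1.0

# math.fsum tracks the rounding errors and returns the exactly rounded sum
exact = math.fsum([v, 1.0, -1.0])

print(naive)      # 0.0, so naive - v is negative and np.sqrt gives nan
print(exact - v)  # 0.0, so the square root is well-defined
```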

I am assuming there is no way to increase the precision of values in tensorflow. But maybe it is possible to set the rounding mode, as in C++ using std::fesetround(FE_UPWARD)? Then, I could force tensorflow to always round up, which would make sure that I am taking the square root of a non-negative number.


What I tried: I tried to follow this question that outlines how to set the rounding mode for python/numpy. However, this does not seem to work, because the following code still prints nan:

import numpy as np
import tensorflow as tf

import ctypes
FE_TONEAREST = 0x0000 # these constants may be system-specific
FE_DOWNWARD = 0x0400
FE_UPWARD = 0x0800
FE_TOWARDZERO = 0x0c00
libc = ctypes.CDLL('libm.so.6') # may need 'libc.dylib' on some systems

libc.fesetround(FE_UPWARD)

# setup
eps=np.finfo(np.float64).eps
v=eps/2
x_init=np.array([v,1.0,-1.0],dtype=np.float64)

x=tf.get_variable("x", initializer=tf.constant(x_init))
square=tf.reduce_sum(x)
root=tf.sqrt(square-v)

# run
with tf.Session() as session:
    init = tf.global_variables_initializer()
    session.run(init)
    ret=session.run(root)
    print(ret)
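For comparison, the same fesetround approach does influence a pure NumPy computation, at least in the main thread. A sketch, with the caveats that the constant and library name assume x86-64 Linux/glibc (the snippet skips itself elsewhere), and that TensorFlow may still ignore the mode because its kernels run on their own threads:

```python
import ctypes

import numpy as np

eps = np.finfo(np.float64).eps
v = eps / 2

FE_TONEAREST = 0x0000  # glibc/x86-64 values; system-specific
FE_UPWARD = 0x0800

try:
    libm = ctypes.CDLL("libm.so.6")
    have_fenv = libm.fesetround(FE_UPWARD) == 0
except OSError:
    have_fenv = False  # no glibc libm available; skip the demo

if have_fenv:
    # with upward rounding, v + 1.0 rounds up to 1.0 + eps, so the sum
    # stays strictly positive and the square root argument is non-negative
    s = np.sum(np.array([v, 1.0, -1.0], dtype=np.float64))
    root = np.sqrt(s - v)
    libm.fesetround(FE_TONEAREST)  # restore the default mode
    print(s, root)
```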
  • You have a bug in your provided code; you're always calculating 0 in your setup: `square=tf.reduce_sum(x) # equals v` followed by `root=tf.sqrt(square-v)` – Multihunter Jan 26 '18 at 04:11
  • Also, can you give any good reason why `eps=np.finfo(np.float32).eps` is unacceptably large? – Multihunter Jan 26 '18 at 04:18
  • @Multihunter This is just a minimal example that illustrates the problem (it should calculate `0`, but it returns `nan`). In real life, these errors can e.g. occur when calculating the sample variance or sample correlation of an array. `eps` is not too large in itself, but it can lead to wrong results due to cancellation (as `nan` in my example). – Peter Jan 26 '18 at 07:31
  • That's what I'm saying. I ran the code, and it always gives 0. My feeling is that tf.reduce_sum() is somehow avoiding falling out of precision. If you replace `tf.reduce_sum(x)` with `x[0]+x[1]+x[2]`, then you get `nan`. I'm using tensorflow v1.3.0. How about you? – Multihunter Jan 29 '18 at 01:20
  • @Multihunter Hmm, interesting. I am using tensorflow version `1.4.1` on `Python 3.5.2`. Seems like the behavior can change from version to version. All the more reason to find a general solution... – Peter Jan 29 '18 at 07:54
  • 1
    I just tested on tensorflow version 1.5.0 (latest, just came out a few days ago), and your code produces `nan`. So, something changed between v1.3 and v1.4 in reduce_sum that changes it's ability to handle intermediary results falling outside precision (and it's still there). If operations like reduce_sum were able to work like tf.reduce_sum does in v1.3.0, would that satisfy your requirements (if not your question)? (I'm assuming that you'll be doing vectorised operations like that) – Multihunter Jan 29 '18 at 08:16
  • @Multihunter I was hoping for a general solution, not one that relies on a specific implementation of `reduce_sum`. But it would probably solve my specific problem, so I would absolutely be grateful for a solution that achieves this! – Peter Jan 29 '18 at 10:26
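The sample-variance cancellation mentioned in the comments is easy to reproduce in NumPy. A minimal sketch comparing the textbook one-pass formula E[x²] − E[x]² against the shifted two-pass formula that np.var uses:

```python
import numpy as np

# three values with a large common offset; the true variance is 2/3
x = np.array([1e9, 1e9 + 1.0, 1e9 + 2.0])

# one-pass formula: both terms are ~1e18, so the subtraction cancels
# essentially all significant digits of the true variance
naive = np.mean(x**2) - np.mean(x)**2

# two-pass formula: subtract the mean first, then square the deviations
twopass = np.var(x)

print(naive, twopass)
```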

1 Answer


Replace

ret=session.run(root)

with

ret = tf.where(tf.is_nan(root), tf.zeros_like(root), root).eval()

Refer to tf.where.
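For anyone checking the idea without a TF session, the same masking can be sketched with the NumPy equivalents (np.where and np.isnan mirror tf.where and tf.is_nan here):

```python
import numpy as np

eps = np.finfo(np.float64).eps
v = eps / 2

square = np.sum(np.array([v, 1.0, -1.0]))  # rounds to 0.0 in float64
with np.errstate(invalid="ignore"):
    root = np.sqrt(square - v)             # nan: the argument is -v

# replace nan results with zeros, keep everything else unchanged
ret = np.where(np.isnan(root), np.zeros_like(root), root)
print(ret)  # 0.0
```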

Sreeragh A R
  • This is a possible work-around, thanks! But I am guessing this makes the function non-differentiable? Also, this only works in this specific case of the square root. It would be nice to have a more generic solution. – Peter Jan 24 '18 at 09:30