
I would like a way to reduce the precision of floats in TensorFlow (approximately: truncate the mantissa) to an arbitrary number of bits within a defined full range. I don't need the code to run entirely in reduced precision (like tf.float16), but rather a series of operations that reduce the precision of a tensor while leaving it in its original type (e.g. tf.float32).

For example, if the full range is 0 to 1, and the precision is 8 bit, 0.1234 would become round(0.1234 * 256) / 256 = 0.125. This uses simple rounding.

I would also like to do statistical rounding, where the probability of rounding to each of the two neighboring values is proportional to how close the value is to that neighbor. For example, 0.1234 * 256 = 31.5904, which would round up to 32/256 59% of the time and down to 31/256 41% of the time.
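To make the arithmetic concrete, here is a small NumPy sketch of both modes (the helper names are illustrative, not part of any API; it assumes a full range of 0 to 1):

```python
import numpy as np

def quantize_simple(x, bits=8):
    # Snap x onto a uniform grid of 2**bits steps over [0, 1].
    n = 2 ** bits
    return np.round(x * n) / n

def quantize_stochastic(x, bits=8):
    # Round up with probability equal to the fractional part of x * n,
    # which makes the quantizer unbiased: E[quantize_stochastic(x)] == x.
    n = 2 ** bits
    frac, whole = np.modf(x * n)
    return (whole + np.random.binomial(1, frac)) / n

print(quantize_simple(0.1234))   # 0.125
# 0.1234 * 256 = 31.5904, so quantize_stochastic returns 32/256 = 0.125
# about 59% of the time and 31/256 ~= 0.12109 otherwise.
```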

Extra question: how can I take an existing graph and modify it to add this rounding after every convolution?


1 Answer


The only tricky part is providing gradients for the rounding operation. The built-in tf.round has no gradient implemented, but you can implement your own rounding operation (statistical or simple rounding both work) as shown here: Tensorflow: How to write op with gradient in python?

Where you can simply use:

grad(round(T)) = round(grad(T))

Once you have your custom round operation that passes gradients through, you can simply do:

def reduce_precision(tensor, precision_bits=8):
    # `round` is the custom gradient-carrying rounding op described above.
    N = 2 ** precision_bits
    return round(N * tensor) / N
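As a quick sanity check on the arithmetic, here is a NumPy stand-in (using np.round in place of the custom gradient-carrying op, which is an assumption here; the name `reduce_precision_np` is illustrative):

```python
import numpy as np

def reduce_precision_np(tensor, precision_bits=8):
    # NumPy stand-in for the TensorFlow version: same math, no gradients.
    n = 2 ** precision_bits
    return np.round(n * np.asarray(tensor)) / n

print(reduce_precision_np([0.1234, 0.5, 0.9999]))  # -> 0.125, 0.5, 1.0
```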

And for the stochastic rounding, you can create a simple NumPy function like:

def stochastic_round(x):
    # np.modf returns the fractional part first, then the integer part.
    frac, whole = np.modf(x)
    # Round up with probability equal to the fractional part.
    return whole + np.random.binomial(1, frac)
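One property worth checking is that stochastic rounding is unbiased on average. A self-contained demo (the function name is illustrative; note that np.modf returns the fractional part first, then the integer part):

```python
import numpy as np

def stochastic_round_demo(x):
    # np.modf returns (fractional, integer) parts, in that order.
    frac, whole = np.modf(x)
    # Round up with probability equal to the fractional part.
    return whole + np.random.binomial(1, frac)

samples = [stochastic_round_demo(31.5904) for _ in range(10000)]
print(np.mean(samples))  # close to 31.5904: the rounding is unbiased
```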

and then TensorFlow-ize it as shown in How to make a custom activation function with only Python in Tensorflow?

where you can define its gradient operation as

def grad_stochastic_round(op, grad):
    # Apply the same stochastic rounding to the incoming gradient.
    return stochastic_round(grad)
  • Cool! I think in my case grad(round(T)) = round(grad(T)). Any tips for how to do statistical rounding? – Alex I Mar 28 '17 at 20:20
  • You can do it with a normal python / numpy function and convert it to tensorflow as shown here: http://stackoverflow.com/questions/39921607/tensorflow-how-to-make-a-custom-activation-function-with-only-python/39921608#39921608 – patapouf_ai Mar 28 '17 at 20:27
  • 1
    the python function itself is very simple since you can do something like `def round(x): np.floor(x) + np.random.binomial(1,np.modf(x)[0])` – patapouf_ai Mar 28 '17 at 20:34
  • and `def grad_round(op, grad): return grad` – patapouf_ai Mar 28 '17 at 20:34