I would like a way to reduce the precision of floats in TensorFlow (approximately: truncate the mantissa) to an arbitrary number of bits within a defined full range. I don't need to run entirely in a reduced-precision type (like tf.float16); rather, I want a series of operations that reduce the precision of a tensor while leaving it in its original type (e.g. tf.float32).
For example, if the full range is 0 to 1 and the precision is 8 bits, 0.1234 would become round(0.1234 * 256) / 256 = 0.125. This is simple (deterministic) rounding.
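Roughly what I have in mind for the simple-rounding case is something like the sketch below (TF 2.x eager style; `quantize`, `min_val`, and `max_val` are placeholder names I made up):

```python
import tensorflow as tf

def quantize(x, bits=8, min_val=0.0, max_val=1.0):
    """Round x to 2**bits levels over [min_val, max_val], keeping float32."""
    levels = 2.0 ** bits                      # e.g. 256 levels for 8 bits
    scale = levels / (max_val - min_val)
    return tf.round((x - min_val) * scale) / scale + min_val

x = tf.constant([0.1234], dtype=tf.float32)
print(quantize(x).numpy())                    # [0.125], still tf.float32
```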
I would also like to do statistical (stochastic) rounding, where a value is rounded up with probability equal to the fractional part of the scaled value, and rounded down otherwise. For example, 0.1234 * 256 = 31.5904, which would round up to 32/256 about 59% of the time, and down to 31/256 about 41% of the time.
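And a corresponding sketch of the statistical version, again only to illustrate the intent (not necessarily the best way to express it in TensorFlow):

```python
def stochastic_quantize(x, bits=8, min_val=0.0, max_val=1.0):
    """Like quantize(), but round up with probability equal to the
    fractional part of the scaled value, and down otherwise."""
    levels = 2.0 ** bits
    scale = levels / (max_val - min_val)
    scaled = (x - min_val) * scale            # 0.1234 -> 31.5904
    floor = tf.floor(scaled)
    frac = scaled - floor                     # 0.5904 -> round up ~59% of the time
    rounded = floor + tf.cast(tf.random.uniform(tf.shape(x)) < frac, x.dtype)
    return rounded / scale + min_val
```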
Extra question: how can I take an existing graph and modify it to insert this rounding after every convolution?
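To make the extra question concrete: on a toy model built by hand, the effect I'm after would look like the snippet below (using the hypothetical quantize() from above inside a Lambda layer). The question is how to get this effect on a graph that already exists, rather than rewriting the model definition.

```python
# Hand-built toy model showing where I'd want the rounding inserted --
# in practice the graph already exists and I want to rewrite it instead.
inputs = tf.keras.Input(shape=(32, 32, 3))
h = tf.keras.layers.Conv2D(16, 3, padding="same")(inputs)
h = tf.keras.layers.Lambda(lambda t: quantize(t, bits=8))(h)
h = tf.keras.layers.Conv2D(16, 3, padding="same")(h)
h = tf.keras.layers.Lambda(lambda t: quantize(t, bits=8))(h)
model = tf.keras.Model(inputs, h)
```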