This answer builds on the `float8` aspect of the question; the accepted answer covers the rest pretty well. One of the main reasons there isn't a widely accepted `float8` type, besides the lack of a standard, is that it's not very useful in practice.
Primer on Floating Point
In standard notation, a `float[n]` data type is stored using `n` bits in memory, so at most 2^n unique values can be represented. In IEEE 754, a handful of these bit patterns, like `nan`, aren't even numbers as such. That means every floating point representation (even `float256`) has gaps in the set of rational numbers it can represent, and any number that falls in a gap is rounded to the nearest representable value. Generally, the higher `n` is, the smaller these gaps are.
You can see the gap in action if you use the `struct` package to get the binary representation of some `float32` numbers. It's a bit startling to run into at first, but near one billion the gap between adjacent `float32` values is 64, so dozens of consecutive integers collapse to a single representation:
import struct

billion_as_float32 = struct.pack('f', 1000000000)
for i in range(32):
    # every integer from 1000000000 to 1000000031 packs to the same 4 bytes
    assert struct.pack('f', 1000000000 + i) == billion_as_float32
Generally, floating point is best at tracking only the most significant bits, so that if your numbers have the same scale, the important differences are preserved. Floating point formats differ mainly in how they distribute the available bits between a significand and an exponent. For instance, IEEE 754 `float32` uses 1 sign bit, 8 exponent bits, and 23 stored significand bits (24 effective, counting the implicit leading bit).
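As an illustration (the helper name here is mine, not part of any library), you can pull those three fields apart with the same `struct` module by reinterpreting the four packed bytes as an unsigned integer:

```python
import struct

def float32_fields(x):
    """Split a float32 bit pattern into its sign, exponent, and significand fields."""
    (bits,) = struct.unpack('<I', struct.pack('<f', x))
    sign = bits >> 31                 # 1 bit
    exponent = (bits >> 23) & 0xFF    # 8 bits, biased by 127
    significand = bits & 0x7FFFFF     # 23 stored bits; the leading 1 is implicit
    return sign, exponent, significand

print(float32_fields(1.0))   # (0, 127, 0): biased exponent 127 means 2**0
print(float32_fields(-2.0))  # (1, 128, 0): sign bit set, 2**1
```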
Back to float8
By the above logic, a `float8` value can only ever take on 256 distinct values, no matter how cleverly you split the bits between significand and exponent. Unless you're keen on it rounding your numbers to one of 256 fixed values clustered near zero, it's probably more efficient to just track the 256 possibilities in an `int8`.
For instance, if you wanted to track a very small range with coarse precision, you could divide that range into 256 points and store which of the 256 points your number was closest to. If you wanted to get really fancy, you could use a non-linear distribution of values, clustered either at the centre or at the edges, depending on what mattered most to you.
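A minimal sketch of that idea (the function names and the [0, 1] range are my choices, not any standard scheme): linearly quantise a fixed range into 256 evenly spaced levels and store only the level index, which fits in a single byte:

```python
def quantize(x, lo=0.0, hi=1.0):
    """Map x in [lo, hi] to the nearest of 256 evenly spaced levels (fits in one byte)."""
    step = (hi - lo) / 255
    level = round((x - lo) / step)
    return max(0, min(255, level))    # clamp out-of-range inputs

def dequantize(level, lo=0.0, hi=1.0):
    """Recover the representative value for a stored level index."""
    return lo + level * (hi - lo) / 255

level = quantize(0.3)
# reconstruction error is at most half a step, i.e. (hi - lo) / 510
print(level, dequantize(level))
```

A non-linear variant would simply replace the evenly spaced levels with a lookup table of 256 hand-picked values, at the cost of a slower search when encoding.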
The likelihood of anyone else (or even yourself, later on) needing this exact scheme is extremely small, and most of the time the extra byte or three you pay as a penalty for using `float16` or `float32` instead is too small to make a meaningful difference. Hence... almost no one bothers to write up a `float8` implementation.