For FP16, the minimum positive normal value is:
               1       0
               5 43210 9876543210
               S -E5-- ---F10----
       Binary: 0 00001 0000000000
          Hex: 0400
    Precision: HP
         Sign: Positive
     Exponent: -14 (Stored: 1, Bias: 15)
    Hex-float: +0x1p-14
        Value: +6.1035156e-5 (NORMAL)
The minimum positive subnormal value is:
               1       0
               5 43210 9876543210
               S -E5-- ---F10----
       Binary: 0 00000 0000000001
          Hex: 0001
    Precision: HP
         Sign: Positive
     Exponent: -14 (Stored: 0, Bias: 14)
    Hex-float: +0x1p-24
        Value: +5.9604645e-8 (DENORMAL)
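In your program, you can write the former as the hexadecimal floating-point literal 0x1p-14 and the latter as 0x1p-24 (such literals are supported in C99 and later, and in C++17 and later).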
If you want to convert from the underlying hexadecimal representation, a common trick is to use a union in C and memcpy in C++. See this answer for details: How is 1 encoded in C/C++ as a float (assuming IEEE 754 single precision representation)?
Of course, to do this properly you'd need an underlying 16-bit float type, which is typically not available. So you'll first have to figure out what the corresponding hexadecimal would be in the 32-bit single-precision format. For 0x1p-24, that's easy to compute in single precision:
               3  2          1         0
               1 09876543 21098765432109876543210
               S ---E8--- ----------F23----------
       Binary: 0 01100111 00000000000000000000000
          Hex: 3380 0000
    Precision: SP
         Sign: Positive
     Exponent: -24 (Stored: 103, Bias: 127)
    Hex-float: +0x1p-24
        Value: +5.9604645e-8 (NORMAL)
So the corresponding representation as a single-precision float is 0x33800000. (This is not hard to see: the bias for a 32-bit float is 127, so you put 103 in the exponent field to get -24, since 103 - 127 = -24, and leave the fraction bits at zero. I trust you can do that easily yourself; if not, ask away.)
Now you can write:
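    #include <cstdint>
    #include <cstring>   // std::memcpy
    #include <iostream>

    int main() {
        std::uint32_t abc = 0x33800000;  // bit pattern of 0x1p-24 as a 32-bit float
        float i;
        // Copy the raw bits into the float; assumes float is 32-bit IEEE 754.
        std::memcpy(&i, &abc, sizeof i);
        std::cout << i << std::endl;
        return 0;
    }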
which prints:
5.96046e-08