How can I store the float16 (https://en.wikipedia.org/wiki/Half-precision_floating-point_format) maximum value in float32 (https://en.wikipedia.org/wiki/Single-precision_floating-point_format) format?
I want a function that converts 0x7bff to 65504. 0x7bff is the largest finite value that can be represented in half-precision floating point:
0 11110 1111111111 -> decimal value: 65504
I want 0x7bff to appear in my program as the actual bits of that value:
float fp16_max = bit_cast(0x7bff);
// want "std::cout << fp16_max" to print 65504
I tried to implement such a function, but it doesn't work:
float bit_cast(uint32_t fp16_bits) {
    float i;
    memcpy(&i, &fp16_bits, 4);
    return i;
}
float test = bit_cast(0x7bff);
// printing test gives 4.44814e-41
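For context, the output above is what you would expect if the 32-bit pattern 0x00007bff is read directly as a float32: the 8-bit exponent field is all zeros, so the value is a subnormal, 31743 × 2^-149 ≈ 4.448e-41. Copying the bits does not translate between the two layouts. A conversion has to move the sign bit, re-bias the 5-bit exponent (bias 15) to the 8-bit one (bias 127), and shift the 10-bit mantissa into the top of the 23-bit field. The following is only a minimal sketch of that idea, assuming IEEE 754 binary16 input and a 32-bit IEEE 754 `float`; the name `half_bits_to_float` is hypothetical and not from any library.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper (the name is an assumption): widen IEEE 754 binary16
// bits to a float, assuming float is IEEE 754 binary32.
float half_bits_to_float(std::uint16_t h) {
    const std::uint32_t sign = (h & 0x8000u) << 16;  // sign moves to bit 31
    std::uint32_t exp  = (h >> 10) & 0x1Fu;          // 5-bit exponent field
    std::uint32_t mant = h & 0x3FFu;                 // 10-bit mantissa field
    std::uint32_t bits;

    if (exp == 0x1Fu) {
        // Infinity or NaN: all-ones exponent, keep the mantissa payload.
        bits = sign | 0x7F800000u | (mant << 13);
    } else if (exp != 0) {
        // Normal number: re-bias the exponent (127 - 15 = 112).
        bits = sign | ((exp + 112u) << 23) | (mant << 13);
    } else if (mant == 0) {
        // Signed zero.
        bits = sign;
    } else {
        // Subnormal half: shift until the implicit leading 1 appears,
        // lowering the exponent once per shift.
        exp = 113u;
        while ((mant & 0x400u) == 0) {
            mant <<= 1;
            --exp;
        }
        bits = sign | (exp << 23) | ((mant & 0x3FFu) << 13);
    }

    float f;
    std::memcpy(&f, &bits, sizeof f);  // reinterpret the float32 bit pattern
    return f;
}
```

With this sketch, `half_bits_to_float(0x7bff)` works out to 2^15 × (1 + 1023/1024) = 65504, so streaming the result to `std::cout` should print 65504.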