I have been struggling to find a portable way to serialize 32-bit float variables in C and C++ for sending to and from microcontrollers. I want the format to be well-defined enough that serialization and deserialization can also be done from other languages without too much effort. Related questions are:
Portability of binary serialization of double/float type in C++
Serialize double and float with C
c++ portable conversion of long to double
I know that in most cases a cast, a union, or memcpy will work just fine because the float representation is the same, but I would prefer to have a bit more control and peace of mind. What I came up with so far is the following:
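    #include <math.h>
    #include <stdint.h>

    void serialize_float32(uint8_t *buffer, float number, int32_t *index) {
        int e = 0;
        float sig = frexpf(number, &e);   // number == sig * 2^e, with 0.5 <= |sig| < 1
        float sig_abs = fabsf(sig);
        uint32_t sig_i = 0;

        if (sig_abs >= 0.5f) {
            // Drop the implicit leading bit and scale the remainder to 23 bits.
            sig_i = (uint32_t)((sig_abs - 0.5f) * 2.0f * 8388608.0f);
            e += 126;
        }

        // Cast before shifting so the shift happens in 32 bits even where int is 16 bits.
        uint32_t res = ((uint32_t)(e & 0xFF) << 23) | (sig_i & 0x7FFFFF);
        if (sig < 0) {
            res |= (uint32_t)1 << 31;     // 1 << 31 would overflow a signed int
        }

        // Most significant byte first.
        buffer[(*index)++] = (res >> 24) & 0xFF;
        buffer[(*index)++] = (res >> 16) & 0xFF;
        buffer[(*index)++] = (res >> 8) & 0xFF;
        buffer[(*index)++] = res & 0xFF;
    }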
and
    #include <stdbool.h>   // for bool in C; not needed in C++

    float deserialize_float32(const uint8_t *buffer, int32_t *index) {
        uint32_t res = ((uint32_t)buffer[*index] << 24) |
                       ((uint32_t)buffer[*index + 1] << 16) |
                       ((uint32_t)buffer[*index + 2] << 8) |
                       ((uint32_t)buffer[*index + 3]);
        *index += 4;

        int e = (res >> 23) & 0xFF;
        uint32_t sig_i = res & 0x7FFFFF;
        bool neg = (res >> 31) != 0;      // avoids the signed shift 1 << 31
        float sig = 0.0f;

        if (e != 0 || sig_i != 0) {
            // Restore the implicit leading bit so that sig lands in [0.5, 1).
            sig = (float)sig_i / (8388608.0f * 2.0f) + 0.5f;
            e -= 126;
        }
        if (neg) {
            sig = -sig;
        }
        return ldexpf(sig, e);
    }
The frexp and ldexp functions seem to be made for this purpose, but in case they are not available, I also tried implementing them manually using more widely available functions:
    float frexpf_slow(float f, int *e) {
        if (f == 0.0f) {
            *e = 0;
            return 0.0f;
        }
        *e = (int)ceilf(log2f(fabsf(f)));
        float res = f / powf(2.0f, (float)*e);
        // Make sure that the magnitude stays below 1 so that no overflow occurs
        // during serialization. This seems to be required after doing some manual
        // testing: exact powers of two give a magnitude of exactly 1, and rounding
        // in log2f may leave it slightly above 1. Halving (rather than subtracting
        // 0.5) keeps res * 2^e equal to f in both cases.
        if (res >= 1.0f || res <= -1.0f) {
            res *= 0.5f;
            *e += 1;
        }
        return res;
    }
and
    float ldexpf_slow(float f, int e) {
        return f * powf(2.0f, (float)e);
    }
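On a host where the library functions do exist, a replacement like the one above can be compared directly against frexpf. The following is only a minimal sketch of such a check; `frexpf_fn`, `frexpf_agrees` and `count_frexpf_mismatches` are hypothetical names introduced for illustration, and it assumes finite inputs (no Inf or NaN).

```c
#include <math.h>
#include <stdbool.h>

/* Type of any frexpf-style function: returns the significand and
 * stores the exponent through the pointer. */
typedef float (*frexpf_fn)(float, int *);

/* Hypothetical helper: true if `candidate` splits `value` into exactly the
 * same significand and exponent as the library frexpf.
 * Assumes `value` is finite. */
static bool frexpf_agrees(frexpf_fn candidate, float value) {
    int e_ref = 0;
    int e_cand = 0;
    float sig_ref = frexpf(value, &e_ref);
    float sig_cand = candidate(value, &e_cand);
    return sig_ref == sig_cand && e_ref == e_cand;
}

/* Hypothetical helper: number of entries in `values` on which `candidate`
 * disagrees with the library frexpf. */
static int count_frexpf_mismatches(frexpf_fn candidate,
                                   const float *values, int count) {
    int mismatches = 0;
    for (int i = 0; i < count; ++i) {
        if (!frexpf_agrees(candidate, values[i])) {
            ++mismatches;
        }
    }
    return mismatches;
}
```

Calling `count_frexpf_mismatches(frexpf_slow, values, n)` with an array that includes exact powers of two and values just above and below them would exercise the cases where `log2f` rounding matters most.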
One thing I have been considering is whether to use 8388608 (2^23) or 8388607 (2^23 - 1) as the multiplier. The documentation says that frexp returns values that are less than 1 in magnitude, and after some experimentation it seems that 8388608 gives results that are bit-accurate with actual floats; I could not find any corner case where this overflows. That might not hold with a different compiler or system, though. If this can become a problem, a smaller multiplier that reduces the accuracy a bit is fine with me as well. I know that this does not handle Inf or NaN, but for now that is not a requirement.
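One way to test the bit-accuracy claim is to compare the frexpf-based encoding with the raw bit pattern obtained through memcpy. The sketch below assumes that float is IEEE 754 binary32 on the machine running the check. Under that assumption a normal float's significand has 24 bits, so frexpf returns m / 2^24 with 2^23 <= m < 2^24, and (sig_abs - 0.5) * 2 * 2^23 equals m - 2^23, an integer strictly below 2^23. The names `float_bits_memcpy`, `float_bits_frexp` and `float_bits_match` are hypothetical helpers, not part of any library.

```c
#include <float.h>
#include <math.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Assumption: float is IEEE 754 binary32, hence as wide as uint32_t. */
_Static_assert(sizeof(float) == sizeof(uint32_t), "float must be 32 bits wide");

/* Reference: the raw bit pattern exactly as the platform stores it. */
static uint32_t float_bits_memcpy(float number) {
    uint32_t bits = 0;
    memcpy(&bits, &number, sizeof bits);
    return bits;
}

/* Same arithmetic as serialize_float32, but returning the 32-bit word
 * instead of writing it out byte by byte. */
static uint32_t float_bits_frexp(float number) {
    int e = 0;
    float sig = frexpf(number, &e);
    float sig_abs = fabsf(sig);
    uint32_t sig_i = 0;
    if (sig_abs >= 0.5f) {
        /* sig_abs < 1, so this product stays below 8388608 (2^23). */
        sig_i = (uint32_t)((sig_abs - 0.5f) * 2.0f * 8388608.0f);
        e += 126;
    }
    uint32_t res = ((uint32_t)(e & 0xFF) << 23) | (sig_i & 0x7FFFFF);
    if (sig < 0) {
        res |= (uint32_t)1 << 31;
    }
    return res;
}

/* True if both methods produce the same 32-bit word for `number`. */
static bool float_bits_match(float number) {
    return float_bits_memcpy(number) == float_bits_frexp(number);
}
```

The value closest to the overflow boundary is the largest float below 1, `0x1.fffffep-1f`, for which frexpf returns the number itself with exponent 0. Inf, NaN, subnormals and negative zero are not expected to match, since the frexpf-based code has no special handling for them.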
So, finally, my question is: Does this look like a reasonable approach, or am I just making a complicated solution that still has portability issues?