Reverse engineering unknown floating point format

Question

I'm trying to reverse engineer an old file format (from an old version of Cinema4D), for which I cannot find a specification.

In this file format, I've found that float values are stored as four bytes, but they don't appear to be in the usual IEEE format, and this is not an endianness issue. I've spent a lot of time lately with hex<->float conversion tools trying to figure this out.

Here are some sample values:

0     = 00 00 00 00
1     = 80 00 00 41
2     = 80 00 00 42
4     = 80 00 00 43
8     = 80 00 00 44

0.25  = 80 00 00 3F
16384 = 80 00 00 4F

My observation from the two lines above is that something seems to wrap around when going from 3F to 4F.

1.5  = C0 00 00 41
2.5  = A0 00 00 42

-1   = 80 00 00 C1
-1.5 = C0 00 00 C1
-2   = 80 00 00 C2
-3   = C0 00 00 C2

So, here are some observations:

Increasing the last byte by 1 doubles the value.
If the high bit of the last byte is set, the number is negative.
The first byte does something with non-integer values.

Although there are some obvious patterns, and there is clearly some exponent/mantissa scheme at work, I haven't been able to figure this out. Maybe I'm missing something obvious and it's plain IEEE after all? Figuring out how many bits are used for the mantissa, exponent, etc. isn't the problem (in the examples above, the two middle bytes are zero); first I need to figure out the formula that yields the floating-point value.

It could be that the first three bytes are the full significand (with the leading 1 included). — Sneftel, Jan 13 '16 at 18:35
@Sneftel: That's what it looks like to me, too. Treat the 1st three bytes as giving a significand in `[0.5, 1.0)`, top bit of last byte is the sign, and remaining 7 bits give an excess-64 exponent. This doesn't match any common floating-point format that I'm aware of, though (not IEEE 754, not VAX, not IBM, not Cray, ...). — Mark Dickinson, Jan 13 '16 at 18:45
Thank you very much both, this is exactly what it is (this is probably before fpu was common, and maybe it's faster to implement in software when leading 1 is included) — user1643428, Jan 13 '16 at 19:04
If they are showing a leading one in the significand, watch out for the possibility of non-normalized numbers in your input. That is, it may be possible to also express 2 as 40 00 00 43. — Patricia Shanahan, Jan 14 '16 at 00:12
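
To act on the caution about non-normalized encodings, input words could be flagged before they are decoded. The following is a minimal sketch, assuming the layout suggested in the comments above (three significand bytes followed by a sign/exponent byte, with all-zero meaning 0). The names `ffp_class` and `ffp_classify` are hypothetical and introduced only for illustration; they are not part of any Amiga or Cinema 4D API.

```c
#include <stdint.h>

/* Hypothetical classification of a raw 32-bit word (illustration only). */
typedef enum { FFP_ZERO, FFP_NORMALIZED, FFP_UNNORMALIZED } ffp_class;

ffp_class ffp_classify (uint32_t a)
{
    uint32_t mant = a >> 8;   /* upper three bytes: 24-bit significand */

    if (a == 0) {
        return FFP_ZERO;      /* all 32 bits clear encodes zero */
    }
    /* A normalized significand has its top bit (bit 23) set.
       Anything else, including a zero significand with a nonzero
       low byte, is reported as unnormalized. */
    return (mant & 0x800000u) ? FFP_NORMALIZED : FFP_UNNORMALIZED;
}
```

With this sketch, `80 00 00 42` (the sample encoding of 2) is classified as normalized, while the alternative encoding `40 00 00 43` mentioned in the comment is classified as unnormalized. How such words should be interpreted is a separate question that this check does not answer.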

njuffa · Answer 1 · 2021-03-18T23:17:33.527

The clue here is that Cinema 4D had its debut on the Commodore Amiga platform, which used the Motorola Fast Floating Point (FFP) format, apparently designed for easy software emulation. It is explained in chapter 35 of the Amiga ROM Kernel Reference Manual:

The mantissa is considered to be a binary fixed-point fraction; except for 0, it is always normalized (the mantissa is shifted over and the exponent adjusted, so that the mantissa has a 1 bit in its highest position). Thus, it represents a value of less than 1 but greater than or equal to 1/2.

The exponent is the power of two needed to correctly position the mantissa to reflect the number's true arithmetic value. It is held in excess-64 notation, which means that the two's-complement values are adjusted upward by 64, thus changing $40 (-64) through $3F (+63) to $00 through $7F

The value of 0 is defined as all 32 bits being 0s

The mantissa bits are stored in the three most significant bytes, while the least significant byte holds the sign bit in its most significant bit and the biased exponent in its least significant seven bits. Except for zero, the numeric value of a 32-bit number x is therefore (-1)^(x<7>) * (x<31:8> / 2^24) * 2^(x<6:0> - 64), where x<hi:lo> denotes bits hi down to lo of x.
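
As a check of the formula, here it is applied by hand to two of the sample values from the question, 2.5 (`A0 00 00 42`) and -3 (`C0 00 00 C2`). The bytes are assumed to form the 32-bit word in the order shown, most significant byte first.

```latex
\begin{aligned}
x &= \mathtt{0xA0000042}: &
(-1)^{0} \cdot \frac{\mathtt{0xA00000}}{2^{24}} \cdot 2^{\mathtt{0x42} - 64}
  &= \frac{10485760}{16777216} \cdot 2^{2} = 0.625 \cdot 4 = 2.5 \\
x &= \mathtt{0xC00000C2}: &
(-1)^{1} \cdot \frac{\mathtt{0xC00000}}{2^{24}} \cdot 2^{\mathtt{0x42} - 64}
  &= -\frac{12582912}{16777216} \cdot 2^{2} = -0.75 \cdot 4 = -3
\end{aligned}
```

In the second case the low byte `C2` splits into the sign bit (1) and the seven exponent bits `0x42`, which is why both values share the same exponent.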

Based on this, the following ISO C99 code provides a function decode_ffp() that returns the numeric value of an FFP floating-point number supplied as an unsigned 32-bit integer. Note that the behavior for pseudo-zero and unnormalized encodings is left undefined, as the official documentation does not state how they should be treated.

#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
#include <math.h>

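/* Decode an FFP value held in a 32-bit word: bits 31:8 are the
   significand, bit 7 is the sign, bits 6:0 are the excess-64 exponent. */
float decode_ffp (uint32_t a)
{
    const int32_t  FFP_EXPO_BIAS = 64;
    const uint32_t FFP_MANT_BITS = 24;
    const uint32_t FFP_EXPO_BITS = 7;
    const uint32_t FFP_EXPO_MASK = (1u << FFP_EXPO_BITS) - 1;
    uint32_t mant = a >> (FFP_EXPO_BITS + 1);   /* 24-bit significand */
    uint32_t sign = (a >> FFP_EXPO_BITS) & 1;   /* 1 means negative */
    /* convert to signed before removing the bias to avoid unsigned wrap-around */
    int32_t expo = (int32_t)(a & FFP_EXPO_MASK) - FFP_EXPO_BIAS;
    float val;

    if (a == 0) {
        val = 0.0f;   /* zero is encoded as all 32 bits clear */
    } else {
        /* the significand is a fraction in [0.5, 1): scale by 2^expo and 2^-24 */
        val = exp2f ((float)expo) * (float)mant / (float)(1u << FFP_MANT_BITS);
        val = (sign) ? (-val) : val;
    }
    return val;
}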

int main (void)
{
    uint32_t test_vec[] = {
        0x00000000,
        0x80000041,
        0x80000042,
        0x80000043,
        0x80000044,
        0x8000003F,
        0x8000004F,
        0xC0000041,
        0xA0000042,
        0x800000C1,
        0xC00000C1,
        0x800000C2,
        0xC00000C2
    };
    int num_test_vec = sizeof test_vec / sizeof test_vec[0];

    for (int i = 0; i < num_test_vec; i++) {
        printf ("%08x ==> % 15.8e\n", test_vec[i], decode_ffp (test_vec[i]));
    }
    return EXIT_SUCCESS;
}
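
The test vectors above are already 32-bit words, whereas the question lists individual bytes as they appear in the file. A minimal sketch of the missing step follows, assuming the file stores each value most significant byte first, in the order shown in the question (which would match the big-endian 68000 CPU of the Amiga). The name `ffp_word_from_bytes` is hypothetical and introduced only for illustration.

```c
#include <stdint.h>

/* Hypothetical helper: assemble four bytes, most significant first,
   into the 32-bit word expected by a decoder such as decode_ffp(). */
uint32_t ffp_word_from_bytes (const uint8_t b[4])
{
    return ((uint32_t)b[0] << 24) |
           ((uint32_t)b[1] << 16) |
           ((uint32_t)b[2] <<  8) |
            (uint32_t)b[3];
}
```

Building the word from explicit shifts, rather than copying the bytes into a `uint32_t`, keeps the result independent of the byte order of the machine running the code. For the bytes `80 00 00 41` this yields 0x80000041, the second test vector above.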
