what's this decompiled f2xm1/fscale sequence meant to do?

Question

I am trying to reverse engineer some decomiled code which originally had been written in C/C++, i.e. I suspect that the below FPU related code sequence is probably derived from some simple C-code "double" handling that justs looks more complicated in the generated assembly code. Leading up to this point, some floating point multiplications had been performed with the result in ST0 (corresponding to d1). I've read the docs on what the underlying FPU operations technically do, still the intention of the respective code sequence still isn't obvious to me.

  d1 = (float10)1.442695040888963 *(float10)0.6931471805599453 * (float10)DOUBLE_00430088 * (float10)param_1[0x58];
  d2 = ROUND(d1);
  d1 = (float10)f2xm1(d1 - d2);
  d1 = (float10)fscale((float10)1 + d1,d2);

what is the intended change performed to the original d1 result, i.e. what would the C code have done with the original d1 "double"?

PS: below the actual x86 code (just in case Ghidra misinterpreted something while decompiling):

                     * Push ST(i) onto the FPU register stack, i.e.               *
                     * duplicate ST0                                              *
                     **************************************************************
0040aeef d9 c0           FLD        ST0
                     **************************************************************
                     * Rounds the source value in the ST(0) register              *
                     * to the nearest integral value, depending on                *
                     * the current rounding mode (lets suppose some               *
                     * "floor" mode was used?)                                    *
                     **************************************************************
0040aef1 d9 fc           FRNDINT
                     **************************************************************
                     * Exchange the contents of ST(0) and ST(1).                  *
                     **************************************************************
0040aef3 d9 c9           FXCH
                     **************************************************************
                     * get fractional part?                                       *
                     **************************************************************
0040aef5 d8 e1           FSUB       d2[0],d1[0]
                     **************************************************************
                     * Computes the exponential value of 2 to the power           *
                     * of the source operand minus 1.                             *
                     **************************************************************
0040aef7 d9 f0           F2XM1
                     **************************************************************
                     * Push +1.0 onto the FPU register stack.                     *
                     **************************************************************
0040aef9 d9 e8           FLD1
                     **************************************************************
                     * Add ST(0) to ST(1), store result in ST(1),                 *
                     * and pop the register stack.                                *
                     **************************************************************
0040aefb de c1           FADDP
                     **************************************************************
                     * Scale ST(0) by ST(1).  This instruction provides           *
                     * rapid multiplication or division by integral               *
                     * powers of 2.                                               *
                     **************************************************************
0040aefd d9 fd           FSCALE
                     **************************************************************
                     * The FSTP instruction copies the value in the ST(0)         *
                     * register to the destination operand and then pops          *
                     * the register stack.                                        *
                     **************************************************************
0040aeff dd d9           FSTP       d1[0]

`FSUB d2[0],d1[0]` is not a valid x87 instruction in standard assembly syntax (Intel or AT&T) I recognize. Can you provide standard disassembly? — njuffa, Jan 09 '22 at 22:34
@njuffa: Presumably it's tracking source-level operand names through the disassembly. But yeah, good point that it doesn't clearly show which one is ST(0)! The C decompilation is presumably accurate. — Peter Cordes, Jan 09 '22 at 22:39
I did the disassembly work myself now. In Intel syntax: `D8 E1 fsub st,st(1)`. So yes, it is computing the fractional part as x - FRNDINT(x). The description in the question seems a bit confused/incorrect, but this seems to be the computation of exp(x) via F2XM1. — njuffa, Jan 09 '22 at 22:41

njuffa · Accepted Answer · 2022-01-10T00:27:42.823

Without the complete context it is difficult to be entirely sure, but the computation here seems to be a naive computation of exp(x) via F2XM1. Note that the argument to F2XM1 was restricted to [-0.5, 0.5] on early x87 implementations and [-1, 1] on later ones. Computing the integer part of the argument with FRNDINT and subtracting it from the argument yields a fractional part suitable for use by F2XM1. The code presumably assumes that the default rounding mode, round-to-nearest-or-even, is in effect.

Below is the entire annotated instruction sequence for a simple exp(x) computation, programmed to match the disassembled code as closely as possible. I used the inline assembly facility of the Intel C/C++ compiler, so this uses Intel syntax. In the comments, the caret ^ denotes exponentiation. The output of this program is:

x=2.5000000000000000e+000 exp(x)=1.2182493960703475e+001 lib=1.2182493960703473e+001

#include <stdio.h>
#include <math.h>

int main (void)
{
    double r, x = 2.5; 

    __asm fld qword ptr[x]; // x
    __asm fldl2e;           // log2(e)=1.442695040888963, x
    __asm fmulp st(1), st;  // x*log2(e)
    __asm fld st(0);        // x*log2(e), x*log2(e)
    __asm frndint;          // rint(x*log2(e)), x*log2(e)
    __asm fxch st(1);       // x*log2(e), rint(x*log2(e))
    __asm fsub st, st(1);   // frac(x*log2(e)), rint(x*log2(e))
    __asm f2xm1;            // 2^frac(x*log2(e))-1, rint(x*log2(e))
    __asm fld1;             // 1, 2^frac(x*log2(e))-1, rint(x*log2(e))
    __asm faddp st(1),st;   // 2^frac(x*log2(e)), rint(x*log2(e))
    __asm fscale;           // exp(x)=2^frac(x*log2(e))*2^rint(x*log2(e)), rint(x*log2(e)
    __asm fstp qword ptr[r];// rint(x*log2(e)
    __asm fstp st(0);       // <empty>
    
    printf ("x=%23.16e exp(x)=%23.16e lib=%23.16e\n", x, r, exp(x));
    return 0;
}

wothke · Answer 2 · 2022-01-10T08:37:55.917

0

Seems it is some variation of a pow(x,y) implementation (see How can I write a power function myself? ). Ghidra just made a total mess of it in the decompiled code view.

Tracing the results in the debugger the performed functionality is indeed:

pow((float10)DOUBLE_00430088, (float10)param_1[0x58])

edited Jan 10 '22 at 08:37

answered Jan 09 '22 at 20:23

wothke

130
1
8

Total mess? It seems to accurately represent the asm. It doesn't recognize the idiom and show it as a `pow()` for you, if that's what you mean. But it doesn't make it any *harder* to understand than looking at the asm directly (for people that are fluent in asm), which is what "total mess" would imply to me. – Peter Cordes Jan 09 '22 at 20:47
@Peter Cordes This is the first day in my life that I actually saw those x86 OPs. (I've updated my original post to also show the assignment made directly before the code that I originally did not know what to make of.) I doubt that the harm done by multiplying everything together (completly missing the effect of the FYL2X) could in any way be fixed by later applying a ROUND, "f2xm1" and "fscale" to that garbage input. – wothke Jan 09 '22 at 22:31
Yeah, without taking a log first, it's doing `pow(2^x, y)` or something like that, with some scale factors. `frndint` / `fsub` are separating the integer and fractional parts which is necessary to use `f2xmi`, as explained in Jerry Coffin's answer you linked. IDK why you think the input would be "garbage" just because the code didn't take a logarithm of it. [`fyl2x`](https://www.felixcloutier.com/x86/fyl2x) produces a single output, so the rest of the pow algorithm can still be doing exponentiation of something. – Peter Cordes Jan 09 '22 at 22:46
A multiplication of base and exponent obviously does not yield a standalone input suitable to calc a pow() function result. This would only work in a galaxy where pow(2, 10) equals pow(4, 5). The inital d1 assignment suggested by the Ghirda decompiler in my example is therefore extremely misleading and I would prefer if Ghidra clearly showed when it doesn't know how to decompile something rather than show a result that looks as if represented the assembly logic when in fact it doesn't, – wothke Jan 09 '22 at 23:07
Ok right, on 2nd look this is some kind of exponentiation, but not with a runtime-variable base. The input is `DOUBLE_00430088 * param_1[0x58]` times some constant scale factors. (Oh, njuffa's answer says the same thing, and breaks it down in detail.) – Peter Cordes Jan 09 '22 at 23:37

what's this decompiled f2xm1/fscale sequence meant to do?

2 Answers2