How does GCC compile the 80 bit wide 10 byte float __float80 on x86_64?

Question

According to one of the slides in the What's A Creel video "Modern x64 Assembly 4: Data Types" (link to the slide),

Note: real10 is only used with the x87 FPU, it is largely ignored nowadays but offers amazing precision!

He says,

"Real10 is only used with the x87 Floating Point Unit. [...] It's interesting the massive gain in precision that it offers you. You kind of take a performance hit with that gain because you can't use real10 with SSE, packed, SIMD style instructions. But, it's kind of interesting because if you want extra precision you can go to the x87 style FPU. Now a days it's almost never used at all."

However, I was googling and saw that GCC supports __float80 and __float128.

Is the __float80 in GCC calculated on the x87? Or is it using SIMD like the other float operations? What about __float128?

score 1 · Accepted Answer · edited Jun 20 '20 at 09:12

GCC docs for Additional Floating Types:

ISO/IEC TS 18661-3:2015 defines C support for additional floating types _Floatn and _Floatnx

... GCC does not currently support _Float128x on any systems.

As I understand it, _Float128 is IEEE binary128, i.e. a true 128-bit float with a huge exponent range, while _Float128x would be a wider extended format associated with it. See http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1691.pdf.

__float80 is obviously the x87 10-byte type. In the x86-64 SysV ABI, it's the same as long double; both have 16-byte alignment in that ABI.

__float80 is available on the i386, x86_64, and IA-64 targets, and supports the 80-bit (XFmode) floating type. It is an alias for the type name _Float64x on these targets.
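To make the precision difference concrete, here is a minimal sketch that counts significand bits by halving a value until adding it to 1 no longer changes the result. The macro and function names are made up for illustration, and it assumes GCC on x86-64 with the x87 precision control left at its default 64-bit setting.

```c
/* Generates a function that counts the significand bits of TYPE by
   halving eps until 1 + eps/2 rounds back to 1. The volatile objects
   force every intermediate result to be rounded to TYPE. */
#define DEFINE_SIGNIFICAND_PROBE(NAME, TYPE)      \
    int NAME(void) {                              \
        volatile TYPE one = 1, eps = 1, sum;      \
        int bits = 1;                             \
        for (;;) {                                \
            sum = one + eps / 2;                  \
            if (sum == one) break;                \
            eps = eps / 2;                        \
            bits++;                               \
        }                                         \
        return bits;                              \
    }

/* Hypothetical helper names, not part of any library. */
DEFINE_SIGNIFICAND_PROBE(significand_bits_double, double)
DEFINE_SIGNIFICAND_PROBE(significand_bits_float80, __float80)
```

Under those assumptions, significand_bits_double() should report 53 and significand_bits_float80() should report 64, which is the extra precision the x87 type offers over double.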

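__float128 is passed and returned in XMM registers, but the arithmetic does not use SSE2 instructions, and it is not a "double double" format: on x86-64 it is IEEE binary128 implemented in software. That gives it a 113-bit significand and the same 15-bit exponent width as __float80 (so far more exponent range than 64-bit double).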

On i386, x86_64, and ..., __float128 is an alias for _Float128

So _Float128 is the same type gcc gives you with __float128: a pure software floating-point 128-bit format, not a double-double.
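A small probe can help tell a double-double apart from a true binary128. A double-double keeps the exponent limits of double, so 2^2000 would overflow, and it has no fixed 113-bit cutoff in its significand. The sketch below assumes GCC on x86-64; the function names are invented for this example.

```c
/* Returns 1 if 2^2000 is finite in __float128. With the exponent range
   of double this product would overflow to infinity. */
int float128_holds_2_pow_2000(void) {
    volatile __float128 x = 0x1p1000;  /* exactly representable as a double */
    volatile __float128 y = x * x;     /* 2^2000, far above DBL_MAX */
    return y * 2 != y;                 /* infinity would give inf * 2 == inf */
}

/* Returns 1 if the significand is exactly 113 bits wide:
   1 + 2^-112 must differ from 1, and 1 + 2^-113 must round back to 1. */
int float128_significand_is_113_bits(void) {
    volatile __float128 one = 1;
    volatile __float128 a = one + (__float128)0x1p-112;
    volatile __float128 b = one + (__float128)0x1p-113;
    return a != one && b == one;
}
```

If both functions return 1, the type behaves like IEEE binary128 rather than a pair of doubles.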

Godbolt compiler explorer output for gcc7.3 -O3 (gcc4.6 gives the same, so apparently these types aren't new):

//long double add_ld(long double x) { return x+x; }  // same as __float80
__float80 add80(__float80 x) { return x+x; }

    fld     TBYTE PTR [rsp+8]    # arg on the stack
    fadd    st, st(0)
    ret                          # and returned in st(0)


__float128 add128(__float128 x) { return x+x; }

          # IDK why not movapd or better movaps, silly compiler
    movdqa  xmm1, xmm0           # x arg in xmm0
    sub     rsp, 8               # align the stack
    call    __addtf3             # args in xmm0, xmm1
    add     rsp, 8
    ret                          # return value in xmm0, I assume


int size80 = sizeof(__float80);    // 16
int sizeld = sizeof(long double);  // 16

int size128 = sizeof(__float128);  // 16

So gcc calls a libgcc function for __float128 addition, not inlining an increment to the exponent or anything clever like that.
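For illustration, the "increment to the exponent" idea can be sketched by hand. This assumes the little-endian binary128 layout used on x86-64 (biased exponent in bits 112..126) and only holds for normal, finite inputs whose doubled value does not overflow; zero, subnormals, infinities and NaNs would need special handling. The function names are hypothetical.

```c
#include <string.h>

/* Double a normal, finite __float128 by adding 1 to its biased exponent
   field instead of performing a floating-point addition. */
__float128 double_by_exponent(__float128 x) {
    unsigned __int128 bits;
    _Static_assert(sizeof bits == sizeof x, "expects a 16-byte __float128");
    memcpy(&bits, &x, sizeof bits);
    bits += (unsigned __int128)1 << 112;   /* exponent field starts at bit 112 */
    memcpy(&x, &bits, sizeof x);
    return x;
}

/* Returns 1 when the bit trick agrees with the library-backed x + x. */
int matches_library_add(__float128 x) {
    return double_by_exponent(x) == x + x;
}
```

The special cases are one plausible reason a compiler would rather call the generic libgcc routine than inline a shortcut like this.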

__float128 is **not** double double, at least on x86_64 it is a software implementation of ieee binary128. On Power computers, long double was traditionally a double-double type, but they are moving away from it. — Marc Glisse, Apr 19 '18 at 00:33
@MarcGlisse: I was having a hard time finding definitions of `_Float128` and `_Float128x`. So `_Float128` is ieee binary128, but what's `_Float128x`? Feel free to edit this answer, if you have time. — Peter Cordes, Apr 19 '18 at 00:54
`_Float128x` should be the extended format associated with IEEE binary128. I.e. it should have at least the exponent width of binary256 (i.e. *emax* ≥ 65535) and precision between that of binary128 and binary256 (in particular, *p* digits ≥ 128). See §3.7 of IEEE 754-2019 (same section in 2008 revision) for details on these extended formats. The correspondence between `_Float128x` and `_Float128` is *almost* the same as `__float80` vs `double` on gcc — except for the explicitly stored MSB of the significand in `__float80`, which is not IEEE-compliant. — Ruslan, Oct 26 '19 at 09:29

score 0 · Answer 2 · answered Apr 18 '18 at 23:07

I found the answer here

__float80 is available on the i386, x86_64, and IA-64 targets, and supports the 80-bit (XFmode) floating type. It is an alias for the type name _Float64x on these targets.

Having looked up the XFmode,

“Extended Floating” mode represents an IEEE extended floating point number. This mode only has 80 meaningful bits (ten bytes). Some processors require such numbers to be padded to twelve bytes, others to sixteen; this mode is used for either.
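The "80 meaningful bits" can be inspected directly. This is a rough sketch assuming GCC on little-endian x86-64, where the object is 16 bytes with the value in the low 10; split_float80 is a name made up for this example.

```c
#include <stdint.h>
#include <string.h>

/* Split an x87 extended value into its 64-bit significand (bytes 0..7)
   and its 16-bit sign+exponent word (bytes 8..9). The remaining bytes
   of the object are padding and are ignored here. */
void split_float80(__float80 v, uint64_t *significand, uint16_t *sign_exp) {
    unsigned char raw[sizeof v];
    memcpy(raw, &v, sizeof v);
    memcpy(significand, raw, 8);
    memcpy(sign_exp, raw + 8, 2);
}
```

For 1.0 this yields a significand of 0x8000000000000000 and a sign+exponent word of 0x3FFF. The top significand bit is the explicitly stored integer bit, which the IEEE interchange formats such as double leave implicit.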

Still not totally convinced, I compiled something simple:

int main () {
    __float80 a = 1.445839898;
    return 1;
}

Using Radare I dumped it,

0x00000652      db2dc8000000   fld xword [0x00000720]
0x00000658      db7df0         fstp xword [local_10h]

I believe fld and fstp are part of the x87 instruction set, so it's true that the x87 is being used for the 10-byte __float80. With __float128, however, I'm getting

0x000005fe      660f6f05aa00.  movdqa xmm0, xmmword [0x000006b0]
0x00000606      0f2945f0       movaps xmmword [local_10h], xmm0

So we can see here that it's using SIMD (xmmword) instructions.

It may use SIMD to move data around, that doesn't mean it will actually use it to do real operations. — Marc Glisse, Apr 19 '18 at 00:32
