5

From the value 52 we can infer that AVX512-IFMA uses the same components as the double-precision floating-point hardware. But a double has a 53-bit significand, so why is AVX512-IFMA limited to 52 bits? Sure, the stored mantissa has only 52 bits and one bit is hidden, but that hidden bit still contributes to the value and needs to be fed into the adder/multiplier/divider...

phuclv
  • 2
    The fact that the leading bit is always 1 means that they can "hard-code" that bit into the multiplier itself, so the multiplier only needs to be 52 bits wide. The IFMA instructions are probably implemented with the same DP multiplier, but without the normalization and without the "special handling" for the leading 1 bit. – Mysticial Mar 11 '16 at 21:25
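One way to read the comment above is as an algebraic identity: if both significands have the form 2^52 + fraction, the product splits into a constant, a shifted sum, and a single 52x52-bit multiplication. The sketch below only checks that arithmetic; it is not a description of how any real multiplier is wired, and `full_product` is a hypothetical name.

```python
IMPLICIT = 1 << 52  # the hidden leading 1 of a normal double's significand


def full_product(a: int, b: int) -> int:
    """Multiply the 53-bit significands (2^52 + a) and (2^52 + b).

    Only `a * b` is a genuine multiplication, and it is 52 x 52 bits.
    The contribution of the two leading 1 bits reduces to shifts and adds.
    """
    if not (0 <= a < IMPLICIT and 0 <= b < IMPLICIT):
        raise ValueError("fractions must fit in 52 bits")
    partial = a * b  # at most 104 bits wide
    return (1 << 104) + ((a + b) << 52) + partial


if __name__ == "__main__":
    a, b = 0x000F_FFFF_FFFF_FFFF, 0x0008_0000_0000_0001
    print(full_product(a, b) == (IMPLICIT + a) * (IMPLICIT + b))
```

If the hardware handles the leading bit in a comparable way, exposing the multiplier array to integer instructions without that extra path would give 52-bit operands, which matches the width IFMA offers.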

2 Answers

4

IEEE-754 double precision actually stores only 52 significand bits explicitly; the 53rd bit (the most significant bit) is an implicit 1 for normal numbers.
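As a minimal sketch of that layout, the following pulls apart the 64-bit encoding of a double and rebuilds the 53-bit significand by putting the implicit bit back. The helper name `decompose_double` is made up for illustration, and infinities and NaNs are not treated specially.

```python
import struct


def decompose_double(x: float):
    """Return (sign, biased_exponent, stored_fraction, significand) for a double."""
    # Reinterpret the 8 bytes of the double as an unsigned 64-bit integer.
    bits = struct.unpack("<Q", struct.pack("<d", x))[0]
    sign = bits >> 63
    exponent = (bits >> 52) & 0x7FF        # 11-bit biased exponent
    fraction = bits & ((1 << 52) - 1)      # the 52 explicitly stored bits
    if 0 < exponent < 0x7FF:
        # Normal number: the leading 1 is implied, giving 53 significant bits.
        significand = (1 << 52) | fraction
    else:
        # Zero or subnormal: no implicit 1 (inf/NaN are not handled here).
        significand = fraction
    return sign, exponent, fraction, significand


if __name__ == "__main__":
    sign, exponent, fraction, significand = decompose_double(1.5)
    print(hex(fraction), hex(significand), significand.bit_length())
```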

Paul R
  • Yes, but don't you need a 53-bit adder/multiplier to do the job? Those hidden bits still cause carry and affect the final result. – phuclv Mar 05 '15 at 04:05
  • Probably more than 53 bits, I would guess, but I have no idea how the integer stuff is piggy-backed off the floating point ALU. – Paul R Mar 05 '15 at 06:34
0

It exploits the double-precision floating-point (DPF) arithmetic units and FMA to achieve fast multi-precision multiplication. For details, see Section I of this paper: Faster Modular Exponentiation Using Double Precision Floating Point Arithmetic on the GPU

Because it performs its multiplications on the DPF multiplication unit in the way the paper describes, taking 53-bit operands would break the uniform processing steps.

weir007
    Yes, that's why AVX512IFMA exists, to expose the mantissa multipliers directly, instead of [only via type-punning to `double`](https://stackoverflow.com/questions/41403718/can-i-use-the-avx-fma-units-to-do-bit-exact-52-bit-integer-multiplications). But the question is why 52 bits instead of 53? – Peter Cordes Nov 05 '20 at 04:26
  • I think the point of IFMA is that you don't have to use *any* FP math instructions or `double` formats anymore, just pure SIMD integer. So there's no real need to limit it to the same width that was possible before. Software could get 106 instead of 104 total bits, unless the mantissa multipliers aren't fully flexible for the implicit 53rd bit. I think that's most likely, that the multipliers special-case that implicit bit. – Peter Cordes Nov 06 '20 at 10:06
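To make the 104-versus-106-bit point concrete, here is a scalar Python model of a single 64-bit lane of the two IFMA instructions, `VPMADD52LUQ` and `VPMADD52HUQ`, as I understand their documented behavior: both multiply the low 52 bits of two sources into a 104-bit product, then add either the low or the high 52 bits of that product to a 64-bit accumulator. The function names `madd52lo` and `madd52hi` are hypothetical, and this is a sketch of the semantics, not of the hardware.

```python
MASK52 = (1 << 52) - 1
MASK64 = (1 << 64) - 1


def madd52lo(acc: int, b: int, c: int) -> int:
    """One lane of VPMADD52LUQ: acc + low 52 bits of (b[51:0] * c[51:0])."""
    product = (b & MASK52) * (c & MASK52)   # up to 104 bits
    return (acc + (product & MASK52)) & MASK64


def madd52hi(acc: int, b: int, c: int) -> int:
    """One lane of VPMADD52HUQ: acc + bits 103:52 of (b[51:0] * c[51:0])."""
    product = (b & MASK52) * (c & MASK52)
    return (acc + (product >> 52)) & MASK64


if __name__ == "__main__":
    # Widest product the 52-bit operands can produce, versus what
    # 53-bit operands would give if the implicit bit were usable.
    print((MASK52 * MASK52).bit_length())
    print((((1 << 53) - 1) ** 2).bit_length())
```

The accumulator addition is a plain 64-bit add, so the 12 bits above each 52-bit limb leave room to sum several partial products before carries have to be propagated.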