
Is it possible to perform division and obtain correctly rounded IEEE-754 single-precision results if one has only single-precision add/subtract and multiply hardware (no FMA)?

Specifically, if the Newton-Raphson method for floating-point division is implemented on single-precision hardware, is it possible to achieve a result that is correctly rounded per IEEE-754?
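
For reference, the plain iteration in question looks like this (a minimal sketch in C; round-to-nearest single precision is assumed, and the initial `1.0f / b` is only a stand-in for a lookup-table or hardware reciprocal estimate):

```c
/* Illustrative sketch of Newton-Raphson reciprocal refinement in pure
   single precision. Each iteration roughly doubles the number of correct
   bits, but the final result is not guaranteed to be correctly rounded. */
static float nr_divide(float a, float b)
{
    float x = 1.0f / b;          /* stand-in for a lookup-table estimate */
    for (int i = 0; i < 3; i++)
        x = x * (2.0f - b * x);  /* x <- x * (2 - b*x) */
    return a * x;                /* final multiply by the refined reciprocal */
}
```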


1 Answer


Is it possible to perform division and obtain correctly rounded IEEE-754 single-precision results if one has only single-precision add/subtract and multiply hardware (no FMA)?

Yes.

It is always possible to emulate higher precision by representing numbers as the sum of several single-precision floats: two, three, or four (see the QD library). For a correctly rounded single-precision division you should need only the precision brought by two single-precision numbers ("double-single"), and the necessary operations on this representation can be implemented with only single-precision addition, subtraction and multiplication.
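
A minimal sketch of those building blocks, in C: each extended value is kept as an unevaluated sum `hi + lo` of two floats. These are the classical TwoSum and Veltkamp/Dekker algorithms; round-to-nearest is assumed, and the names and `float2` struct here are illustrative, not the QD library's API:

```c
/* "Double-single" building blocks using only single-precision
   add, subtract and multiply. */
typedef struct { float hi, lo; } float2;

/* Knuth's TwoSum: s + e == a + b exactly. */
static float2 two_sum(float a, float b)
{
    float s  = a + b;
    float bp = s - a;
    float e  = (a - (s - bp)) + (b - bp);
    return (float2){ s, e };
}

/* Veltkamp splitting: a == hi + lo, each half fitting in 12 bits,
   so that products of halves are exact in single precision. */
static float2 split(float a)
{
    float t  = 4097.0f * a;      /* 2^12 + 1 */
    float hi = t - (t - a);
    return (float2){ hi, a - hi };
}

/* Dekker's TwoProd without FMA: p + e == a * b exactly
   (barring overflow/underflow of intermediates). */
static float2 two_prod(float a, float b)
{
    float  p  = a * b;
    float2 as = split(a), bs = split(b);
    float  e  = ((as.hi * bs.hi - p) + as.hi * bs.lo + as.lo * bs.hi)
                + as.lo * bs.lo;
    return (float2){ p, e };
}
```

The splitting constant 4097 is 2^12 + 1: it cuts a 24-bit significand into two 12-bit halves, so the partial products in `two_prod` are exact and the error term of the single-precision multiply can be recovered without an FMA.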

Pascal Cuoq
  • Is this the simplest way? Also, do you know at which stages of the iterative algorithm this is necessary (e.g. all stages, or just the final multiply by the approximation to the reciprocal and the scaled dividend)? – Veridian Jul 18 '14 at 14:25
  • When the true quotient is within 2^-20 ULP of the midpoint between two floats, correctly rounding the division means computing an approximation accurate to 2^-20 of an ULP, one way or another. There is no way around it. Of all the ways of representing higher precision when only single precision is available, double-single is probably the simplest one, yes. Only the last stage(s) of the algorithm should require the extra precision. – Pascal Cuoq Jul 18 '14 at 14:49
  • So basically when you perform the final add of the algorithm, you wouldn't round to single precision, you'd retain extra bits, store that result as a double precision value (using two single precision values), and perform the final multiplication (the single precision dividend multiplied by the double precision reciprocal)? – Veridian Jul 18 '14 at 14:55
  • @starbox single-precision dividend multiplied by double-precision reciprocal ought to always produce the correctly rounded result. This is similar to this question: http://stackoverflow.com/questions/19589973/implementing-single-precision-division-as-double-precision-multiplication – Pascal Cuoq Jul 18 '14 at 14:59
  • @PascalCuoq: Could you fix your "four" link to link to the site rather than the "tar.gz" download, please! – Mark Dickinson Jul 18 '14 at 17:14
  • @MarkDickinson I spent a couple of minutes looking for a site that I didn't find. There exists at least a PDF, which I have read, but couldn't find again either. Quad-double is not the easiest thing to google, and the QD library never attracted much interest, apparently. The only person I talked to who knew about it told me that it was slower than MPFR (which is based on integer arithmetic). – Pascal Cuoq Jul 18 '14 at 17:40
  • Hmm; I see what you mean. I found `http://crd-legacy.lbl.gov/~dhbailey/mpdist/`, but the `legacy` part of that is a bit worrying. – Mark Dickinson Jul 18 '14 at 17:42
  • I tried making the final step of the algorithm into a double-precision add and then doing the final multiply in double precision, but I'm still obtaining the same level of accuracy. I feel like your answer is saying just to use (emulated) double-precision arithmetic, and the only real savings is that the dividend was a single-precision value. I'm trying to see if the correct answer can be obtained without such a cumbersome method. Perhaps there is a way to evaluate the final result and adjust it if necessary by comparing it to the dividend and divisor? – Veridian Jul 18 '14 at 23:46
  • @starbox: Sure - if you've got an approximate answer `c` to `a / b`, you can compute `b * c - a` to get an error term, and then adjust `c` by `error * previously_computed_reciprocal_of_b`. You'll still need some form of extended precision (e.g., double-single and Dekker multiplication) to compute that error term, and some analysis to figure out whether this always gives you correctly rounded results or whether there are still corner cases where it doesn't; a sketch of this scheme follows the comments below. What's your aim, by the way? Do you need correctly rounded results always, or is almost always enough? – Mark Dickinson Jul 19 '14 at 06:23
  • @MarkDickinson, I am comparing the overhead of various floating point units (all single precision). I need to be able to say how bounded my error is with full certainty. I need correctly rounded results always. – Veridian Jul 19 '14 at 16:25
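
Putting the pieces together, here is a sketch of the correction step Mark Dickinson describes, reusing the `two_sum`/`two_prod` helpers from the answer above: expand `b * c` exactly with Dekker multiplication, form the residual, and fold it back in. The initial estimate is again a stand-in, and whether this yields correct rounding in all cases still requires the error analysis mentioned in the comments:

```c
/* Sketch of the residual-correction scheme from the comments above,
   built on the two_sum/two_prod helpers shown earlier. Not verified
   to be correctly rounded in all corner cases. */
static float divide_corrected(float a, float b)
{
    float r = 1.0f / b;               /* stand-in initial estimate */
    for (int i = 0; i < 3; i++)
        r = r * (2.0f - b * r);       /* Newton-Raphson refinement */

    float c = a * r;                  /* first quotient approximation */

    /* residual = a - b*c, with b*c expanded exactly via two_prod */
    float2 p   = two_prod(b, c);
    float2 s   = two_sum(a, -p.hi);   /* s.hi + s.lo == a - p.hi exactly */
    float  res = s.hi + (s.lo - p.lo);

    return c + res * r;               /* fold the correction back in */
}
```

Because `c` is already close to `a / b`, the residual is tiny, so the single-precision operations in its final summation lose almost nothing; the extended precision is needed only where the comments say it is, in the exact expansion of `b * c`.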