12

It is known that, in floating-point arithmetic, the quadratic formula does not work well when b² >> 4ac, because it produces a loss of significance, as explained here.

I have been asked to find a better way to solve quadratic equations, and I know there is this algorithm. Are there any other formulas that work better? How can I come up with better formulas myself? I tried to manipulate the standard formula algebraically, without any results.
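
For example, a direct implementation of the textbook formula (a minimal sketch; the name naive_roots is just illustrative) already shows the problem for a = 1, b = -1e8, c = 1, where the smaller computed root is wrong in its leading digit:

#include <math.h>
#include <stdio.h>

/* textbook quadratic formula, applied directly: when b*b >> 4*a*c,
   sqrt(b*b - 4*a*c) is nearly |b|, so one of the two roots suffers
   catastrophic cancellation */
static void naive_roots (double a, double b, double c, double *x0, double *x1)
{
    double s = sqrt (b * b - 4.0 * a * c);
    *x0 = (-b + s) / (2.0 * a);
    *x1 = (-b - s) / (2.0 * a);
}

int main (void)
{
    double x0, x1;
    naive_roots (1.0, -1e8, 1.0, &x0, &x1);
    /* the true roots are about 1e8 and 1e-8, but the computed smaller
       root comes out near 7.45e-9: wrong already in its leading digit */
    printf ("x0 = %.17g, x1 = %.17g\n", x0, x1);
    return 0;
}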

John Keeper
  • You can use scaling by powers of two (see e.g. W. Y. Sit, Quadratic Programming? http://www.mmrc.iss.ac.cn/ascm/ascm03/sample.pdf) and high-precision calculation of the discriminant (W. Kahan, On the Cost of Floating-Point Computation, http://www.eecs.berkeley.edu/~wkahan/Qdrtcs.pdf) – gammatester Feb 26 '18 at 09:47
  • Related at Math.SE: [Numerically stable algorithm for solving the quadratic equation when a is very small or 0](https://math.stackexchange.com/q/866331/64206) – Ruslan Apr 05 '18 at 10:18

2 Answers

22

This answer assumes that the primary concern here is robustness with regard to accuracy, rather than robustness with regard to overflow or underflow in intermediate floating-point computations. The question indicates an awareness of the problem of subtractive cancellation when the commonly used mathematical formula is applied directly in floating-point arithmetic, and of the techniques to work around it.

An additional issue to be considered is the accurate computation of the term b²-4ac. It is examined in detail in the following research note:

William Kahan, "On the Cost of Floating-Point Computation Without Extra-Precise Arithmetic", Nov. 21, 2004 (online)

Recent follow-up work to Kahan's note looked at the more general issue of computing the difference of two products ab-cd:

Claude-Pierre Jeannerod, Nicolas Louvet, Jean-Michel Muller, "Further analysis of Kahan's algorithm for the accurate computation of 2 x 2 determinants." Mathematics of Computation, Vol. 82, No. 284, Oct. 2013, pp. 2245-2264 (online)

This makes use of the fused multiply-add operation, or FMA, which is available on almost all modern processors including x86-64, ARM64, and GPUs. It is exposed in C/C++ as a standard math function fma(). Note that on platforms without hardware support for FMA, fma() must use emulation, which is often quite slow, and some emulations have been found to have serious functional deficiencies.

FMA computes a*b+c using the full product (neither rounded nor truncated in any way) and applies a single rounding at the end. This allows the product of two native-precision floating-point numbers to be computed accurately as the unevaluated sum of two native-precision floating-point numbers, without resorting to extended-precision arithmetic in intermediate computation: h = a * b and l = fma (a, b, -h), where h+l represents the product a*b exactly. This provides for the efficient computation of ab-cd as follows:

/*
  diff_of_products() computes a*b-c*d with a maximum error <= 1.5 ulp

  Claude-Pierre Jeannerod, Nicolas Louvet, and Jean-Michel Muller, 
  "Further Analysis of Kahan's Algorithm for the Accurate Computation 
  of 2x2 Determinants". Mathematics of Computation, Vol. 82, No. 284, 
  Oct. 2013, pp. 2245-2264
*/
double diff_of_products (double a, double b, double c, double d)
{
    double w = d * c;
    double e = fma (-d, c, w); /* e = w - d*c exactly: the rounding error of w */
    double f = fma (a, b, -w); /* a*b - w, with a single rounding */
    return f + e;              /* (a*b - w) + (w - d*c) = a*b - d*c */
}
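
The key point is that the rounding error of a floating-point multiplication is itself exactly representable, so e = fma (-d, c, w) recovers the error of w = d * c exactly; adding it back in at the end compensates for the rounding of w, which is what gives the tight error bound.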

With this building block, the real roots of a quadratic equation can be computed with high accuracy as follows, provided the discriminant is positive:

/* compute the real roots of a quadratic equation: ax² + bx + c = 0, 
   provided the discriminant b²-4ac is positive
*/
void solve_quadratic (double a, double b, double c, double *x0, double *x1)
{
    /* copysign() matches the sign of the sqrt term to the sign of b,
       so the addition inside the parentheses never cancels */
    double q = -0.5 * (b + copysign (sqrt (diff_of_products (b, b, 4.0*a, c)), b));
    *x0 = q / a; /* root from the cancellation-free form of the formula */
    *x1 = c / q; /* other root via Vieta's formula: x0 * x1 = c / a */
}
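
As a quick sanity check, consider the classic ill-conditioned test case from George Forsythe's note "How do you solve a quadratic equation" (a = 1, b = -1e5, c = 1): the textbook formula loses roughly half the significant digits of the smaller root to cancellation, while the code above returns it accurate to a few ulps. A minimal driver (a sketch, assuming the two functions above are in scope):

#include <math.h>
#include <stdio.h>

/* assumes diff_of_products() and solve_quadratic() from above */

int main (void)
{
    double x0, x1;
    solve_quadratic (1.0, -1e5, 1.0, &x0, &x1);
    /* textbook formula for the smaller root, for comparison */
    double naive = (1e5 - sqrt (1e5 * 1e5 - 4.0)) / 2.0;
    /* exact smaller root is 1.0000000001000000...e-5 */
    printf ("robust: %.17g  naive: %.17g\n", x1, naive);
    return 0;
}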

In extensive testing with test cases that do not overflow or underflow in intermediate computation, the maximum error observed in the computed solutions never exceeded 3 ulps.

njuffa
  • So if I want a unit test to show that the fma version is unambiguously better than the naive version, I want to expand (x-x0)(x-x1) where |x0| << |x1|? Is this the correct regime? – user14717 Dec 26 '18 at 21:58
  • Actually, I think it's the opposite regime: |x0 - x1| < eps*max(|x0|, |x1|). – user14717 Dec 26 '18 at 22:08
  • @user14717 I simply used a high-precision reference with random `a`, `b`, `c` selected such that a quick & dirty discriminant came out positive. With a modern PC you can run through billions of test vectors easily. I used 100 billion test cases, letting this run for a couple of hours or so. Obviously that's way too long for a unit test. For that, you might want to pre-generate "hard" cases by letting the robust and naive methods run side-by-side, extracting those cases that have large error with the naive approach. – njuffa Dec 26 '18 at 22:26
  • Kahan's quote from the article you reference: "Competent tests are often more difficult to design than was the program under test. Ideally, tests should be arbitrarily numerous, randomized, prioritized to expose gross blunders soon, filtered to avoid wasting time testing the same possibilities too often, distributed densely enough along boundaries where bugs are likely to hide, reproducible whenever tests that exposed a bug have to be rerun on a revised version of the program under test, and fast. It’s a lot to ask." – user14717 Dec 27 '18 at 00:54
  • All of your sources appear to time out for me. – Krupip Aug 21 '19 at 19:30
  • @njuffa Never mind, it was the local network I was on that was at fault. – Krupip Aug 21 '19 at 19:34
  • Note that your `diff_of_products(...)` function is similar to the cited reference (see Algorithm 1), but permutes the variable names, which is confusing if you were expecting `a`, `b`, `c`, `d` to be their usual names in a 2⨯2 matrix. (This has confused me on at least two separate occasions seeing this post, so this is a note to self and others.) – geometrian Oct 17 '22 at 17:18
  • @user14717 - Did you ever find a specific input where this does significantly better than a naive implementation? I am going to adapt this code, but would love to have a unit test. My case of interest is IEEE double precision. – Nemo May 28 '23 at 18:16
  • @Nemo (I am *not* user14717, obviously): A simple test case, taken from George Forsythe, "How do you solve a quadratic equation", Technical Report CS40, Stanford University 1966: `a=1, b= -1e5, c=1`. – njuffa May 28 '23 at 21:32
  • @njuffa - Thanks. I couldn't "@" more than one person. Very useful, both this and https://stackoverflow.com/q/63665010/ – Nemo May 30 '23 at 18:33
10

The Herbie tool for automatically rearranging floating-point expressions to reduce rounding error usually provides a good starting point for addressing errors like this.

In this case you can see its output for the positive root of the quadratic using the online demo, which produces these results:

[Screenshot: Herbie's rewritten expressions for the positive root, from the online demo]
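
The gist of such a rewrite for the positive root (-b + sqrt(b² - 4ac)) / (2a) is to switch to the algebraically equivalent conjugate form 2c / (-b - sqrt(b² - 4ac)) when b is positive, which avoids the cancellation. A sketch of that rearrangement (not necessarily Herbie's exact output):

#include <math.h>

/* positive root of ax² + bx + c = 0, assuming the discriminant is
   positive; picks whichever algebraically equivalent form avoids
   subtracting nearly equal quantities */
double positive_root (double a, double b, double c)
{
    double s = sqrt (b * b - 4.0 * a * c);
    if (b >= 0.0)
        return (2.0 * c) / (-b - s); /* conjugate form: -b - s cannot cancel */
    else
        return (-b + s) / (2.0 * a); /* textbook form: -b + s cannot cancel */
}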

ztatlock