Efficient and accurate computation of the Mills ratio of the standard normal distribution

Question

The Mills ratio M(x) was introduced by John Mills to express the relationship between a distribution's cumulative distribution function and its probability density function:

J. P. Mills, "Table of the ratio: Area to bounding ordinate, for any portion of normal curve". Biometrika, Vol. 18, No. 3/4 (Nov. 1926), pp. 395-400. (online)

The definition of the Mills ratio is (1 - D(x)) / P(x), where D denotes the distribution function and P(x) is the probability density function. In the specific case of the standard normal distribution, we then have M(x) = (1 - Φ(x)) / ϕ(x) = Φ(-x) / ϕ(x), or when expressed via the complementary error function, M(x) = e^x²/2 √(π/2) erfc (x/√2) = √(π/2) erfcx (x/√2).

Previous questions have dealt with the computation of the Mills ratio in mathematical environments like R and Matlab, but the sophisticated computational facilities of these environments have no equivalent in C. How can one compute the Mills ratio for the standard normal distribution accurately and efficiently using just the C standard math library?

score 3 · Accepted Answer · edited May 23 '17 at 12:10

In previous answers I have shown how to use the C standard math library to efficiently and accurately compute the PDF of the standard normal distribution, normpdf(), the CDF of the standard normal distribution, normcdf(), and the scaled complementary error function erfcx(). Based of these three implementations, one could easily code the computation of the Mills ratio in straightforward manner in one of the following two ways:

double my_mills_ratio_1 (double a)
{
    return my_normcdf (-a) / my_normpdf (a);
}

double my_mills_ratio_2 (double a)
{
    const double SQRT_HALF_HI = 0x1.6a09e667f3bccp-01; // 1/sqrt(2), msbs;
    const double SQRT_HALF_LO = 0x1.21165f626cdd5p-54; // 1/sqrt(2), lsbs;
    const double SQRT_PIO2_HI = 0x1.40d931ff62705p+00; // sqrt(pi/2), msbs;
    const double SQRT_PIO2_LO = 0x1.2caf9483f5ce4p-53; // sqrt(pi/2), lsbs;
    double r;

    a = fma (SQRT_HALF_HI, a, SQRT_HALF_LO * a);
    r = my_erfcx (a);
    return fma (SQRT_PIO2_HI, r, SQRT_PIO2_LO * r);
}

However, both of these approaches are numerically flawed. For mills_ratio_1(), both the PDF term and CDF term vanish rapidly in the positive half-plane as the magnitude of the argument increases. In IEEE-754 double precision both become zero around a = 38, leading to a NaN result due to a division of zero by zero. With regard to my_mills_ratio_2(), the exponential growth in the negative half-plane leads to error magnification and therefore large ulp errors. One way to fix this is to simply combine the well-behaved portions of each of the two approximations:

double my_mills_ratio_3 (double a)
{
    return (a < 0) ? my_mills_ratio_1 (a) : my_mills_ratio_2 (a);
}

This works reasonably well. Using the Intel compiler version 13.1.3.198 to build the code I presented in my previous answers, using 4 billion test vectors, a maximum error of 2.79346 ulps is observed in the positive half-plane, while a maximum error of 6.81248 ulps is observed in the negative half-plane. The somewhat larger errors in the negative half-plane occur for large results close to overflow, because at that point the values of the PDF are so small that they are represented as subnormal double-precision numbers with reduced accuracy.

One alternative solution is to address the error magnification issues affecting my_mills_ratio_2() in the negative half-plane. One can do so by computing the argument to erfcx() to better than double precision, and using the low-order bits of this argument for linear interpolation of the erfcx() result.

For this, one also needs the slope of erfcx(x), which is erfcx'(x) = 2x erfcx(x) - 2/√π. Availability of the FMA (fused multiply-add) operation via the C standard math function fma() provides for the efficient implementation of this quasi double-double computation. The risk of overflow in the magnitude of the slope during intermediate computations can be avoided by local rescaling.

The resulting implementation has an error of less than 4 ulps across the entire input domain:

/* Compute Mills ratio of the standard normal distribution:
 *
 * M(x) = normcdf(-x)/normpdf(x) = sqrt(pi/2) * erfcx(x/sqrt(2)) 
 *
 * maximum ulp error in positive half-plane: 2.79346
 * maximum ulp error in negative half-plane: 3.90753
 */ 
double my_mills_ratio (double a)
{
    double s, t, r, h, l;

    const double SQRT_HALF_HI = 0x1.6a09e667f3bccp-01; // 1/sqrt(2), msbs
    const double SQRT_HALF_LO = 0x1.21165f626cdd5p-54; // 1/sqrt(2), lsbs
    const double SQRT_PIO2_HI = 0x1.40d931ff62705p+00; // sqrt(pi/2), msbs
    const double SQRT_PIO2_LO = 0x1.2caf9483f5ce4p-53; // sqrt(pi/2), lsbs
    const double TWO_RSQRT_PI = 0x1.20dd750429b6dp+00; // 2/sqrt(pi)
    const double MAX_IEEE_DBL = 0x1.fffffffffffffp+1023;
    const double SCALE_DOWN = 0.03125; // prevent ovrfl in intermed. computation
    const double SCALE_UP = 1.0 / SCALE_DOWN;

    // Compute argument a/sqrt(2) as a head-tail pair of doubles h:l
    h = fma (SQRT_HALF_HI, a, SQRT_HALF_LO * a);
    l = fma (-SQRT_HALF_LO, a, fma (-SQRT_HALF_HI, a, h));
    // Compute scaled complementary error function for argument "head"
    t = my_erfcx (h);
    // Enhance accuracy if in negative half-plane, if result has not overflowed
    if ((a < -1.0)  && (t <= MAX_IEEE_DBL)) {
        // Compute slope: erfcx'(x) = 2x * erfcx(x) - 2/sqrt(pi)
        s = fma (h, t * SCALE_DOWN, -TWO_RSQRT_PI * SCALE_DOWN); // slope
        // Linearly interpolate result based on derivative and argument "tail"
        t = fma (s, -2.0 * SCALE_UP * l, t);
    }
    // Scale by sqrt(pi/2) for final result
    r = fma (SQRT_PIO2_HI, t, SQRT_PIO2_LO * t);
    return r;
}

The single-precision implementation looks almost identical, except for the constants involved:

/* Compute Mills ratio of the standard normal distribution:
 *
 * M(x) = normcdf(-x)/normpdf(x) = sqrt(pi/2) * erfcx(x/sqrt(2)) 
 *
 * maximum ulp error in positive half-plane: 2.41987
 * maximum ulp error in negative half-plane: 3.39521
 */ 
float my_mills_ratio_f (float a)
{
    float h, l, r, s, t;

    const float SQRT_HALF_HI = 0x1.6a09e6p-01f; // sqrt(1/2), msbs
    const float SQRT_HALF_LO = 0x1.9fcef4p-27f; // sqrt(1/2), lsbs
    const float SQRT_PIO2_HI = 0x1.40d930p+00f; // sqrt(pi/2), msbs
    const float SQRT_PIO2_LO = 0x1.ff6270p-24f; // sqrt(pi/2), lsbs
    const float TWO_RSQRT_PI = 0x1.20dd76p+00f; // 2/sqrt(pi)
    const float MAX_IEEE_FLT = 0x1.fffffep+127f;
    const float SCALE_DOWN = 0.0625f; // prevent ovrfl in intermed. computation
    const float SCALE_UP = 1.0f / SCALE_DOWN;

    // Compute argument a/sqrt(2) as a head-tail pair of floats h:l
    h = fmaf (SQRT_HALF_HI, a, SQRT_HALF_LO * a);
    l = fmaf (-SQRT_HALF_LO, a, fmaf (-SQRT_HALF_HI, a, h));
    // Compute scaled complementary error function for argument "head"
    t = my_erfcxf (h);
    // Enhance accuracy if in negative half-plane, if result has not overflowed
    if ((a < -1.0f)  && (t <= MAX_IEEE_FLT)) {
        // Compute slope: erfcx'(x) = 2x * erfcx(x) - 2/sqrt(pi)
        s = fmaf (h, t * SCALE_DOWN, -TWO_RSQRT_PI * SCALE_DOWN);
        // Linearly interpolate result based on derivative and argument "tail"
        t = fmaf (s, -2.0f * SCALE_UP * l, t);
    }
    // Scale by sqrt(pi/2) for final result
    r = fmaf (SQRT_PIO2_HI, t, SQRT_PIO2_LO * t);
    return r;
}

Minor: Unclear about the magic number `0x1.fffffffffffffp+1023` Why not `DBL_MAX`? — chux - Reinstate Monica, Oct 18 '16 at 16:55
Why does `SCALE_DOWN * SCALE_UP` --> 2.0 rather than 1.0? Looks like code is hiding a `2.0*` in `fma (s, -SCALE_UP * l, t)`. Suggest `SCALE_DOWN * SCALE_UP` --> 1.0 and use `fma (s, (-2.0 * SCALE_UP) * l, t)`. — chux - Reinstate Monica, Oct 18 '16 at 17:00
@chux Personal idiosynchrasy as far as the first item: I am more familiar with the bit patterns of particualr IEEE-754 numbers than the macros from `float.h`, nor have I trusted the latter in the past. I concur on the second item, that was a bit of unnecessary obfuscation to pull the factor of `2.0` into `SCALE_UP`. — njuffa, Oct 18 '16 at 17:21
I'll look more into the computational aspects later, but what Ive seen looks good. — chux - Reinstate Monica, Oct 18 '16 at 17:49

Efficient and accurate computation of the Mills ratio of the standard normal distribution

1 Answers1